Inventing contact schemas for fun and profit! (Ugh)

And then there were three.

Today, Yahoo! announced the public availability of their own Address Book API. Though Plaxo and LinkedIn have been using this API behind the scenes for a short while, today marks the first time the API is available for anyone who registers for an App ID to make use of the bi-directional protocol.

The API is shielded behind Yahoo!’s proprietary BBAuth protocol, which obviates the need to request Yahoo! member credentials at the time of import initiation, as seen in this screenshot from LinkedIn (from April):

(Screenshot: LinkedIn, “Expand your network”)

Now, like Joseph, I applaud the release of this API, as it provides one more means for individuals to have full control over, and access to, their friends, colleagues and contacts using a robust protocol.

However, I have to lament yet more needless reinvention of contact schema. Why is this a problem? Well, as I pointed out about Facebook’s approach to developing their own platform methods and formats, having to write and debug against yet another contact schema makes the “tax” of adding support for contact syncing and export increasingly onerous for sites and web services that want to better serve their customers by letting them host and maintain their address book elsewhere.

This isn’t just a problem that I have with Yahoo!. It’s something that I encountered last November with the SREG and proposed Attribute Exchange profile definition. And yet again when Google announced their Contacts API. And then again when Microsoft released theirs! Over and over again we’re seeing better ways of fighting the password anti-pattern flow of inviting friends to new social services, yet we still have to implement support for countless contact schemas. What we need is one common contacts interchange format, and I strongly suggest that it inherit from vcard, with allowances or extension points for contemporary trends in social networking profile data.
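To make the "inherit from vcard, with extension points" idea a little more concrete, here is a minimal sketch in Python of what that could look like with vcard 3.0 (RFC 2426), which already reserves `X-` prefixed property names for extensions. The function names and the `social-profile` extension are hypothetical, made up purely for illustration, and the sketch ignores details like line folding and multiple values per property.

```python
def escape(value):
    """Escape backslash, semicolon, comma and newline in a vcard text value."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def to_vcard(given, family, email=None, extensions=None):
    """Serialize a contact as a vcard 3.0 string.

    `extensions` is a dict of hypothetical, non-standard fields; each one
    is emitted as an X- property, the extension mechanism vcard provides.
    """
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        # FN and N are the two name properties a 3.0 card must carry.
        f"FN:{escape(given + ' ' + family)}",
        # N is family;given;additional;prefixes;suffixes
        f"N:{escape(family)};{escape(given)};;;",
    ]
    if email:
        lines.append(f"EMAIL;TYPE=INTERNET:{escape(email)}")
    for name, value in (extensions or {}).items():
        lines.append(f"X-{name.upper()}:{escape(value)}")
    lines.append("END:VCARD")
    # vcard lines are delimited by CRLF.
    return "\r\n".join(lines) + "\r\n"


card = to_vcard(
    "Chris", "Messina",
    email="chris@example.com",
    extensions={"social-profile": "http://example.com/chris"},  # made-up field
)
print(card)
```

The point is only that the core fields stay readable by any existing vcard consumer, while social-network-specific data rides along in properties that older consumers can safely ignore.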

I’ve gone ahead and whipped up a comparison matrix between the primary contact schemas to demonstrate the mess we’re in.

Below, I have a subset of the complete matrix to give you a sense for where we’re at with OpenSocial (né GData), Yahoo Address Book API and Microsoft’s Windows Live Contacts API, and include vcard (RFC 2426) as the cardinal format towards which subsequent schemas should converge:

vcard OpenSocial 0.8 Windows Live Contacts API Yahoo Address Book API
UID uid url id cid cid
Nickname nickname nickname NickName nickname
Full Name n or fn name NameTitle, FirstName, MiddleName, LastName, Suffix name
First name n (given-name) given_name FirstName name (first)
Last name n (family-name) family_name LastName name (last)
Birthday bday date_of_birth Birthdate birthday (day, month, year)
Anniversary Anniversary anniversary (day, month, year)
Gender gender gender gender
Email email email Email (ID, EmailType, Address, IsIMEnabled, IsDefault) email
Street street-address street-address StreetLine street
Postal Code postal-code postal-code PostalCode zip
City locality locality
State region region PrimaryCity state
Country country-name country CountryRegion country
Latitude geo (latitude) latitude latitude
Longitude geo (longitude) longitude longitude
Language N/A N/A
Phone tel (type, value) phone (number, type) Phone (ID, PhoneType, Number, IsIMEnabled, IsDefault) phone
Timezone tz time_zone TimeZone N/A
Photo photo thumbnail_url N/A
Company org organization.name CompanyName company
Job Title title, role organization.title JobTitle jobtitle
Biography note about_me notes
URL url url URI (ID, URIType, Name, Address) link
Category category, rel-tag tags Tag (ID, Name, ContactIDs)
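To illustrate the “tax” this matrix implies, here is a rough Python sketch of the kind of per-provider renaming table every importer ends up maintaining. The field names are copied from a handful of rows in the matrix above; the provider keys (`opensocial`, `windows_live`, `yahoo`), the function name, and the assumption that each record arrives as a flat dictionary are mine, not anything the APIs promise. Real responses are nested and would need more work.

```python
# Provider field name -> vcard property name, taken from the matrix rows
# for Nickname, Street, Postal Code, Country, Company and Job Title.
FIELD_MAP = {
    "opensocial": {
        "nickname": "nickname",
        "street-address": "street-address",
        "postal-code": "postal-code",
        "country": "country-name",
    },
    "windows_live": {
        "NickName": "nickname",
        "StreetLine": "street-address",
        "PostalCode": "postal-code",
        "CountryRegion": "country-name",
        "CompanyName": "org",
        "JobTitle": "title",
    },
    "yahoo": {
        "nickname": "nickname",
        "street": "street-address",
        "zip": "postal-code",
        "country": "country-name",
        "company": "org",
        "jobtitle": "title",
    },
}


def to_vcard_names(provider, record):
    """Rename the keys of a flat provider record to vcard property names.

    Keys with no entry in FIELD_MAP are dropped, which is a simplification.
    """
    mapping = FIELD_MAP[provider]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


# The same contact, as two providers might name its fields (sample data).
from_live = to_vcard_names("windows_live", {"NickName": "joe", "PostalCode": "94110"})
from_yahoo = to_vcard_names("yahoo", {"nickname": "joe", "zip": "94110"})
print(from_live == from_yahoo)
```

Every new schema means another block in that table, plus the debugging that goes with it, which is exactly the marginal cost that never goes down.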

12 Comments

  1. Todd said
    at 5am on Jun 5th

    Those who can do, those who can’t file for a patent, send cease & desists, ignore open standards and re-invent the wheel?

  2. at 6am on Jun 5th

    Well said, Chris! At Plaxo, we’ve had to implement and maintain support for all these disparate schemas (and more–outlook, mac, tbird, aim, etc.) so I’m painfully aware of the “tax” you’re talking about. It not only impacts developers (who have to do a lot more work), it in turn hurts users, because the empirical result is that fewer things talk to fewer other things when the cost of doing so is high, and in particular when the marginal cost of supporting one-more-source doesn’t ever go down. If only someone would develop a definitive open standard for sharing contact info… ;)

  3. at 7am on Jun 5th

    Totally agree. Better yet: why not just send *actual* vcf or hcard data in response to the API, instead of custom XML/ATOM/RSS/JSON ?

  4. D. Lambert said
    at 8am on Jun 5th

    The real measuring stick here should be CRM systems. These systems have been storing contacts forever, and their schemas are generally deeper than any of these (multiple addresses, phone numbers, emails, etc).

    I’d love to see a real standard emerge, but I believe it’s not going to gain wide acceptance if the schema can’t support existing data well.

    I’d also like to see a field to indicate the “native” system for this user. Between that and the UID, I should be able to go back and find the original contact record in the original system of record whenever I want.

  5. at 9am on Jun 5th

    @Stephen: good question. I think it’s to simplify parsing and to also (in the case of hcard) make the data more lightweight. The idea is that you can go back and forth from a vcard-based protocol to hcard and so on… but we need something that looks more like an API to gain widespread adoption in various contexts (i.e. mobile, desktop)… We also need a two-way syncing protocol, and hcard just really isn’t up to that task (or at least I’ve never seen anyone do it).

    @D. Lambert: I disagree. If anyone has an interest in having proprietary protocols and formats, it’s CRMs that need to keep you in their silos. On the open web, we’re seeing so many people wanting to connect with their friends between social networks (and inviting them from their address books) that we need something that works with those kinds of generic profiles. What we *don’t* need is a lot of expressiveness at this stage; instead we just need to standardize on the 80% use cases and get agreement (and then adoption there) and then we can talk about expanding into other attributes.

    Again, the beauty of standards is that there are so many to choose from! Getting adoption around a core number of attributes is the present goal.

  6. at 9am on Jun 5th

    URL is url in OpenSocial – profile_url is a special case field. OpenSocial did start with vcard, but grew based on converging fields from multiple existing social networks, and clarifying the missing or underspecified parts of vcard.

  7. at 9am on Jun 5th

    @Kevin Marks: was this research on competing schemas ever made public? This would be really valuable for comparison’s sake. Obviously convergence is what we’re aiming for here, but I’d love, for example, to be able to use this API inside of Address Book.app, in which case vcard-compatibility would be key.

    Thoughts?

  8. at 4pm on Jun 7th

    Hey Chris,

    Nice breakdown!

    I will have to work with all these APIs at some point soon as people using what I am building will likely span all services (though I only really use one).

    While I could simply focus on the initial service I am interested in and then write code for each of the next ones I think it makes sense to create a read/write facade so I only need to deal with one API.

    To keep it cross-platform and tool independent (most of my stuff is in C) I was thinking of creating a stylesheet with the physical implementation (how/when) of the transform left up to the developer. I could use the structure offered by OpenSocial as a baseline…

    Does this make sense? Do you think this (the mapping stylesheet) would be something useful for the community? If so once done I could release it out into the wild for others to use.

    Cheers,

    Christopher

  9. zooper said
    at 10pm on Jun 9th

    Well, this seems like a perfectly good place to propose which of the available formats are worth sitting down and evaluating.

    Should we draw up needs or get suggestions for the best way forward using existing solutions (but be prepared to shift if it is warranted)?

  10. at 2pm on Jun 12th

    @joseph:

    Any chance that you’d be able to open-source the code so we don’t all have to pay the “tax”? :D

  11. at 7pm on Jun 22nd

    You should also include the W3C’s Contact schema, which has been around for almost eight years and is used in swathes of Semantic Web data. Following on from work by Norm Walsh, they also have some notes on modelling vCards in RDF.

  12. at 6am on Jul 9th

    The thing is, vcard is a pretty hopeless starting point. There’s so much ambiguity in the spec and so much that it just doesn’t model properly that every application has to invent its own mechanism for representing its internal schema using vcard. As a result success rates moving contacts between applications using vcard are pretty low in my experience. I think the multiple schemas you’re seeing are a response to the vacuum that vcard has created by giving the impression of being a workable standard without actually being capable of solving the problems.

    To give some concrete examples,
    * The usual user-visible phone number and email address types (home, work, mobile, etc) aren’t mapped onto vcard type values in a standard way.
    * The various parts of the ADR field are not standardised and don’t accurately capture international addresses. The *order* that the parts should be displayed in isn’t even consistent across applications.
    * There’s no agreement about how to represent the contact data for an organization (as distinct from an individual).

    I mean, just have a look at http://microformats.org/wiki/vcard-implementations

    The last time I looked at this stuff I came away amazed that any non-technical users manage to have any kind of success with this kind of thing. I was trying to write a script to merge together the contacts from my Series 60 Nokia phone, Evolution on Linux, Outlook and GMail, all exported in vcard format, and finished up with a fat ball of heuristics that probably only worked for the few hundred cards I was working with.

    vcard is probably fine if you define the 80% use-case as ‘extract a list of email addresses we can spam’, but as the basis for an API it falls well short.
