Inventing contact schemas for fun and profit! (Ugh)

And then there were three.

Today, Yahoo! announced the public availability of their own Address Book API. Though Plaxo and LinkedIn have been using this API behind the scenes for a short while, today marks the first time the API is available for anyone who registers for an App ID to make use of the bi-directional protocol.

The API is shielded behind Yahoo!’s proprietary BBAuth protocol, which obviates the need to request Yahoo! member credentials at the time of import initiation, as seen in this screenshot from LinkedIn (from April):

[Screenshot: LinkedIn: Expand your network]

Now, like Joseph, I applaud the release of this API, as it provides one more means for individuals to have utter control over, and access to, their friends, colleagues and contacts using a robust protocol.

However, I have to lament yet more needless reinvention of contact schema. Why is this a problem? Well, as I pointed out about Facebook’s approach to developing their own platform methods and formats, having to write and debug against yet another contact schema makes the “tax” of adding support for contact syncing and export increasingly onerous for sites and web services that want to better serve their customers by letting them host and maintain their address book elsewhere.

This isn’t just a problem that I have with Yahoo!. It’s something that I encountered last November with the SREG and proposed Attribute Exchange profile definition. And yet again when Google announced their Contacts API. And then again when Microsoft released theirs! Over and over again we’re seeing better ways of fighting the password anti-pattern flow of inviting friends to new social services, but we still have to implement support for countless contact schemas. What we need is one common contacts interchange format, and I strongly suggest that it inherit from vcard with allowances or extension points for contemporary trends in social networking profile data.

I’ve gone ahead and whipped up a comparison matrix between the primary contact schemas to demonstrate the mess we’re in.

Below, I have a subset of the complete matrix to give you a sense for where we’re at with OpenSocial (né GData), Yahoo Address Book API and Microsoft’s Windows Live Contacts API, and include vcard (RFC 2426) as the cardinal format towards which subsequent schemas should converge:

vcard OpenSocial 0.8 Windows Live Contacts API Yahoo Address Book API
UID uid url id cid cid
Nickname nickname nickname NickName nickname
Full Name n or fn name NameTitle, FirstName, MiddleName, LastName, Suffix name
First name n (given-name) given_name FirstName name (first)
Last name n (family-name) family_name LastName name (last)
Birthday bday date_of_birth Birthdate birthday (day, month, year)
Anniversary Anniversary anniversary (day, month, year)
Gender gender gender gender
Email email email Email (ID, EmailType, Address, IsIMEnabled, IsDefault) email
Street street-address street-address StreetLine street
Postal Code postal-code postal-code PostalCode zip
City locality locality
State region region PrimaryCity state
Country country-name country CountryRegion country
Latitude geo (latitude) latitude latitude
Longitude geo (longitude) longitude longitude
Language N/A N/A
Phone tel (type, value) phone (number, type) Phone (ID, PhoneType, Number, IsIMEnabled, IsDefault) phone
Timezone tz time_zone TimeZone N/A
Photo photo thumbnail_url N/A
Company org organization.name CompanyName company
Job Title title, role organization.title JobTitle jobtitle
Biography note about_me notes
URL url url URI (ID, URIType, Name, Address) link
Category category, rel-tag tags Tag (ID, Name, ContactIDs)
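
As a rough illustration of that “tax”, the Python sketch below renames a handful of fields from each provider to vcard-style names. The field names are taken from rows of the matrix where all four columns are filled in; the flat dictionaries, the `FIELD_MAP` table and the `to_vcard_names` helper are assumptions made up for illustration, not any provider’s actual response format or API.

```python
# Hypothetical lookup table built from a few rows of the matrix above.
# Keys are vcard-style names; values are the per-provider field names.
FIELD_MAP = {
    "nickname":     {"opensocial": "nickname", "windows_live": "NickName", "yahoo": "nickname"},
    "postal-code":  {"opensocial": "postal-code", "windows_live": "PostalCode", "yahoo": "zip"},
    "country-name": {"opensocial": "country", "windows_live": "CountryRegion", "yahoo": "country"},
    "org":          {"opensocial": "organization.name", "windows_live": "CompanyName", "yahoo": "company"},
    "title":        {"opensocial": "organization.title", "windows_live": "JobTitle", "yahoo": "jobtitle"},
}


def to_vcard_names(record, source):
    """Rename the keys of a flat provider record to vcard-style names.

    Keys that are not in FIELD_MAP for the given source are dropped.
    """
    reverse = {names[source]: vcard for vcard, names in FIELD_MAP.items()}
    return {reverse[key]: value for key, value in record.items() if key in reverse}


# Made-up records: the same contact as two providers might name its fields.
yahoo_record = {"nickname": "jdoe", "zip": "94107", "company": "Example Co"}
live_record = {"NickName": "jdoe", "PostalCode": "94107", "CompanyName": "Example Co"}

normalized_yahoo = to_vcard_names(yahoo_record, "yahoo")
normalized_live = to_vcard_names(live_record, "windows_live")
```

Even this toy version needs one new column in the table for every provider, and it ignores structured values such as Yahoo’s birthday (day, month, year), which is where the real work starts.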

19 thoughts on “Inventing contact schemas for fun and profit! (Ugh)”

  1. Those who can, do; those who can’t file for a patent, send cease & desists, ignore open standards and re-invent the wheel?

  2. Well said, Chris! At Plaxo, we’ve had to implement and maintain support for all these disparate schemas (and more–outlook, mac, tbird, aim, etc.) so I’m painfully aware of the “tax” you’re talking about. It not only impacts developers (who have to do a lot more work), it in turn hurts users, because the empirical result is that fewer things talk to fewer other things when the cost of doing so is high, and in particular when the marginal cost of supporting one-more-source doesn’t ever go down. If only someone would develop a definitive open standard for sharing contact info… ;)

  3. Totally agree. Better yet: why not just send *actual* vcf or hcard data in response to the API, instead of custom XML/ATOM/RSS/JSON ?

  4. The real measuring stick here should be CRM systems. These systems have been storing contacts forever, and their schemas are generally deeper than any of these (multiple addresses, phone numbers, emails, etc).

    I’d love to see a real standard emerge, but I believe it’s not going to gain wide acceptance if the schema can’t support existing data well.

    I’d also like to see a field to indicate the “native” system for this user. Between that and the UID, I should be able to go back and find the original contact record in the original system of record whenever I want.

  5. @Stephen: good question. I think it’s to simplify parsing and to also (in the case of hcard) make the data more lightweight. The idea is that you can go back and forth from a vcard-based protocol to hcard and so on… but we need something that looks more like an API to gain widespread adoption in various contexts (i.e. mobile, desktop)… We also need a two-way syncing protocol, and hcard just really isn’t up to that task (or at least I’ve never seen anyone do it).

    @D. Lambert: I disagree. If anyone has an interest in having proprietary protocols and formats, it’s CRMs that need to keep you in their silos. On the open web, we’re seeing so many people wanting to connect with their friends between social networks (and inviting them from their address books) that we need something that works with those kinds of generic profiles. What we *don’t* need is a lot of expressiveness at this stage; instead we just need to standardize on the 80% use cases and get agreement (and then adoption there) and then we can talk about expanding into other attributes.

    Again, the beauty of standards is that there are so many to choose from! Getting adoption around a core number of attributes is the present goal.

  6. URL is url in OpenSocial – profile_url is a special case field. OpenSocial did start with vcard, but grew based on converging fields from multiple existing social networks, and clarifying the missing or underspecified parts of vcard.

  7. @Kevin Marks: was this research or competing schemas ever made public? This would be really valuable for comparison’s sake. Obviously convergence is what we’re aiming for here, but I’d love, for example, to be able to use this API inside of Address Book.app, in which case vcard-compatibility would be key.

    Thoughts?

  8. Hey Chris,

    Nice breakdown!

    I will have to work with all these APIs at some point soon as people using what I am building will likely span all services (though I only really use one).

    While I could simply focus on the initial service I am interested in and then write code for each of the next ones I think it makes sense to create a read/write facade so I only need to deal with one API.

    To keep it cross-platform and tool independent (most of my stuff is in C) I was thinking of creating a stylesheet with the physical implementation (how/when) of the transform left up to the developer. I could use the structure offered by OpenSocial as a baseline…

    Does this make sense? Do you think this (the mapping stylesheet) would be something useful for the community? If so once done I could release it out into the wild for others to use.

    Cheers,

    Christopher
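
A minimal Python sketch of the read/write facade idea from comment 8, assuming OpenSocial-style names as the baseline. The `ContactSource` and `RenamingSource` classes are hypothetical, and the in-memory `store` list stands in for a remote service; nothing here calls a real provider API.

```python
from abc import ABC, abstractmethod


class ContactSource(ABC):
    """Facade: callers only ever see baseline (OpenSocial-style) field names."""

    @abstractmethod
    def read(self):
        """Return a list of contacts as dicts keyed by baseline names."""

    @abstractmethod
    def write(self, contact):
        """Store one contact given as a dict keyed by baseline names."""


class RenamingSource(ContactSource):
    """Adapter that only renames fields; `store` stands in for a remote service."""

    def __init__(self, to_native):
        self.to_native = to_native  # baseline name -> native name
        self.to_baseline = {native: base for base, native in to_native.items()}
        self.store = []  # records kept in the provider's own naming

    def read(self):
        return [
            {self.to_baseline[k]: v for k, v in record.items() if k in self.to_baseline}
            for record in self.store
        ]

    def write(self, contact):
        self.store.append(
            {self.to_native[k]: v for k, v in contact.items() if k in self.to_native}
        )


# Field names taken from the comparison matrix in the post.
yahoo = RenamingSource({"nickname": "nickname", "organization.title": "jobtitle"})
live = RenamingSource({"nickname": "NickName", "organization.title": "JobTitle"})

contact = {"nickname": "jdoe", "organization.title": "Editor"}
for source in (yahoo, live):
    source.write(contact)
```

Calling code writes and reads the same baseline-shaped dictionary against either source, while each adapter keeps its records in the provider’s own naming. A real adapter would also have to handle authentication, structured fields and sync conflicts, none of which is modelled here.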

  9. Well, this seems like a perfectly good place to propose which of the available formats are worth sitting down and evaluating.

    Should we draw up needs or get suggestions for the best way forward using existing solutions (but prepared to shift if it is warranted)?

  10. The thing is, vcard is a pretty hopeless starting point. There’s so much ambiguity in the spec and so much that it just doesn’t model properly that every application has to invent its own mechanism for representing its internal schema using vcard. As a result success rates moving contacts between applications using vcard are pretty low in my experience. I think the multiple schemas you’re seeing are a response to the vacuum that vcard has created by giving the impression of being a workable standard without actually being capable of solving the problems.

    To give some concrete examples,
    * The usual user-visible phone number and email address types (home, work, mobile, etc) aren’t mapped onto vcard type values in a standard way.
    * The various parts of the ADR field are not standardised and don’t accurately capture international addresses. The *order* that the parts should be displayed in isn’t even consistent across applications.
    * There’s no agreement about how to represent the contact data for an organization (as distinct from an individual).

    I mean, just have a look at http://microformats.org/wiki/vcard-implementations

    The last time I looked at this stuff I came away amazed that any non-technical users manage to have any kind of success with this kind of thing. I was trying to write a script to merge together the contacts from my Series 60 Nokia phone, Evolution on Linux, Outlook and GMail, all exported in vcard format, and finished up with a fat ball of heuristics that probably only worked for the few hundred cards I was working with.

    vcard is probably fine if you define the 80% use-case as ‘extract a list of email addresses we can spam’, but as the basis for an API it falls well short.
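
To make the first bullet in comment 10 concrete, here is a small Python sketch of one such heuristic. It assumes the exported cards spell phone types either as bare parameters (vCard 2.1 style, `TEL;CELL;VOICE:`) or as `TYPE=` parameters (vCard 3.0 style, `TEL;TYPE=cell,voice:`). The `tel_types` helper is hypothetical, expects an already unfolded line, and does not attempt the harder job of mapping the labels onto what a user would call “mobile” or “work”.

```python
def tel_types(line):
    """Return the set of lowercased type labels on one unfolded TEL line."""
    head, _, _value = line.partition(":")
    types = set()
    for param in head.split(";")[1:]:  # skip the property name itself
        name, sep, value = param.partition("=")
        if not sep:
            # Bare parameter, as in vCard 2.1: TEL;CELL;VOICE:...
            types.add(name.lower())
        elif name.upper() == "TYPE":
            # TYPE parameter, possibly comma-separated: TYPE=cell,voice
            types.update(v.lower() for v in value.split(","))
    return types


# Three spellings of the same phone entry (made-up number).
variants = [
    "TEL;CELL;VOICE:+1-555-0100",
    "TEL;TYPE=cell,voice:+1-555-0100",
    "TEL;TYPE=CELL;TYPE=VOICE:+1-555-0100",
]
parsed = [tel_types(v) for v in variants]
```

All three variants reduce to the same set of labels, but only because the sketch already knows which spellings to expect, which is exactly the kind of per-application knowledge the comment describes.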
