Inventing contact schemas for fun and profit! (Ugh)

And then there were three.

Today, Yahoo! announced the public availability of their own Address Book API. Though Plaxo and LinkedIn have been using this API behind the scenes for a short while, today marks the first time the API is available for anyone who registers for an App ID to make use of the bi-directional protocol.

The API is shielded behind Yahoo!’s proprietary BBAuth protocol, which obviates the need for a third-party site to request Yahoo! member credentials when an import begins, as seen in this screenshot from LinkedIn (from April):

Now, like Joseph, I applaud the release of this API, as it provides one more means for individuals to have utter control over, and access to, their friends, colleagues and contacts using a robust protocol.

However, I have to lament yet more needless reinvention of contact schemas. Why is this a problem? Well, as I pointed out about Facebook’s approach to developing their own platform methods and formats, having to write and debug against yet another contact schema makes the “tax” of adding support for contact syncing and export increasingly onerous for sites and web services that want to better serve their customers by letting them host and maintain their address book elsewhere.

This isn’t just a problem that I have with Yahoo!. It’s something that I encountered last November with SREG and the proposed Attribute Exchange profile definition. And yet again when Google announced their Contacts API. And then again when Microsoft released theirs! Over and over again we’re seeing better ways of fighting the password anti-pattern flow of inviting friends to new social services, but each one means implementing support for yet another contact schema. What we need is one common contacts interchange format, and I strongly suggest that it inherit from vcard, with allowances or extension points for contemporary trends in social networking profile data.
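To make the extension-point idea concrete, here is a minimal Python sketch. It assumes vCard 3.0 (RFC 2426) as the base and uses the X- prefix that vcard already reserves for properties the core format lacks. The function names, the sample contact and the X-PROFILE-URL property are made up for illustration and are not part of any API discussed here; line folding and most properties are left out.

```python
def escape_text(value):
    """Escape a text value the way RFC 2426 describes:
    backslash, semicolon, comma and newline get a backslash escape."""
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def to_vcard(given, family, email, extensions):
    """Serialize one contact as a vCard 3.0 string (no line folding)."""
    full_name = f"{given} {family}"
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        f"FN:{escape_text(full_name)}",
        # N is family;given;additional;prefixes;suffixes
        f"N:{escape_text(family)};{escape_text(given)};;;",
        f"EMAIL;TYPE=INTERNET:{escape_text(email)}",
    ]
    for name, value in sorted(extensions.items()):
        # Hypothetical social-profile fields ride along as X- properties.
        lines.append(f"X-{name.upper()}:{escape_text(value)}")
    lines.append("END:VCARD")
    return "\r\n".join(lines) + "\r\n"


print(to_vcard("Ada", "Lovelace", "ada@example.com",
               {"profile-url": "http://example.com/ada"}))
```

A consumer that only understands plain vcard can skip the X- line and still read the rest, which is the property a shared interchange format would need.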

I’ve gone ahead and whipped up a comparison matrix between the primary contact schemas to demonstrate the mess we’re in.

Below, I have a subset of the complete matrix to give you a sense for where we’re at with OpenSocial (né GData), Yahoo Address Book API and Microsoft’s Windows Live Contacts API, and include vcard (RFC 2426) as the cardinal format towards which subsequent schemas should converge:

	vcard	OpenSocial 0.8	Windows Live Contacts API	Yahoo Address Book API
UID	uid url	id	cid	cid
Nickname	nickname	nickname	NickName	nickname
Full Name	n or fn	name	NameTitle, FirstName, MiddleName, LastName, Suffix	name
First name	n (given-name)	given_name	FirstName	name (first)
Last name	n (family-name)	family_name	LastName	name (last)
Birthday	bday	date_of_birth	Birthdate	birthday (day, month, year)
Anniversary			Anniversary	anniversary (day, month, year)
Gender	gender	gender	gender
Email	email	email	Email (ID, EmailType, Address, IsIMEnabled, IsDefault)	email
Street	street-address	street-address	StreetLine	street
Postal Code	postal-code	postal-code	PostalCode	zip
City	locality	locality
State	region	region	PrimaryCity	state
Country	country-name	country	CountryRegion	country
Latitude	geo (latitude)	latitude	latitude
Longitude	geo (longitude)	longitude	longitude
Language	N/A			N/A
Phone	tel (type, value)	phone (number, type)	Phone (ID, PhoneType, Number, IsIMEnabled, IsDefault)	phone
Timezone	tz	time_zone	TimeZone	N/A
Photo	photo	thumbnail_url		N/A
Company	org	organization.name	CompanyName	company
Job Title	title, role	organization.title	JobTitle	jobtitle
Biography	note	about_me		notes
URL	url	url	URI (ID, URIType, Name, Address)	link
Category	category, rel-tag	tags	Tag (ID, Name, ContactIDs)
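As a rough illustration of the “tax”, here is a small Python sketch that renames a handful of flat fields from the matrix above to their vcard equivalents. The field names come from the table; the dictionary layout, the source keys and the function name are my own assumptions, and real API responses are nested rather than flat, so treat this as a sketch rather than a working importer.

```python
# Flat field names copied from the comparison matrix above. The dictionary
# layout, the source keys and the function name are illustrative assumptions.
TO_VCARD = {
    "opensocial": {
        "id": "uid",
        "nickname": "nickname",
        "street-address": "street-address",
        "postal-code": "postal-code",
        "organization.name": "org",
        "organization.title": "title",
    },
    "windows_live": {
        "cid": "uid",
        "NickName": "nickname",
        "StreetLine": "street-address",
        "PostalCode": "postal-code",
        "CompanyName": "org",
        "JobTitle": "title",
    },
    "yahoo": {
        "cid": "uid",
        "nickname": "nickname",
        "street": "street-address",
        "zip": "postal-code",
        "company": "org",
        "jobtitle": "title",
    },
}


def normalize(source, record):
    """Rename known fields to vcard names; return (normalized, leftovers)."""
    mapping = TO_VCARD[source]
    normalized, leftovers = {}, {}
    for field, value in record.items():
        if field in mapping:
            normalized[mapping[field]] = value
        else:
            # Anything without a mapping is kept aside, not silently dropped.
            leftovers[field] = value
    return normalized, leftovers


yahoo_record = {"cid": "42", "zip": "94110", "jobtitle": "Designer", "anniversary": "n/a"}
print(normalize("yahoo", yahoo_record))
```

Three hand-maintained lookup tables for six fields, before touching structured values like email, phone or birthday, is the cost every new schema adds.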

Author: Chris Messina

Inventor of the hashtag. #1 Product Hunter. Techmeme Ride Home podcaster. Ever-curious product designer and technologist. Previously: Google, Uber, Republic, YC W'18.

19 thoughts on “Inventing contact schemas for fun and profit! (Ugh)”

Todd says:

Jun 5th at 5am

Those who can, do; those who can’t file for a patent, send cease & desists, ignore open standards and re-invent the wheel?
Joseph Smarr says:

Jun 5th at 6am

Well said, Chris! At Plaxo, we’ve had to implement and maintain support for all these disparate schemas (and more–outlook, mac, tbird, aim, etc.) so I’m painfully aware of the “tax” you’re talking about. It not only impacts developers (who have to do a lot more work), it in turn hurts users, because the empirical result is that fewer things talk to fewer other things when the cost of doing so is high, and in particular when the marginal cost of supporting one-more-source doesn’t ever go down. If only someone would develop a definitive open standard for sharing contact info… 😉
Stephen Paul Weber says:

Jun 5th at 7am

Totally agree. Better yet: why not just send *actual* vcf or hcard data in response to the API, instead of custom XML/ATOM/RSS/JSON ?
D. Lambert says:

Jun 5th at 8am

The real measuring stick here should be CRM systems. These systems have been storing contacts forever, and their schemas are generally deeper than any of these (multiple addresses, phone numbers, emails, etc).

I’d love to see a real standard emerge, but I believe it’s not going to gain wide acceptance if the schema can’t support existing data well.

I’d also like to see a field to indicate the “native” system for this user. Between that and the UID, I should be able to go back and find the original contact record in the original system of record whenever I want.
Chris Messina says:

Jun 5th at 9am

@Stephen: good question. I think it’s to simplify parsing and to also (in the case of hcard) make the data more lightweight. The idea is that you can go back and forth from a vcard-based protocol to hcard and so on… but we need something that looks more like an API to gain widespread adoption in various contexts (i.e. mobile, desktop)… We also need a two-way syncing protocol, and hcard just really isn’t up to that task (or at least I’ve never seen anyone do it).

@D. Lambert: I disagree. If anyone has an interest in having proprietary protocols and formats, it’s CRMs that need to keep you in their silos. On the open web, we’re seeing so many people wanting to connect with their friends between social networks (and inviting them from their address books) that we need something that works with those kinds of generic profiles. What we *don’t* need is a lot of expressiveness at this stage; instead we just need to standardize on the 80% use cases and get agreement (and then adoption there) and then we can talk about expanding into other attributes.

Again, the beauty of standards is that there are so many to choose from! Getting adoption around a core number of attributes is the present goal.
Kevin Marks says:

Jun 5th at 9am

URL is url in OpenSocial – profile_url is a special case field. OpenSocial did start with vcard, but grew based on converging fields from multiple existing social networks, and clarifying the missing or underspecified parts of vcard.
Chris Messina says:

Jun 5th at 9am

@Kevin Marks: was this research or competing schemas ever made public? This would be really valuable for comparison’s sake. Obviously convergence is what we’re aiming for here, but I’d love, for example, to be able to use this API inside of Address Book.app, in which case vcard-compatibility would be key.

Thoughts?
Christopher says:

Jun 7th at 4pm

Hey Chris,

Nice breakdown!

I will have to work with all these APIs at some point soon as people using what I am building will likely span all services (though I only really use one).

While I could simply focus on the initial service I am interested in and then write code for each of the next ones, I think it makes sense to create a read/write facade so I only need to deal with one API.

To keep it cross-platform and tool independent (most of my stuff is in C) I was thinking of creating a stylesheet with the physical implementation (how/when) of the transform left up to the developer. I could use the structure offered by OpenSocial as a baseline…

Does this make sense? Do you think this (the mapping stylesheet) would be something useful for the community? If so once done I could release it out into the wild for others to use.

Cheers,

Christopher
zooper says:

Jun 9th at 10pm

Well, this seems like a perfectly good place to propose which of the available formats are worth sitting down and evaluating.

Should we draw up needs or get suggestions for the best way forward using existing solutions (but prepared to shift if it is warranted)?
Brad Hafichuk says:

Jun 12th at 2pm

@joseph:

Any chance that you’d be able to open-source the code so we don’t all have to pay the “tax”? 😀
Earle Martin says:

Jun 22nd at 7pm

You should also include the W3C’s Contact schema, which has been around for almost eight years and is used in swathes of Semantic Web data. Following on from work by Norm Walsh, they also have some notes on modelling vCards in RDF.
Mark Wilkinson says:

Jul 9th at 6am

The thing is, vcard is a pretty hopeless starting point. There’s so much ambiguity in the spec and so much that it just doesn’t model properly that every application has to invent its own mechanism for representing its internal schema using vcard. As a result success rates moving contacts between applications using vcard are pretty low in my experience. I think the multiple schemas you’re seeing are a response to the vacuum that vcard has created by giving the impression of being a workable standard without actually being capable of solving the problems.

To give some concrete examples,
* The usual user-visible phone number and email address types (home, work, mobile, etc) aren’t mapped onto vcard type values in a standard way.
* The various parts of the ADR field are not standardised and don’t accurately capture international addresses. The *order* that the parts should be displayed in isn’t even consistent across applications.
* There’s no agreement about how to represent the contact data for an organization (as distinct from an individual).
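The first point can be sketched in a few lines of Python. The set of TEL type values is the one RFC 2426 defines, and "voice" is the default it names; the label table and the function name are hypothetical guesses of the kind each application ends up inventing for itself, not a standard mapping.

```python
# TYPE values that RFC 2426 defines for the TEL property.
VCARD_TEL_TYPES = {
    "home", "msg", "work", "pref", "voice", "fax", "cell",
    "video", "pager", "bbs", "modem", "car", "isdn", "pcs",
}

# Hypothetical mapping from user-visible labels to TEL types. Nothing in
# the spec says "mobile" means "cell"; every application makes its own guess.
LABEL_TO_TYPES = {
    "mobile": ["cell"],
    "cell": ["cell"],
    "home": ["home"],
    "work": ["work"],
    "office": ["work"],
    "home fax": ["home", "fax"],
    "work fax": ["work", "fax"],
}


def tel_types(label):
    """Guess vcard TEL types for a free-form phone label."""
    key = " ".join(label.lower().split())  # normalize case and spacing
    types = LABEL_TO_TYPES.get(key)
    if types is None:
        # Unknown label: fall back to the RFC 2426 default for TEL.
        return ["voice"]
    return list(types)


print(tel_types("Mobile"), tel_types("Work  Fax"), tel_types("Boat"))
```

Two applications with slightly different label tables will round-trip the same number into different types, which is the kind of loss described above.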

I mean, just have a look at http://microformats.org/wiki/vcard-implementations

The last time I looked at this stuff I came away amazed that any non-technical users manage to have any kind of success with this kind of thing. I was trying to write a script to merge together the contacts from my Series 60 Nokia phone, Evolution on Linux, Outlook and GMail, all exported in vcard format, and finished up with a fat ball of heuristics that probably only worked for the few hundred cards I was working with.

vcard is probably fine if you define the 80% use-case as ‘extract a list of email addresses we can spam’, but as the basis for an API it falls well short.