Feature request: OAuth in WordPress

Twitter / photomatt: @factoryjoe I would like OA...

In the past couple of days, there’s been a bit of a dust-up about some changes coming in WordPress 2.6 — namely, disabling the Atom and XML-RPC APIs by default.

The argument is that this will make WordPress more secure out of the box — but at what cost? Is there a better solution than disabling features and functionality outright (even if only a small subset of users currently makes use of these APIs), especially if the change ends up being short-sighted?

This topic hit the wp-xmlrpc mailing list, where the conversation quickly devolved into squabbling about SSL and other security-related topics.

Allan Odgaard (creator of TextMate, as far as I can tell!) even proposed inventing another authorization protocol.

Sigh.

There are a number of reasons why WordPress should adopt OAuth — and not just because we’re going to require it for DiSo.

Heck, Stephen Paul Weber already got OAuth + AtomPub working for WordPress and has completed a basic OAuth plugin. The pieces are nearly in place, not to mention that OAuth will pretty much be essential if WordPress is going to adopt OpenID at some point down the road. It will also be quite useful if folks want to post from, say, a Google Gadget or OpenSocial application (or similar) to a WordPress blog while the XML-RPC APIs are off by default (given Google’s wholesale embrace of OAuth).

Now, fortunately, folks within Automattic are supportive of OAuth, including Matt and Lloyd.

There are plenty of benefits to going down this path, not least the ability to scope third-party applications to certain permissions — like letting Facebook see your private posts but not edit or create new ones — or authorizing desktop applications to post new entries or upload photos or videos without your having to remember your username and password (instead you’d type in your blog address, and the application would discover the authorization endpoints using XRDS-Simple; Eran has more on discovery in Magic, People vs. Machines).
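
To make the desktop scenario concrete, here is a minimal sketch of OAuth’s HMAC-SHA1 request signing in Python; the endpoint URL, keys and token values are hypothetical placeholders for illustration, not details of any actual WordPress plugin:

import base64, hashlib, hmac, secrets, time
from urllib.parse import quote

def oauth_signature(method, url, params, consumer_secret, token_secret=""):
    # Percent-encode per OAuth: only letters, digits, and "-._~" stay unescaped.
    enc = lambda s: quote(str(s), safe="-._~")
    # Signature base string: METHOD & encoded URL & encoded, sorted parameters.
    normalized = "&".join(f"{enc(k)}={enc(v)}" for k, v in sorted(params.items()))
    base = "&".join([method.upper(), enc(url), enc(normalized)])
    key = (enc(consumer_secret) + "&" + enc(token_secret)).encode()
    return base64.b64encode(hmac.new(key, base.encode(), hashlib.sha1).digest()).decode()

# Hypothetical endpoint and credentials, purely for illustration.
url = "http://blog.example.com/wp-app.php/service/posts"
params = {
    "oauth_consumer_key": "desktop-client-key",
    "oauth_token": "user-access-token",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": str(int(time.time())),
    "oauth_nonce": secrets.token_hex(8),
    "oauth_version": "1.0",
}
params["oauth_signature"] = oauth_signature("POST", url, params, "consumer-secret", "token-secret")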

Anyway, WordPress and OAuth are natural complements, and with popular support and momentum behind the protocol, it’s tragic to see needless reinvention when so many modern applications have the same problem of delegated authorization.

I see this as a tremendous opportunity for both WordPress and OAuth, and I look forward to discussing it (at the least as a consideration for WordPress 2.7) at tonight’s meetup — for which I’m now late! Doh!

The Social Web TV pilot episode

http://www.viddler.com/player/2cf46be8/

My buddies John McCrea and Joseph Smarr have started up a show called The Social Web and have released the pilot episode, featuring David Recordon on the hubbub between Google and Facebook following last week’s Supernova Conference.

As they point out, things are changing and happening so fast in the industry that a show like this, one that cuts through the FUD and marketing hype, is really necessary. I hope to participate in future episodes — and would love to hear suggestions or recommendations for topics or guests for upcoming episodes.

Here’s the FriendFeed room Dave mentioned.

Announcing Emailtoid: mapping email addresses to OpenIDs

The other night at Beer and Blog in Portland, fellow Vidooper Michael T Richardson announced and launched a new service that I’m both excited and a little apprehensive about.

The service is called Emailtoid, and while I prefer to pronounce it “email-toyed”, others might pronounce it “email two eye-dee”. And depending on your pronunciation, you might realize that this service is about using an email address as an ID — specifically an OpenID.

This is not a new idea, and it’s one that’s been debated and discussed in the OpenID community an awful lot, culminating in a rough outline of how it might work by Brad Fitzpatrick following the Social Graph FOO Camp this past spring, which David Fuelling turned into an early draft spec.

Well, we looked at this work and this discussion and felt that sooner or later, in spite of all the benefits of using actual URLs for identity, someone needed to take the lead and actually build out this concept so we’d have something real to banter about.

The pragmatic reality is that many people are comfortable using email addresses as their identity online when signing up for new services; furthermore, many, many more people have email addresses than have URLs or homepages that they call their own (or can readily identify). Forcing people to learn yet another form of identifier for the web, just to satisfy the design of a protocol, offers arguably marginal value and a lesser user experience; it simply doesn’t make sense. Put another way: the limitations of the technology should not be forced on end users, especially when they don’t need to be. And that’s why Emailtoid is a necessary experiment in advancing identity on the web.

How it works

Emailtoid is a very simple service, and in fact is designed for obsolescence. It’s meant as a fallback for now, enabling relying parties to accept email addresses as identifiers without requiring the generation of a new local password and without requiring the address owner to give up or reveal their existing email credentials (otherwise known as the “password anti-pattern”).

Enter your email - Emailtoid

The flow works like this (a rough code sketch follows the list):

  1. Users enter either an OpenID or email address into a typical OpenID input field. For the purpose of this flow, we’ll presume an email address is used.
  2. The relying party splits the email address at the ‘@’ symbol into username and domain, generating a directed identity request to the email domain. If an XRDS, Yadis or XRDS-Simple document is discovered at the domain, the typical OpenID flow is invoked.
  3. If no discovery document is found, the service falls back to Emailtoid (sending a request like http://emailtoid.net/mapper?email=jane@example.com), where users verify that they own the supplied email address by providing the one-time access token that Emailtoid mailed to them.
  4. At this point, users may optionally associate an existing OpenID with their email address, or use the OpenID auto-generated by Emailtoid. Emailtoid is not intended to serve as a full-featured OpenID provider, and we encourage using an OpenID from a third-party OpenID provider.
  5. In the case where users supply and verify their own OpenID, Emailtoid will create a 302 HTTP redirect removing Emailtoid from future interactions completely.
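
Here is a rough Python sketch of the relying-party side of that flow, assuming a hypothetical discover_openid_endpoint helper in place of full Yadis/XRDS discovery; only the emailtoid.net mapper URL comes from the service itself:

from urllib.parse import urlencode

def discover_openid_endpoint(url):
    """Stub for Yadis/XRDS(-Simple) discovery: a real implementation would fetch
    the URL and follow an X-XRDS-Location header or embedded XRDS document."""
    return None

def openid_endpoint_for(identifier):
    """Resolve an OpenID URL or email address to an authorization endpoint."""
    if "@" not in identifier:
        return discover_openid_endpoint(identifier)   # ordinary OpenID flow
    username, domain = identifier.split("@", 1)
    # Step 2: attempt directed-identity discovery against the email domain.
    endpoint = discover_openid_endpoint("http://" + domain)
    if endpoint:
        return endpoint
    # Step 3: no discovery document, so fall back to the Emailtoid mapper.
    return "http://emailtoid.net/mapper?" + urlencode({"email": identifier})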

Should an email provider supply a discovery document after an Emailtoid mapping has been made, the new mapping will take precedence.

Opportunities and issues

The drive behind Emailtoid, again, is to reduce the friction of OpenID by reusing familiar identifiers (i.e. email addresses). Clearly the challenges of achieving OpenID adoption are not simply technological; to a great degree they hinge on the user experience becoming more streamlined and delivering on the promise of greater security and convenience.

Therefore, if a service advertises that they support signing in with an email address, they must keep that promise.

Unfortunately, until all email providers do some kind of local resolution and OpenID authentication, we will need a centralized mapper such as Emailtoid to provide the fallback mapping. And therein lies the rub, defeating some of the distributed design of OpenID.

If anything, Emailtoid is intended to drive forward a conversation about the experience of OpenID, and about how we can make the protocol compatible with, or complementary to, existing and well-known means of identifying oneself on the web. Is it a final solution? Probably not — but it’s up, it’s running, it works, and it forces us to look critically at the question of emails as OpenIDs, now that we can actually experience the flow, and the feeling, of entering an email address into an OpenID box without ever having to enter, or create, another unnecessary password.

A conversation about social network interop and activity stream relevance

Brian Oberkirch captured some video today of a conversation between him, David Recordon and me at GSP East about social network interop, among other things.

Hot on the heels of my last post, this conversation is rather timely!

Adding richness to activity streams

This is a post I’ve wanted to write for a while but simply haven’t gotten around to. Following my panel with Dave Recordon (Six Apart), Dave Morin (Facebook), Adam Nash (LinkedIn), Kevin Chou (Watercooler, Inc.) and Sean Ammirati (ReadWriteWeb) on Social Networks and the NEED for FEEDs, it only seems appropriate that I finally get this out.

The basic premise is this: lifestreams, alternatively known as “activity streams”, are great for discovering and exploring social media, as well as for keeping up to date with friends (witness the main feature of Facebook and the rise of FriendFeed). I suggest that, with a little effort on the publishing side, activity streams could become much more valuable: easier for web services to consume and interpret, and better suited to filtering and weighting shared activities, so that people can get at relevant information from the people they care about, as it happens.

By marking up social activities and social objects, delivered in standard feeds with microformats, I think we enable anyone to run a FriendFeed-like service that innovates and offers value based on how well it understands what’s going on and what’s relevant, rather than on its compatibility with any and every service.

Contemporary example activities

Here are the kinds of activities that I’m talking about (note that some services expand these with thumbnail previews):

  • Eddie updated his resume at LinkedIn.
  • Chris listened to “I Will Possess Your Heart” by Death Cab for Cutie on Pandora.
  • Brynn favorited a photo on Flickr.
  • Dave posted a message to Twitter via SMS.
  • Gary poked Kastner.
  • Leah bought The Matrix at Amazon.com.

Prior art

Both OpenSocial and Facebook provide APIs for creating new activities that will show up in someone’s activity stream or newsfeed.

Movable Type and the DiSo Project both have Action Stream plugins. And there are countless related efforts. Clearly there’s existing behavior out there… but how should we go about improving it, when the primary requirement is merely the title of an action, with little, if any, guidance on how to provide more details about a given activity?

Components of an activity

Not surprisingly, a lot of activities provide what all good news stories provide: the who, what, when, where and, sometimes, the how.

Let’s take a look at an example, with these components called out (a structural sketch in code follows the list):

e.g. Chris started listening to a station on Pandora 3 hours ago.

  • actor/subject (noun/pronoun)
  • action (verb)
  • social object (noun)
  • where (place)
  • when (time)
  • (how the object was created)
  • (expanded view of object)
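
None of this is part of any spec yet, but as a structural sketch, those components map naturally onto a simple record (the field names here are my own invention):

from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Activity:
    actor: str                   # who: noun/pronoun, e.g. "Chris"
    verb: str                    # action, e.g. "listened-to"
    object: str                  # social object, e.g. a Pandora station
    published: datetime          # when
    place: Optional[str] = None  # where, if known
    via: Optional[str] = None    # how the object was created, e.g. "SMS"
    detail: Optional[str] = None # expanded view of the object

# e.g. "Chris started listening to a station on Pandora 3 hours ago."
example = Activity(actor="Chris", verb="listened-to",
                   object="a station on Pandora",
                   published=datetime(2008, 6, 1, 9, 0), via="web")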

Now, I’ll grant that not all activities follow this exact format, but the majority seem to.

I should point out one alternative: collective actions.

e.g. Chris and Dave Morin are now friends.

…but these might be better created as a post-processing step once we add the semantic salt to the original updates. Maybe.

Class actions

One of the assumptions I’m making is that there is some regularity and uniformity in activity streams. Moreover, there have emerged some basic classes of actions that appear routinely and that could be easily expressed with additional semantics.

To that end, I’ve started compiling such activities on the DiSo wiki.

Once we have settled on the base set of classes, we can start to develop common classnames and presentation templates. To start, we have: changed status or presence, posted messages or media, rated and favorited, friended/defriended, interacted with someone (i.e. “poking”), bookmarked, and consumed something (attended…, watched…, listened to…).

Combining activities with bundling

The concept of bundling is already present in OpenSocial and works for combining multiple activities of the same kind into a group:

FriendFeed Activity Bundling

This can also be used to bundle different kinds of activities for a single actor:

e.g. Chris watched The Matrix, uploaded five photos, attended an event and became friends with Dave.

From a technical perspective, bundling provides a mechanism for batching service-to-service operations, as defined in PaceBatch.

Bundling is also useful for presenting paged or “continued…” activities, as Facebook and FriendFeed do.
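
As a sketch of the consuming side, and reusing the hypothetical Activity record from earlier, bundling can be as simple as grouping activities by actor and verb:

from itertools import groupby

def bundle(activities):
    """Collapse a list of activities into (actor, verb) bundles, e.g. five
    photo uploads by Chris become one bundle that a renderer can summarize
    as 'Chris uploaded five photos.'"""
    keyfn = lambda a: (a.actor, a.verb)
    ordered = sorted(activities, key=keyfn)  # groupby needs adjacent keys
    return {key: list(group) for key, group in groupby(ordered, key=keyfn)}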

Advanced uses

I’d like to describe two advanced uses that inherit from my initial proposal for Twitter Hashtags: filtering and creating a distributed track-like service.

In the DiSo model, we use (or will use) AtomPub (and someday XMPP) to push new activities out to anyone who has chosen to follow their author. Because the model is push-based, activities are delivered as they happen to anyone who has subscribed to receive them. On the receiving end, this means that we can filter on any number of criteria, such as actor, activity type, content of the activity (keywords or tags), age of the action, location, how an activity was created (was this message auto-generated from Brightkite or sent in by SMS?), or any combination thereof.

This is useful if you want to follow certain activities of your friends more closely than others, or if you only care about, say, the screenshots I upload to Flickr but not the stuff I tweet about.
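
Receiving-side filtering then reduces to applying predicates to each pushed activity; here is a minimal sketch against the hypothetical Activity record above:

def matches(activity, actor=None, verb=None, via=None, keywords=()):
    """True if an incoming activity satisfies every supplied criterion."""
    if actor and activity.actor != actor:
        return False
    if verb and activity.verb != verb:
        return False
    if via and activity.via != via:
        return False
    text = (activity.detail or "") + " " + activity.object
    return all(k.lower() in text.lower() for k in keywords)

# e.g. surface only screenshots uploaded to Flickr, ignoring tweets:
# [a for a in incoming if matches(a, verb="uploaded-photo", keywords=("screenshot",))]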

Tracking can work in two ways. In the first, your own self-hosted service knows how to elevate certain types of received activities, which are then passed to your messaging hub and routed appropriately… for example, when Mom checks in using Brightkite at the airport (or within some distance radius).

In the second, individuals could choose to publish their activities to some third-party aggregator (like Summize) that does the tracking on their behalf, pushing back activities it discovers that match criteria you set, and forwarding those activities to your messaging hub.

It might not have the legs that a centralized service like Twitter has, especially to start, but if Technorati were looking for a new raison d’être, this might be it.

This is a 30,000-foot view

I was scant on code in this post, but given how long it is already, I’d rather just start throwing code into the output of the activity streams generated by the Action Streams plugins and see how live code holds up in the wild.

I also don’t want to confuse too many implementation details with the broader concept and need, which again is to make activity streams richer by standardizing on some specific semantics based on actual trends.

I’d love feedback, more pointers to prior art, or alternative suggestions for how any of the above could be technically achieved using open technologies.

Thoughts on dynamic privacy

A highly touted aspect of Facebook Connect is the notion of “dynamic privacy”:

As a user moves around the open Web, their privacy settings will follow, ensuring that users’ information and privacy rules are always up-to-date. For example, if a user changes their profile picture, or removes a friend connection, this will be automatically updated in the external website.

Over the course of the Graphing Social Patterns East conference here in DC, Dave Morin and others from Facebook’s Developer Platform have made many a reference to this scheme but have provided frustratingly scant detail on how it will actually work.

Friend Connect - Disable by Facebook

In a conversation with Brian Oberkirch and David Recordon, it dawned on me that the pieces for Dynamic Privacy are already in place and that, to some degree, it seems that it’s really just a matter of figuring out how to effectively enforce policy across distributed systems in order to meet user expectations.

MySpace has actually made similar announcements with its Data Availability approach, and if you read carefully, you can spot the fundamental rift between the OpenSocial and Facebook platforms:

Additionally, rather than updating information across the Web (e.g. default photo, favorite movies or music) for each site where a user spends time, now a user can update their profile in one place and dynamically share that information with the other sites they care about. MySpace will be rolling out a centralized location within the site that allows users to manage how their content and data is made available to third party sites they have chosen to engage with.

Indeed, Recordon wrote about this on O’Reilly Radar last month (emphasis original):

He explained that MySpace said that due to their terms of service the participating sites (e.g. Twitter) would not be allowed to cache or store any of the profile information. In my mind this led to the Data Availability API being structured in one of two ways: 1) on each page load Twitter makes a request to MySpace fetching the protected profile information via OAuth to then display on their site or 2) Twitter includes JavaScript which the browser then uses to fill in the corresponding profile information when it renders the page. Either case is not an example of data portability no matter how you define the term!

Embedding vs sharing

So the major difference here is in the mechanism of data delivery and how the information is “leased” or “tethered” to the original source, such that, as Morin said, “when a user deletes an item on Facebook, it gets deleted everywhere else.”

The approach taken by Google Gadgets, and hence OpenSocial, for the most part, has been to tether data back to the source via embedded iframes. This means that if someone deletes or changes a social object, it will be deleted or changed across OpenSocial containers, though the containers won’t even notice the difference, since they never had access to the data to begin with.

The approach that seems likely from Facebook can be intuited by scouring their developer’s terms of service (emphasis added):

You can only cache user information for up to 24 hours to assist with performance.

2.A.4) Except as provided in Section 2.A.6 below, you may not continue to use, and must immediately remove from any Facebook Platform Application and any Data Repository in your possession or under your control, any Facebook Properties not explicitly identified as being storable indefinitely in the Facebook Platform Documentation within 24 hours after the time at which you obtained the data, or such other time as Facebook may specify to you from time to time;

2.A.5) You may store and use indefinitely any Facebook Properties that are explicitly identified as being storable indefinitely in the Facebook Platform Documentation; provided, however, that except as provided in Section 2.A.6 below, you may not continue to use, and must immediately remove from any Facebook Platform Application and any Data Repository in your possession or under your control, any such Facebook Properties: (a) if Facebook ceases to explicitly identify the same as being storable indefinitely in the Facebook Platform Documentation; (b) upon notice from Facebook (including if we notify you that a particular Facebook User has requested that their information be made inaccessible to that Facebook Platform Application); or (c) upon any termination of this Agreement or of your use of or participation in Facebook Platform;

2.A.6) You may retain copies of Exportable Facebook Properties for such period of time (if any) as the Applicable Facebook User for such Exportable Facebook Properties may approve, if (and only if) such Applicable Facebook User expressly approves your doing so pursuant to an affirmative “opt-in” after receiving a prominent disclosure of (a) the uses you intend to make of such Exportable Facebook Properties, (b) the duration for which you will retain copies of such Exportable Facebook Properties and (c) any terms and conditions governing your use of such Exportable Facebook Properties (a “Full Disclosure Opt-In”);

2.B.8) Notwithstanding the provisions of Sections 2.B.1, 2.B.2 and 2.B.5 above, if (and only if) the Applicable Facebook User for any Exportable Facebook Properties expressly approves your doing so pursuant to a Full Disclosure Opt-In, you may additionally display, provide, edit, modify, sell, resell, lease, redistribute, license, sublicense or transfer such Exportable Facebook Properties in such manner as, and only to the extent that, such Applicable Facebook User may approve.

This is further expanded in the platform documentation on Storable Information:

Per the Developer Terms of Service, you may not cache any user data for more than 24 hours, with the exception of information that is explicitly “storable indefinitely.” Only the following parameters are storable indefinitely; all other information must be requested from Facebook each time.

The storable IDs enable you to keep unique identifiers for Facebook elements that correspond to data gathered by your application. For instance, if you collected information about a user’s musical tastes, you could associate that data with a user’s Facebook uid.

However, note that you cannot store any relations between these IDs, such as whether a user is attending an event. The only exception is the user-to-network relation.

I imagine that Facebook Connect will work by “leasing” or “sharing” information to remote sites, requiring them, through agreement and compliance with Facebook’s terms, to check in periodically (or to receive directives through a push mechanism) for changes to data, and to flush caches of stored data every 24 hours or less.

In either model there is still a central provider and store of the data, but the question for implementation really comes down to whether a remote site ever has direct access to the data, and if so, how long it is allowed to store it.
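
In code, the “leased” model implies something like the following sketch: cached properties expire after 24 hours and must be re-fetched, and a push directive can revoke them sooner. The fetch_from_provider function is a hypothetical stand-in for a call back to the Facebook API:

import time

LEASE_SECONDS = 24 * 60 * 60  # the 24-hour cache ceiling from Facebook's terms

def fetch_from_provider(key):
    """Stub: a real implementation would call back to the provider's API."""
    raise NotImplementedError

class LeasedCache:
    def __init__(self):
        self._store = {}  # key -> (value, fetched_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry and time.time() - entry[1] < LEASE_SECONDS:
            return entry[0]                      # lease still valid
        value = fetch_from_provider(key)         # lease expired: re-fetch
        self._store[key] = (value, time.time())
        return value

    def revoke(self, key):
        """Honor a push directive: the user deleted or restricted this item."""
        self._store.pop(key, None)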

Of note is the OpenSocial RESTful API, which provides a web-friendly mechanism for addressing and defining resources. Recordon pointed out to me that this API affords all the mechanisms necessary to implement the “leased” model of data access (rather than the embedded model), but leaves it up to the OpenSocial applications and containers to set and enforce their own data access policies.

…Which is a world of difference from Facebook’s approach to date, for which there is neither code nor a spec nor an open discussion about how they’re thinking through the thorny issues involved in making decisions around data access, data control, “tethering” and “portability”. While folks like Plaxo and Yahoo are actually shipping code, Facebook is still posturing, telling us to “wait and see”. With something so central and so important, it’s disheartening that Facebook’s “Open” strategy is anything but open, and everything less than transparent.

Inventing contact schemas for fun and profit! (Ugh)

And then there were three.

Today, Yahoo! announced the public availability of their own Address Book API. Though Plaxo and LinkedIn have been using this API behind the scenes for a short while, today marks the first time the API is available for anyone who registers for an App ID to make use of the bi-directional protocol.

The API is shielded behind Yahoo!’s proprietary BBAuth protocol, which obviates the need to request Yahoo! member credentials at the time of import initiation, as seen in this screenshot from LinkedIn (from April):

LinkedIn: Expand your network

Now, like Joseph, I applaud the release of this API, as it provides one more means for individuals to have utter control and access to their friends, colleagues and contacts using a robust protocol.

However, I have to lament yet more needless reinvention of contact schema. Why is this a problem? Well, as I pointed out about Facebook’s approach to developing their own platform methods and formats, having to write and debug against yet another contact schema makes the “tax” of adding support for contact syncing and export increasingly onerous for sites and web services that want to better serve their customers by letting them host and maintain their address book elsewhere.

This isn’t just a problem I have with Yahoo!. It’s something I encountered last November with SREG and the proposed Attribute Exchange profile definition. And yet again when Google announced their Contacts API. And then again when Microsoft released theirs! Over and over again we’re seeing better ways of fighting the password anti-pattern in the flow of inviting friends to new social services, but each requires implementing support for yet another contact schema. What we need is one common contacts interchange format, and I strongly suggest that it inherit from vcard, with allowances or extension points for contemporary trends in social networking profile data.

I’ve gone ahead and whipped up a comparison matrix between the primary contact schemas to demonstrate the mess we’re in.

Below, I have a subset of the complete matrix to give you a sense of where we’re at with OpenSocial (né GData), the Yahoo Address Book API and Microsoft’s Windows Live Contacts API, and I include vcard (RFC 2426) as the cardinal format towards which subsequent schemas should converge:

| Field | vcard | OpenSocial 0.8 | Windows Live Contacts API | Yahoo Address Book API |
|---|---|---|---|---|
| UID | uid | url, id | cid | cid |
| Nickname | nickname | nickname | NickName | nickname |
| Full Name | n or fn | name | NameTitle, FirstName, MiddleName, LastName, Suffix | name |
| First name | n (given-name) | given_name | FirstName | name (first) |
| Last name | n (family-name) | family_name | LastName | name (last) |
| Birthday | bday | date_of_birth | Birthdate | birthday (day, month, year) |
| Anniversary | | | Anniversary | anniversary (day, month, year) |
| Gender | | gender | gender | gender |
| Email | email | email | Email (ID, EmailType, Address, IsIMEnabled, IsDefault) | email |
| Street | street-address | street-address | StreetLine | street |
| Postal Code | postal-code | postal-code | PostalCode | zip |
| City | locality | locality | | |
| State | region | region | PrimaryCity | state |
| Country | country-name | country | CountryRegion | country |
| Latitude | geo (latitude) | latitude | latitude | |
| Longitude | geo (longitude) | longitude | longitude | |
| Language | | N/A | N/A | |
| Phone | tel (type, value) | phone (number, type) | Phone (ID, PhoneType, Number, IsIMEnabled, IsDefault) | phone |
| Timezone | tz | time_zone | TimeZone | N/A |
| Photo | photo | thumbnail_url | N/A | |
| Company | org | organization.name | CompanyName | company |
| Job Title | title, role | organization.title | JobTitle | jobtitle |
| Biography | note | about_me | | notes |
| URL | url | url | URI (ID, URIType, Name, Address) | link |
| Category | category, rel-tag | tags | Tag (ID, Name, ContactIDs) | |
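
The practical “tax” shows up as mapping shims like the sketch below, which renames a handful of fields from each provider into vcard-style names. The key sets mirror the matrix above, while the record shapes are simplified stand-ins rather than the APIs’ exact payloads:

# Per-provider field names -> vcard-ish names (a small subset of the matrix).
OPENSOCIAL = {"nickname": "nickname", "given_name": "n.given-name",
              "family_name": "n.family-name", "date_of_birth": "bday"}
WINDOWS_LIVE = {"NickName": "nickname", "FirstName": "n.given-name",
                "LastName": "n.family-name", "Birthdate": "bday"}
YAHOO = {"nickname": "nickname", "birthday": "bday", "zip": "postal-code"}

def normalize(record, mapping):
    """Rename one provider's contact fields to the common vcard-derived names."""
    return {mapping[field]: value for field, value in record.items() if field in mapping}

# e.g. normalize({"FirstName": "Joseph", "LastName": "Smarr"}, WINDOWS_LIVE)
# -> {"n.given-name": "Joseph", "n.family-name": "Smarr"}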

Parsing the “open” in Facebook’s “fbOpen” platform

Yesterday, as expected, Facebook revealed the code behind its F8 platform, a little over a year after its launch, offering it under the Common Public Attribution License (CPAL).

I can’t help but notice the glaring addition of Section 15: Network Use and Exhibits A and B to the CPAL. But I’ll dive into those issues in a moment.

For now it is worth reviewing Facebook’s release in the context of the OSI’s definition of open source; of particular interest are the first three criteria: Free Redistribution, Source Code, and Derived Works. Arguably Facebook’s use of the CPAL so far fits the OSI’s definition. It’s when we get to the ninth criterion (License Must Not Restrict Other Software) that it becomes less clear whether Facebook is actually offering “open source” code, or is simply diluting the term for its own gain, given the attribution requirement imposed in Exhibit B:

Each time an Executable, Source Code or Larger Work is launched or initially run (including over a network), a display of the Attribution Information must occur on the graphic user interface employed by the end user to access such Covered Code (which may include a splash screen).

In other words, any derivative work cleft from the rib of Facebook must visibly bear the mark of the “Initial Developer”, namely, Facebook, Inc., and include the following:

Attribution Copyright Notice: Copyright © 2006-2008 Facebook, Inc.
Attribution Phrase (not exceeding 10 words): Based on Facebook Open Platform
Attribution URL: http://developers.facebook.com/fbopen
Graphic Image as provided in the Covered Code: http://developers.facebook.com/fbopen/image/logo.png

Most curious of all is how Facebook addressed a long-held concern of Tim O’Reilly that open source licenses are obsolete in the era of network computing and Web 2.0 (emphasis original):

…it’s clear to me at least that the open source activist community needs to come to grips with the change in the way a great deal of software is deployed today.

And that, after all, was my message: not that open source licenses are unnecessary, but that because their conditions are all triggered by the act of software distribution, they fail to apply to many of the most important types of software today, namely Web 2.0 applications and other forms of software as a service.

And in the Facebook announcement, Ami Vora states:

The CPAL is community-friendly and reflects how software works today by recognizing web services as a major way of distributing software.

Thus Facebook neatly skirts this previous limitation in most open source licenses by amending Section 15 to the CPAL, explicitly covering “Network Use”:

The term ‘External Deployment’ means the use, distribution, or communication of the Original Code or Modifications in any way such that the Original Code or Modifications may be used by anyone other than You, whether those works are distributed or communicated to those persons or made available as an application intended for use over a network. As an express condition for the grants of license hereunder, You must treat any External Deployment by You of the Original Code or Modifications as a distribution under section 3.1 and make Source Code available under Section 3.2.

I read this as referring to network deployments of the Facebook platform on other servers (or made available as a web service), forcing both the release of code modifications that hit the public wire and the display of the “Attribution Information” noted above.

. . .

So okay, first of all, we’re not really dealing with the true historic definition of open source here, but we can mince words later. The code is available, is free to be tinkered with, reviewed, built on top of, redistributed (with that attribution restriction) and there’s even a mechanism for providing feedback and logging bugs. Best of all, if you submit a patch that is accepted, they’ll send you a Facebook T-shirt! (Wha-how! Where do I sign up?!)

Not ironically, Facebook’s approach here smells an awful lot like Microsoft’s Shared Source Initiative (some background). Consider the purpose of one of Microsoft’s three Shared Source licenses, the so-called “Reference License”:

The Microsoft Reference License is a reference-only license that allows licensees to view source code in order to gain a deeper understanding of the inner workings of a given technology. It does not allow for modification or redistribution. Microsoft uses this license primarily for technologies such as its development libraries.

Now compare that with the language of Facebook’s announcement:

The goal of this release is to help you as developers better understand Facebook Platform as a whole and more easily build applications, whether it’s by running your own test servers, building tools, or optimizing your applications on this technology. We’ve built in extensibility points, so you can add functionality to Facebook Open Platform like your own tags and API methods.

While it’s certainly conceivable that intrepid entrepreneurs may decide to extend the platform and release their own implementations (which would arguably require a considerable amount of effort and infrastructure to duplicate the still-proprietary innards of Facebook proper — remember that the fbOpen platform IS NOT Facebook), they’d still need to attach the Facebook brand to their derivative work and open source their modifications under a CPAL-compatible license (read: not the GPL).

In spite of all this, whether Facebook is really offering a “true” open source product or not is not the important thing. I’m raising these issues simply to put this move into a broader context, highlighting some important decision points where Facebook zagged where others might have zigged, based on its own priorities and aspirations for the move. Put simply: Facebook’s approach to open source is nothing like Google’s, and it’s critical that people considering building on either the fbOpen platform or OpenSocial do themselves a favor and familiarize themselves with the many essential differences.

Furthermore, in light of my recent posts, it occurs to me that the nature of open source is changing (or being changed) by the accelerating move to cloud computing architectures, where source code is no longer necessarily a strategic asset, but where durable and ongoing access to data is the primary concern (harkening to Tim O’Reilly’s frequent “Data is the Intel Inside” quip), and that Facebook is the first of a new class of enterprises growing up after open source.

I hope to expand on this line of thinking, but I’m starting to wonder — with open source becoming essentially passé nowadays — did we win? Are we on top? Hurray? Or did we bet on the wrong horse? Or did the goalposts just move on us (again)? Or is this just the next stage in an ongoing, ever-volatile struggle to balance the needs of business models that tend towards centralization against those more free-form, freedom-seeking and expanding models where information and knowledge must diffuse, and must seek out growth and new hosts in order to continue to become more valuable? Pointing again to Tim’s contention that Web 2.0 is at least partly about harnessing collective intelligence, and that data sources that grow richer as more people use them are a facet of the landscape, what does openness mean now? What barriers do we need to dismantle next? If it’s no longer the propriety of software code, then is it time we began, in earnest, to scale the walls of the proprietary data hoarders and collectors and take back (or re-federate) what might be rightfully ours — or what we should at least be given permanent access to? Hmm?


Facebook, the USSR, communism, and train tracks

Low hills closed in on either side as the train eventually crawled on to high, tabletop grasslands creased with snow. Birds flew at window level. I could see lakes of an unreal cobalt blue to the north. The train pulled into a sprawling rail yard: the Kazakh side of the Kazakhstan-China border.

Workers unhitched the cars, lifted them, one by one, ten feet high with giant jacks, and replaced the wide-gauge Russian undercarriages with narrower ones for the Chinese tracks. Russian gauges, still in use throughout the former Soviet Union, are wider than the world standard. The idea was to prevent invaders from entering Russia by train. The changeover took hours.

— Robert D. Kaplan, The Ends of the Earth

I read this passage today while sunning myself at Hope Springs Resort near Palm Springs. Tough life, I know.

The passage above immediately made me think of Facebook, and I had visions of the old Facebook logo with a washed-out Stalin face next to the wordmark (I’m a visual person). But the thought came from some specific recent developments, and it fit into a broader framework that I talked about loosely with Steve Gillmor on his podcast. I also wrote about it last week, essentially calling for Facebook and Google to come together to co-develop standards for the social web. But, having been reading up on Chinese, Russian, Turkish and Central Asian history, and being a benefactor of the American enterprise system, I’m coming around to Eran’s and others’ point that 1) it’s too early to standardize and 2) it probably isn’t necessary anyway. Go ahead, let a thousand flowers bloom.

If I’ve learned anything from Spread Firefox, BarCamp, coworking and the like, it’s that propaganda needs to be free to be effective. In other words, you’re not going to convince people of your way of thinking if you lock down what you have, especially if what you have is culture, a mindset or some other philosophical approach that helps people narrow down what constitutes right and wrong.

Look, if Martin Luther had nailed his Ninety-five Theses to the door but had ensconced them in DRM, he would not have been as effective at bringing about the Reformation.

Likewise, the future of the social web will not be built on proprietary, closed-source protocols and standards. Therefore, it should come as no surprise that Google wants OpenSocial to be an “open standard” and Facebook wants to be the openemest of them all!

The problem is not about being open here. Everyone gets that there’s little marginal competitive advantage to keeping your code closed anymore. Keeping your IP cards close to your chest makes you a worse card player, not a better one. The problem is with adoption, gaining and maintaining [developer] interest, and stoking distribution. And that brings me to the fall of Communism and the USSR, back where I started.

I wasn’t alive back when the Cold War was in its heyday. Maybe I missed something, but let’s just go on the assumption that things are better off now. From what I’m reading in Kaplan’s book, I’d say that the Soviets left not just social, but environmental disaster in their wake. The whole region of Central Asia, at least in the late 90s, was fucked. And while there are many causes, more complex than I can probably comprehend, a lot of it seems to have to do with a lack of cultural identity and a lack of individual agency in the areas affected by, or left behind by, Communist rule.

Now, when we talk about social networks, I mean, c’mon, I realize that these things aren’t exactly nations, nation-states or even tribal groups warring for control of natural resources, food, potable water, and so forth. BUT, the members of social networks number in the millions in some cases, and it would be foolish not to appreciate that the borders — the meticulously crafted hardline boundaries between digital nation-states — are going to be redrawn when the battle for cultural dominance between Google (et al) and Facebook is done. It’s not the same caliber of détente that we saw during the Cold War, but it’s certainly a situation where two sides with very different ideological bents are competing to determine the nature of the future of the [world]. On the one hand, we have a nanny state that thinks it knows best and needs to protect its users from themselves, and on the other, a laissez-faire-trusting band of bros who are looking to the free market to inform the design of the Social Web writ large. On the one hand, there’s uncertainty about how to build a “national identity”-slash-business on top of lots of user data (that, oh yeah, I thought was supposed to be “owned” by its creators), and on the other, a model of the web that embraces all its failings, nuances and spaghetti code, but that, more than likely, will stand the test of time as a durable provider of the kind of liberty, agency and free choice that wins out time and again throughout history.

That Facebook is attempting to open source its platform, to me, sounds like offering the world a different rail gauge specification for building train tracks. It may be better, it may be slicker, but the flip side is that the Russians used the same tactic to keep anyone from having any kind of competitive advantage over their people or influence over how they did business. You can do the math, but look where it got ’em.

S’all I’m sayin’.

Machine tagging relationships

I’ve been doing quite a bit of thinking about how to represent relationships in portable contact lists. Many of my concerns stem from two basic problems:

  1. Relationships in one context don’t necessarily translate directly into new contexts. When we talk about making relationships “portable”, we can’t forget that a friend on one system isn’t necessarily the same kind of friend on another system (if at all), even if the other context uses the same label.
  2. The semantics of a relationship should not form the basis for globally setting permissions. That is, just because someone is marked (perhaps accurately) as a family member does not always mean that that individual should be granted elevated permissions. While this approach works for Flickr, where how you classify a relationship (Contact, Friend, Family) determines what that contact can (or can’t) see, semantics alone shouldn’t determine how permissions are assigned.

Now, stepping back, it’s worth pointing out that I’m going on a basic presumption here: that moving relationships from one site to another is valuable and beneficial. I also presume that making it more convenient to find or connect with people I already know (or am acquainted with) on a site will lead me to explore and discover that site’s actual features faster, rather than getting bogged down in finding, inviting and adding friends, which in and of itself has no marginal utility.

Beyond just bringing my friends with me is the opportunity to leverage the categorization I’ve done elsewhere, but that’s where existing formats like XFN and FOAF appear to fall short. On the one hand, we have overlapping terms for relationships that might not mean the same thing in different places, and on the other, we have unique relationship descriptions that might not apply elsewhere (e.g. fellow travelers on Dopplr). This was one of the reasons why I proposed focusing on the “contact” and “me” relationships in XFN (I mean really, what can you actually do if you know that a particular contact is a “muse” or “kin”?). Still, if metadata about a relationship exists, we shouldn’t just discard it, so how then might we express it?

Well, to keep the solution as simple and generalizable as possible, observe that the kinds of relationships, and the semantics we use to describe them, can be reduced to tags. Given a context, it’s fair to infer that other relationships of the same class in the same context are equivalent. So, if I mark two people as “friends” on Flickr, they are equally “Flickr friends”. Likewise on Twitter, all the people I follow are equally “followed”. Now, take the link-rel approach from HTML, and we have a shorthand attribute (“rel”) that we can use to create a machine tag that follows the standard namespace:predicate=value format, like so:


flickr:rel=friend
flickr:rel=family
twitter:rel=followed
dopplr:rel=fellow-traveler
xfn:rel=friend
foaf:rel=knows

Imagine being able to pass your relationships between sites as a series of machine tagged URLs, where you can now say “I want to share this content with all my [contacts|friends|family members] from [Flickr]” or “Share all my restaurant reviews from this trip with my [fellow travelers] from [Dopplr|TripIt].” By machine tagging relationships, not only do we maintain the fidelity of the relationship with context, but we inherit a means of querying against this dataset in a way that maps to the origin of the relationship.

Furthermore, this would enable sites to use relationship classification models from other sites. For example, a site like Pownce could use the “Twitter model” of followers and followed; SmugMug could use Flickr’s model of contacts, friends and family; Basecamp could use Plaxo’s model of business, friend and family.

Dumping this data into a JSON-based format would also be straightforward:


{
  "uid": "plaxo-12345",
  "fn": "Joseph Smarr",
  "url": [
    { "value": "http://josephsmarr.com", "type": "home" },
    { "value": "http://josephsmarr.com", "type": "blog" },
  ],
  "category": [ 
    { "value": "favorite" },
    { "value": "plaxo employee" }, 
    { "value": "xfn:rel=met" },
    { "value": "xfn:rel=friend" },
    { "value": "xfn:rel=colleague" },
    { "value": "flickr:rel=friend" },
    { "value": "dopplr:rel=fellow-traveler" },
    { "value": "twitter:rel=follower" } 
  ],
  "created": "2008-05-24T12:00:00Z",
  "modified": "2008-05-25T12:34:56Z"
}
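
To show the kind of querying this would enable, here is a sketch that filters contacts shaped like the JSON above by namespace and relationship value; the function names are mine, not part of any spec:

def machine_tags(contact):
    """Yield (namespace, predicate, value) triples from a contact's categories."""
    for category in contact.get("category", []):
        tag = category.get("value", "")
        if ":" in tag and "=" in tag:
            namespace, rest = tag.split(":", 1)
            predicate, value = rest.split("=", 1)
            yield namespace, predicate, value

def contacts_matching(contacts, namespace, value):
    """e.g. contacts_matching(contacts, "flickr", "friend") -> all Flickr friends."""
    return [c for c in contacts
            if any(ns == namespace and v == value
                   for ns, _pred, v in machine_tags(c))]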

I’m curious to know whether this approach would be useful, and what other possibilities might result from having this kind of data. I like it because it’s simple, it uses a prior convention (most widely supported on Flickr and Upcoming), and it maintains the original context and semantics. It also means that, rather than having to list every account for a contact as a serialized list with associated rel-values, we’re dealing only in highly portable tags.

I’m thinking that this would be very useful for DiSo, and when importing friends from remote sites, we’ll be sure to index this kind of information.