social graph – Factory Joe

This past weekend I attended a topic-specific FOO Camp called Social Graph FOO Camp (otherwise known as SGFOO) organized by Scott Kveton and David Recordon (or ray-chor-dohn according to Larry).

Scott’s write up is pretty complete, but I wanted to call out one specific outcome that I think is worth noting.

On Sunday, we had a significant discussion on data portability and about the activities, responsibilities and opportunities of and for the eponymous group which has recently generated much hype and buzz but little, (as far as I’ve see) clarity and/or cogent strategy for advancing its expansive charter:

The purpose of this project is to put all existing data portability technologies and initiatives in context and to promote viable reference implementations (blueprints) to the developer, vendor, and end-user communities.

The frustration over the minimal barrier to “becoming a member” of the group (you simply have to sign up for a mailing list) and the focus on large vendors without advancing an agenda with teeth and clearly defined metrics for success was palpable. But so was the desire to make some progress, and if not come to complete agreement, to at least identify concerns shared by the majority of us and perhaps develop a strategy to deflate the hype to date and get the group moving in a productive direction.

My suggestion was to emulate the work that Tara and I have been doing on the Open Media Web project, which developed out of our work with Songbird where we could sense that there was a real opportunity to explore, but didn’t yet have a clear picture of either the space as it was understood by lead users and experts nor of the outcomes that needed to be advocated. So rather than diving in and promoting technologies or tactics before we had identified the opportunities, challenges and boundaries of the problem domain, we decided to pursue an investigatory strategy, starting with a series of meetups, blog posts and interviews (vlogged on Viddler) that might help us flesh out the actors, ideas and conversations that were already ongoing in the space.

The result of my proposal is captured in this post by Chris Saad to the Data Portability mailing list. I think this is a positive step, and one that I hope will give Data Portability some direction and good work to do over the next several weeks and months. I’d like to go a step further and flesh out my thinking however, before this project gets underway.

These interviews should really be conducted assassin-style (as I like to say) where someone (probably Chris Saad) goes to each major vendor represented (and pimped) by the group (i.e. Google and Facebook, Plaxo, Microsoft, LinkedIn, Flickr, Six Apart, MyStrands, et al) and solicits written (or video) answers to the same five or six questions. Each of these interviews should subsequently be posted to the data portability blog over a series of months.
The goal of these ongoing interviews should be to discover primarily: 1) why these companies joined the group and what their goals are; 2) what they think of when they say “data portability” 3) what challenges are they facing when it comes to offering their vision of data portability at their company? 4) what are the greatest benefits of data portability? 5) what are they doing (if anything) to promote and advance data portability within their organization? 6) what technologies have they implemented (or plan to implement in the next six months) in support of data portability? From these answers, I think we can start to recognize trends in both the headspace of large social networking sites as well as begin to call out certain technologies that might be worth picking up and evangelizing, especially in the interest of interop between multiple parties’ sites.
As such, the advocacy of any particular technological solution by the data portability group today should be immediately abandoned until further research and exploration has occurred. While I was happy to see my favorite stable of technologies listed on the group’s homepage in the early days, I now realize that technology is not the hard part; it’s actually the politics, the policies, the usability and impact on and perception of the individual data owners that are really the first order priorities. Without beginning to address issues in those areas first, the technology conversation will never occur.
In terms of timing, I think that the data portability group has come along more or less at the right time, but that it’s actually walking into the problem ass-backwards. What we don’t need right now is a lot of hype and glorification of an abstruse notion of data portability. In fact, data portability by itself is currently meaningless and intangible; without good examples of how it can be applied to make things better for companies’ customers, there will never be an economic imperative to move in this direction (I should point out that data portability is interesting to me because increased customer choice is interesting to me, and thereby competition in the space is beneficial to the customers of such services). For a timely example of a positive case where data portability is making a difference, consider the ability to move your bookmarks from del.icio.us to Ma.gnolia in lieu of Microsoft’s looming acquisition bid of Yahoo!. Surely there are other equally beneficial applications of data portability, and building out these use cases in terms of end-user benefit is critical to continuing to make the case for data portability with credibility.

So anyway, I do believe that there is an opportunity here and Chris Saad is correct that getting a number of the prominent players in this arena to come to the table on this topic is a feat; however, simply bringing them together without engaging with the gnarly problems and policies that have kept data portability from becoming a reality could bring more confusion and angst than benefit. Deflating the hype and going back to humble beginnings and simple questions is, in my not-so-humble opinion, the appropriate and most effective way forward. Data portability is still not obvious for most people or most companies — heck the technologies that enable it are barely out of their 1.0 and 2.0 phases yet — and still this topic is one that captures people’s imaginations and lets them imagine countless “what if” scenarios that seem, somehow, just around the corner. Data portability is a critical topic, and with the advances in the state of the conversation we had over the weekend, I’m eager to see the members of the data portability group pick up the ball and keep moving it forward.

So, if this topic is something that interests you, I recommend you blog about it, talk about it, interpret it and really take some time to consider what data portability means to you, and why it matters (or doesn’t) to you. Me, Larry and Matt Biddulph of Dopplr rapped about this stuff some more on our Citizen Garden podcast today, so if you’re looking for more information, ideas or fodder, you might go ahead and give it a listen.

Ugh, I had promised not to read TechMeme anymore, and I’ve actually kept to my promise since then… until today. And as soon as I finish this post, I’m back on the wagon, but for now, it’s useful to point to the ongoing Scoble debacle for context and for backstory.

In a nutshell, Robert Scoble has friends on Facebook. These friends all have contact information and for whatever reason, he wants to dump that data into Outlook, his address book of choice. The problem is that Facebook makes it nearly impossible to do this in an automated fashion because, as a technical barrier, email addresses are provided as opaque images, not as easily-parseable text. So Scoble worked with the heretofore “trustworthy” Plaxo crew (way to blow it guys! Joseph, how could you?!) to write a scraper that would OCR the email addresses out of the images and dump them into his address book. Well, this got him banned from the service.

The controversy seems to over whether Scoble had the right to extract his friends’ email addresses from Facebook. Compounding the matter is the fact that these email addresses were not ones that Robert had contributed himself to Facebook, but that his contacts had provided. Allen Stern summed up the issue pretty well: My Social Network Data Is Not Yours To Steal or Borrow. And as Dare pointed out, Scoble was wrong, Facebook was right.

Okay, that’s all well and fine.

You’ll note that this is the same fundamental design flaw of FOAF, the RDF format for storing contact information that preceded the purposely distinct microformats XFN and hcard:

The bigger issue impeding Plaxo’s public support of FOAF (and presumably the main issue that similar services are also mulling) is privacy: FOAF files make all information public and accessible by all, including the contents of the user’s address book (via foaf:knows).

Now, the concern today and the concern back in 2004 was the exposure of identifiers (email addresses) that can also be used to contact someone! By conflating contact information with unique identifiers, service providers got themselves in the untenable situation of not being able to share the list of identifiers externally or publicly without also revealing a mechanism that could be easily abused or spammed.

I won’t go into the benefits of using email for identifiers, because they do exist, but I do want to put forth a proposal that’s both long time in coming and long overdue, and frankly Kevin Marks and Scott Kveton have said it just as well as I could: URLs are people too. Kevin writes:

The underlying thing that is wrong with an email address is that its affordance is backwards — it enables people who have it to send things to you, but there’s no reliable way to know that a message is from you. Conversely, URLs have the opposite default affordance — people can go look at them and see what you have said about yourself, and computers can go and visit them and discover other ways to interact with what you have published, or ask you permission for more.

This is clearly the design advantage of OpenID. And it’s also clearly the direction that we need to go in for developing out distributed social networking applications. It’s also why OAuth is important to the mix, so that when you arrive at a public URL identifier-slash-OpenID, you can ask for access to certain things (like sending the person a message), and the owner of that identifier can decide whether to grant you that privilege or not. It no longer matters if the Scobles of the world leak my URL-based identifiers: they’re useless without the specific permissions that I grant on a per instance basis.

As well, I can give services permission to share the URL-based identifiers of my friends (on a per-instance basis) without the threat of betraying their confidence since their public URLs don’t reveal their sensitive contact information (unless they choose to publish it themselves or provide access to it). This allows me the dual benefit of being able to show up at any random web service and find my friends while not sharing information they haven’t given me permission to pass on to untrusted third parties.

So screen scrape factoryjoe.com all you want. I even have a starter hcard waiting for you, with all the contact information I care to publicly expose. Anything more than that? Well, you’re going to have to ask more politely to get it. You’ve got my URL, now, tell me, what else do you really need?