xfn – Factory Joe

Machine tagging relationships

I’ve been doing quite a bit of thinking about how to represent relationships in portable contact lists. Many of my concerns stem from two basic problems:

Relationships in one context don’t necessarily translate directly into new contexts. When we talk about making relationships “portable”, we can’t forget that a friend on one system isn’t necessarily the same kind of friend on another system (if at all) even if the other context uses the same label.
The semantics of a relationship should not form the basis for globally setting permissions. That is, just because someone is marked (perhaps accurately) as a family member does not always mean that that individual should be granted elevated permissions just because they’re “family”. While this approach works for Flickr, where how you classify a relationship (Contact, Friend, Family) determines what that contact can (or can’t) see, semantics alone shouldn’t determine how permissions are assigned.

Now, stepping back, it’s worth pointing out that I’m going on a basic presumption here that moving relationships from one site to another is valuable and beneficial. I also presume that the more convenient it is to find or connect with people who I already know (or have established acquaintance with) on a site will lead me to explore and discover that site’s actual features faster, rather than getting bogged down in finding, inviting and adding friends, which in and of itself has no marginal utility.

Beyond just bringing my friends with me is the opportunity to leverage the categorization I’ve done elsewhere, but that’s where existing formats like XFN and FOAF appear to fall short. On the one hand, we have overlapping terms for relationships that might not mean the same thing in different places, and on the other, we have unique relationship descriptions that might not apply elsewhere (e.g. fellow travelers on Dopplr). This was one of the reasons why I proposed focusing on the “contact” and “me” relationships in XFN (I mean really, what can you actually do if you know that a particular contact is a “muse” or “kin”?). Still, if metadata about a relationship exists, we shouldn’t just discard it, so how then might we express it?

Well, to keep the solution as simple and generalizable as possible, we’d see that the kinds of relationships and the semantics which we use to describe relationships can be reduced to tags. Given a context, it’s fair to infer that other relationships of the same class in the same context are equivalent. So, if I mark two people as “friends” on Flickr, they are equally “Flickr friends”. Likewise on Twitter, all people who I follow are equally “followed”. Now, take the link-rel approach from HTML, and we have a shorthand attribute (“rel”) that we can use to create a machine tag that follows the standard namespace:predicate=value format, like so:


flickr:rel=friend
flickr:rel=family
twitter:rel=followed
dopplr:rel=fellow-traveler
xfn:rel=friend
foaf:rel=knows

Imagine being able to pass your relationships between sites as a series of machine tagged URLs, where you can now say “I want to share this content with all my [contacts|friends|family members] from [Flickr]” or “Share all my restaurant reviews from this trip with my [fellow travelers] from [Dopplr|TripIt].” By machine tagging relationships, not only do we maintain the fidelity of the relationship with context, but we inherit a means of querying against this dataset in a way that maps to the origin of the relationship.

Furthermore, this would enable sites to use relationship classification models from other sites. For example, a site like Pownce could use the “Twitter model” of followers and followed; SmugMug could use Flickr’s model of contacts, friends and family; Basecamp could use Plaxo’s model of business, friend and family.

Dumping this data into a JSON-based format like jCard would also be straight-forward:


{
  "uid": "plaxo-12345",
  "fn": "Joseph Smarr",
  "url": [
    { "value": "http://josephsmarr.com", "type": "home" },
    { "value": "http://josephsmarr.com", "type": "blog" },
  ],
  "category": [ 
    { "value": "favorite" },
    { "value": "plaxo employee" }, 
    { "value": "xfn:rel=met" },
    { "value": "xfn:rel=friend" },
    { "value": "xfn:rel=colleague" },
    { "value": "flickr:rel=friend" },
    { "value": "dopplr:rel=fellow-traveler" },
    { "value": "twitter:rel=follower" } 
  ],
  "created": "2008-05-24T12:00:00Z",
  "modified": "2008-05-25T12:34:56Z"
}

I’m curious to know whether this approach would be useful, or what other possibilities might result from having this kind of data. I like it because it’s simple, it uses a prior convention (most widely supported on Flickr and Upcoming), it maintains original context and semantics. It also means that, rather than having to list every account for a contact as a serialized list with associated rel-values, we’re only dealing in highly portable tags.

I’m thinking that this would be very useful for DiSo, and when importing friends from remote sites, we’ll be sure to index this kind of information.

It’s high time we moved to URL-based identifiers

Ugh, I had promised not to read TechMeme anymore, and I’ve actually kept to my promise since then… until today. And as soon as I finish this post, I’m back on the wagon, but for now, it’s useful to point to the ongoing Scoble debacle for context and for backstory.

In a nutshell, Robert Scoble has friends on Facebook. These friends all have contact information and for whatever reason, he wants to dump that data into Outlook, his address book of choice. The problem is that Facebook makes it nearly impossible to do this in an automated fashion because, as a technical barrier, email addresses are provided as opaque images, not as easily-parseable text. So Scoble worked with the heretofore “trustworthy” Plaxo crew (way to blow it guys! Joseph, how could you?!) to write a scraper that would OCR the email addresses out of the images and dump them into his address book. Well, this got him banned from the service.

The controversy seems to over whether Scoble had the right to extract his friends’ email addresses from Facebook. Compounding the matter is the fact that these email addresses were not ones that Robert had contributed himself to Facebook, but that his contacts had provided. Allen Stern summed up the issue pretty well: My Social Network Data Is Not Yours To Steal or Borrow. And as Dare pointed out, Scoble was wrong, Facebook was right.

Okay, that’s all well and fine.

You’ll note that this is the same fundamental design flaw of FOAF, the RDF format for storing contact information that preceded the purposely distinct microformats XFN and hcard:

The bigger issue impeding Plaxo’s public support of FOAF (and presumably the main issue that similar services are also mulling) is privacy: FOAF files make all information public and accessible by all, including the contents of the user’s address book (via foaf:knows).

Now, the concern today and the concern back in 2004 was the exposure of identifiers (email addresses) that can also be used to contact someone! By conflating contact information with unique identifiers, service providers got themselves in the untenable situation of not being able to share the list of identifiers externally or publicly without also revealing a mechanism that could be easily abused or spammed.

I won’t go into the benefits of using email for identifiers, because they do exist, but I do want to put forth a proposal that’s both long time in coming and long overdue, and frankly Kevin Marks and Scott Kveton have said it just as well as I could: URLs are people too. Kevin writes:

The underlying thing that is wrong with an email address is that its affordance is backwards — it enables people who have it to send things to you, but there’s no reliable way to know that a message is from you. Conversely, URLs have the opposite default affordance — people can go look at them and see what you have said about yourself, and computers can go and visit them and discover other ways to interact with what you have published, or ask you permission for more.

This is clearly the design advantage of OpenID. And it’s also clearly the direction that we need to go in for developing out distributed social networking applications. It’s also why OAuth is important to the mix, so that when you arrive at a public URL identifier-slash-OpenID, you can ask for access to certain things (like sending the person a message), and the owner of that identifier can decide whether to grant you that privilege or not. It no longer matters if the Scobles of the world leak my URL-based identifiers: they’re useless without the specific permissions that I grant on a per instance basis.

As well, I can give services permission to share the URL-based identifiers of my friends (on a per-instance basis) without the threat of betraying their confidence since their public URLs don’t reveal their sensitive contact information (unless they choose to publish it themselves or provide access to it). This allows me the dual benefit of being able to show up at any random web service and find my friends while not sharing information they haven’t given me permission to pass on to untrusted third parties.

So screen scrape factoryjoe.com all you want. I even have a starter hcard waiting for you, with all the contact information I care to publicly expose. Anything more than that? Well, you’re going to have to ask more politely to get it. You’ve got my URL, now, tell me, what else do you really need?

OAuth 1.0, OpenID 2.0 and up next: DiSo

IIW 2007b is now over and with its conclusion, we have two significant accomplishments, both the sum of months of hard work by some very dedicated individuals, in the release of the OpenID 2.0 and OAuth Core 1.0 specifications.

These are two important protocols that serve as a foundational unit for enabling what’s being called “user-centric identity”, or that I call “citizen-centric identity”. With OpenID for identity and authentication and OAuth for authorizing access to portions of your private data, we move ever closer to inverting the silos and providing greater mobility and freedom of choice, restoring the balance in the marketplace and elevating the level of competition by enabling the production of more compelling social applications without requiring the huge investment it takes to recreate even a portion of the available social graph.

It means that we now have protocols that can begin to put an end to the habit of treating user’s credentials like confetti and instead can offer people the ability to get very specific about they want to share with third parties. And what’s most significant here is that these protocols are open and available for anyone to implement. You don’t have to ask permission; if you want to get involved and do your customers a huge favor, all you have to do is support this work.

To put my … time? … where my mouth is (I haven’t got a whole lot of money to put there) … Steve Ivy and I have embarked on a prototype project to build a social network with its skin inside out. We’re calling it DiSo, or “Distributed Social Networking applications”. The emphasis here is on “distributed”.

In his talk today on Friends List Portability, Joseph Smarr laid out an import set of roles that help to clarify how pieces of applications should be architected:

first of all, people have contact details like email addresses, webpage addresses (URLs), instant messaging handles, phone numbers… and any number of these identifiers can be used to discover someone (you do it now when you import your address book to a social networking site). In the citizen-centric model of the world, it’s up to individuals to maintain these identifiers, and to be very intentional about who they share their identifiers with
Second, the various sites and social networks you use need to publish your friends and contacts lists in a way that is publicly accessible and is machine readable (fortunately XFN does well there). This doesn’t mean that your friends list will be exposed for all the world to see; using OAuth, you can limit access to pieces of your personal social graph, but the point is that it’s necessary for social sites to expose, for your reuse, the identifiers of the people that you know.

With that in mind, Steve and I have started working on a strawman version of this idea by extending my wp-microformatted-blogroll plugin, renaming it to wp-contactlist and focusing on how, at a blog level, we can expose our own contact list beyond the realm of any large social network.

Besides, this, we’re doing some interesting magic that would be useful for whitelisting and cross-functional purposes, like those proposed by Tim Berners-Lee. Except our goal is to implement these ideas in more humane HTML using WordPress as our delivery vehicle (note that this project is intended to be an example whose concepts should be able to be implemented on any platform).

So anyway, we’re using Will Norris’ wp-openid plugin, and when someone leaves a comment on one of our blogs using OpenID, and whose OpenID happens to be in blogroll already, they’ll be listed in our respective blogroll with an OpenID icon and a class on the link indicating that, not only are they an XFN contact, but that they logged into our blog and claimed their OpenID URL as an identifier. With this functionality in place, we can begin to build add in permissioning functionality where other people might subscribe to my blogroll as a source of trusted commenters or even to find identifiers for people who could be trusted to make typographic edits to blog posts.

With the combination of XFN and OpenID, we begin to be able to establish distributed trust meshes, though the exposure of personal social graphs. As more people sign in to my blog with OpenID and leave approved comments, I can migrate them to my public blogroll, allowing others to benefit from the work I’ve done evaluating whether a given identifier might be a spam emitter. Over time, my reliability in selecting and promoting trustworthy identifiers becomes a source of social capital accrual and you’ll want to get on my list, demonstrating the value of playing the role of identity provider more widely.

This will lead us towards the development of other DiSo applications, which I’ve begun mapping out as sketches on my wiki but that I think we can begin to discuss on the DiSo mailing list.