Adding richness to activity streams

This is a post I’ve wanted to write for a while but simply haven’t gotten around to. Following my panel with Dave Recordon (Six Apart), Dave Morin (Facebook), Adam Nash (LinkedIn), Kevin Chou (Watercooler, Inc) and Sean Ammirati (ReadWriteWeb) on Social Networks and the NEED for FEEDs, it only seems appropriate that I finally get this out.

The basic premise is this: lifestreams, alternatively known as “activity streams”, are great for discovering and exploring social media, as well as keeping up to date with friends (witness the main feature of Facebook and the rise of FriendFeed). I suggest that, with a little effort on the publishing side, activity streams could become much more valuable. If they were easier for web services to consume and interpret, those services could provide better filtering and weighting of shared activities, making it easier for people to get relevant information from the people they care about, as it happens.

By marking up social activities and social objects, delivered in standard feeds with microformats, I think we enable anyone to run a FriendFeed-like service that innovates and offers value based on how well it understands what’s going on and what’s relevant, rather than on its compatibility with any and every service.

Contemporary example activities

Here are the kinds of activities that I’m talking about (note that some services expand these with thumbnail previews):

  • Eddie updated his resume at LinkedIn.
  • Chris listened to “I Will Possess Your Heart” by Death Cab for Cutie on Pandora.
  • Brynn favorited a photo on Flickr.
  • Dave posted a message to Twitter via SMS.
  • Gary poked Kastner.
  • Leah bought The Matrix at Amazon.com.

Prior art

Both OpenSocial and Facebook provide APIs for creating new activities that will show up in someone’s activity stream or newsfeed.

Movable Type and the DiSo Project both have Action Stream plugins. And there are countless related efforts. Clearly there’s existing behavior out there… but how should we go about improving it, when the primary requirement today is a title for an action, with little, if any, guidance on how to provide more details on a given activity?

Components of an activity

Not surprisingly, a lot of activities provide what all good news stories provide: the who, what, when, where and sometimes, how.

Let’s take a look at an example, with these components called out:

e.g. Chris started listening to a station on Pandora 3 hours ago.

  • actor/subject (noun/pronoun)
  • action (verb)
  • social object (noun)
  • where (place)
  • when (time)
  • (how the object was created)
  • (expanded view of object)
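As a rough sketch of how the components above might be captured in code, here is a small Python record. The class name, field names and timestamp are illustrative assumptions, not part of any existing spec or plugin; the point is only that each component gets its own slot instead of being baked into a title string.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Activity:
    """One entry in an activity stream. All field names are hypothetical."""
    actor: str                         # who: "Chris"
    verb: str                          # action: "started listening to"
    obj: str                           # social object: "a station"
    where: str                         # service or place: "Pandora"
    when: datetime                     # when the action happened
    via: Optional[str] = None          # how the object was created, e.g. "SMS"
    detail_url: Optional[str] = None   # expanded view of the object, if any

    def title(self) -> str:
        """Render the plain-text title that most streams expose today."""
        return f"{self.actor} {self.verb} {self.obj} on {self.where}."


example = Activity(
    actor="Chris",
    verb="started listening to",
    obj="a station",
    where="Pandora",
    when=datetime(2000, 1, 1, 9, 0),  # arbitrary placeholder timestamp
)
print(example.title())
```

A consumer that receives the structured record can still show the familiar title, but it can also sort, filter or group on any individual field.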

Now, I’ll grant that not all activities follow this exact format, but the majority seem to.

I should point out one alternative: collective actions.

e.g. Chris and Dave Morin are now friends.

…but these might be better created as a post-processing step once we add the semantic salt to the original updates. Maybe.

Class actions

One of the assumptions I’m making is that there is some regularity and uniformity in activity streams. Moreover, there have emerged some basic classes of actions that appear routinely and that could be easily expressed with additional semantics.

To that end, I’ve started compiling such activities on the DiSo wiki.

Once we have settled on the base set of classes, we can start to develop common classnames and presentation templates. To start, we have: changed status or presence, posted messages or media, rated and favorited, friended/defriended, interacted with someone (e.g. “poking”), bookmarked, and consumed something (attended…, watched…, listened to…).
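To make the idea of a base set of classes a little more concrete, here is a minimal Python sketch that enumerates the starting classes listed above and maps a handful of verbs onto them. The enum member names and the verb lookup table are assumptions made for illustration; the actual classnames are still to be settled.

```python
from enum import Enum
from typing import Optional


class ActivityClass(Enum):
    """The starting classes from the list above; member names are hypothetical."""
    STATUS = "changed status or presence"
    POST = "posted messages or media"
    RATE = "rated and favorited"
    FRIEND = "friended/defriended"
    INTERACT = "interacted with someone"
    BOOKMARK = "bookmarked"
    CONSUME = "consumed something"


# Hypothetical lookup from a verb to its class. A real service would need
# a much larger, agreed-upon vocabulary.
VERB_CLASSES = {
    "posted": ActivityClass.POST,
    "rated": ActivityClass.RATE,
    "favorited": ActivityClass.RATE,
    "friended": ActivityClass.FRIEND,
    "defriended": ActivityClass.FRIEND,
    "poked": ActivityClass.INTERACT,
    "bookmarked": ActivityClass.BOOKMARK,
    "attended": ActivityClass.CONSUME,
    "watched": ActivityClass.CONSUME,
    "listened to": ActivityClass.CONSUME,
}


def classify(verb: str) -> Optional[ActivityClass]:
    """Return the class for a verb, or None if the verb is not recognized."""
    return VERB_CLASSES.get(verb.strip().lower())
```

Unrecognized verbs return None rather than being forced into a class, so a consumer can fall back to showing the plain title.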

Combining activities with bundling

The concept of bundling is already present in OpenSocial and works for combining multiple activities of the same kind into a group:

[Image: FriendFeed Activity Bundling]

This can also be used to bundle different kinds of activities for a single actor:

e.g. Chris watched The Matrix, uploaded five photos, attended an event and became friends with Dave.

From a technical perspective, bundling provides a mechanism for batching service-to-service operations, as defined in PaceBatch.

Bundling is also useful for presenting paged or “continued…” activities, as Facebook and FriendFeed do.
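The two kinds of bundling described in this section can be sketched in a few lines of Python. This is only an illustration under assumed names (the tuple layout and function names are hypothetical), and it is not how OpenSocial or FriendFeed implement bundling.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical minimal activity: (actor, verb, object)
ActivityTuple = Tuple[str, str, str]


def bundle_same_kind(
    activities: List[ActivityTuple],
) -> Dict[Tuple[str, str], List[str]]:
    """Group objects by (actor, verb), e.g. all photos one person favorited."""
    bundles: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for actor, verb, obj in activities:
        bundles[(actor, verb)].append(obj)
    return dict(bundles)


def bundle_by_actor(activities: List[ActivityTuple]) -> Dict[str, List[str]]:
    """Group different kinds of activities under a single actor."""
    bundles: Dict[str, List[str]] = defaultdict(list)
    for actor, verb, obj in activities:
        bundles[actor].append(f"{verb} {obj}")
    return dict(bundles)


def summarize(actor: str, phrases: List[str]) -> str:
    """Join an actor's bundled phrases into one sentence."""
    if len(phrases) == 1:
        return f"{actor} {phrases[0]}."
    return f"{actor} {', '.join(phrases[:-1])} and {phrases[-1]}."


stream = [
    ("Chris", "watched", "The Matrix"),
    ("Chris", "uploaded", "five photos"),
    ("Chris", "attended", "an event"),
    ("Chris", "became friends with", "Dave"),
]
print(summarize("Chris", bundle_by_actor(stream)["Chris"]))
```

Because the bundle keeps the individual phrases in a list, a presentation layer could also show only the first few and hide the rest behind a “continued…” link.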

Advanced uses

I’d like to describe two advanced uses that inherit from my initial proposal for Twitter Hashtags: filtering and creating a distributed track-like service.

In the DiSo model, we use (will use) AtomPub (and someday XMPP) to push new activities to the people who have chosen to follow someone. Because the model is push-based, activities are delivered as they happen, to anyone who has subscribed to receive them. On the receiving end, this means that we can filter based on any number of criteria, such as actor, activity type, content of the activity (as in keywords or tags), age of the action, location, how an activity was created (was this message auto-generated from Brightkite or sent in by SMS?), or any combination thereof.

This is useful if you want to follow certain activities of your friends more closely than others, or if you only care about, say, the screenshots I upload to Flickr but not the stuff I tweet about.
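Receiving-side filtering of this kind might look something like the following Python sketch. It assumes each incoming activity has already been parsed into a dictionary with keys such as "actor" and "where"; those keys, and the helper names, are hypothetical.

```python
from typing import Callable, Dict, Iterable, List

ActivityDict = Dict[str, str]
Predicate = Callable[[ActivityDict], bool]


def field_is(field: str, value: str) -> Predicate:
    """Match activities whose given field equals the given value."""
    return lambda activity: activity.get(field) == value


def all_of(*predicates: Predicate) -> Predicate:
    """Combine criteria: an activity must satisfy every predicate."""
    return lambda activity: all(p(activity) for p in predicates)


def filter_stream(stream: Iterable[ActivityDict], keep: Predicate) -> List[ActivityDict]:
    """Return only the activities the subscriber asked to see."""
    return [activity for activity in stream if keep(activity)]


incoming = [
    {"actor": "Chris", "verb": "uploaded", "object": "screenshot", "where": "Flickr"},
    {"actor": "Chris", "verb": "posted", "object": "message", "where": "Twitter"},
]

# "Only the screenshots Chris uploads to Flickr, not the stuff he tweets about."
wanted = all_of(
    field_is("actor", "Chris"),
    field_is("where", "Flickr"),
    field_is("object", "screenshot"),
)
print(filter_stream(incoming, wanted))
```

Small composable predicates keep "any combination" of criteria cheap to express, which is only possible if the publisher marks up each component separately.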

Tracking can work two ways. In the first, your own self-hosted service knows how to elevate certain types of received activities, which are then passed to your messaging hub and routed appropriately: for example, when Mom checks in using Brightkite at the airport (or within some distance radius).
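The "within some distance radius" case could be handled with a standard great-circle (haversine) distance check, sketched below in Python. The function names, the coordinates and the 10 km radius are placeholders chosen for illustration; nothing here reflects how Brightkite reports locations.

```python
import math
from typing import Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def should_elevate(checkin: LatLon, watched_place: LatLon, radius_km: float) -> bool:
    """Elevate a check-in if it falls within radius_km of a watched place."""
    return haversine_km(checkin, watched_place) <= radius_km


# Placeholder coordinates, not real locations.
airport = (0.0, 0.0)
nearby_checkin = (0.0, 0.05)   # a few kilometres away
distant_checkin = (0.0, 1.0)   # roughly a hundred kilometres away

print(should_elevate(nearby_checkin, airport, radius_km=10.0))
print(should_elevate(distant_checkin, airport, radius_km=10.0))
```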

On the other hand, individuals could choose to publish their activities to some third-party aggregator (like Summize) that does the tracking for them, pushing back the activities it discovers that match criteria you set, and then forwarding those activities to your messaging hub.
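Here is a toy Python sketch of the aggregator-side approach: subscribers register terms to track, publishers send in activity titles, and matches are queued for each subscriber's messaging hub. The class, its methods and the in-memory "outbox" are all hypothetical stand-ins; this does not describe Summize or any real service.

```python
from collections import defaultdict
from typing import Dict, List, Set


class TrackingAggregator:
    """Hypothetical third-party tracker that routes matching activities."""

    def __init__(self) -> None:
        # term -> hubs that asked to track it
        self._terms: Dict[str, Set[str]] = defaultdict(set)
        # hub -> activity titles queued for delivery to that hub
        self.outbox: Dict[str, List[str]] = defaultdict(list)

    def track(self, hub: str, term: str) -> None:
        """Register a single-word term that a subscriber's hub wants to follow."""
        self._terms[term.lower()].add(hub)

    def publish(self, title: str) -> None:
        """Accept a published activity and queue it for every matching hub."""
        words = {w.strip(".,!?\"'").lower() for w in title.split()}
        matched_hubs: Set[str] = set()
        for term, hubs in self._terms.items():
            if term in words:
                matched_hubs |= hubs
        for hub in matched_hubs:
            self.outbox[hub].append(title)


aggregator = TrackingAggregator()
aggregator.track("chris-hub", "pandora")
aggregator.publish("Chris listened to a station on Pandora.")
aggregator.publish("Leah bought The Matrix at Amazon.com.")
print(aggregator.outbox["chris-hub"])
```

Matching on raw title words is the crudest option; with marked-up activities the same aggregator could match on actor, activity class or location instead.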

It might not have the legs that a centralized service like Twitter has, especially to start, but if Technorati were looking for a new raison d’être, this might be it.

This is a 30,000-foot view

I was scant on code in this post, but given how long it was already, I’d rather just start throwing it into the output of the activity streams being generated from the Action Streams plugins and see how live code holds up in the wild.

I also don’t want to confuse too many implementation details with the broader concept and need, which again is to make activity streams richer by standardizing on some specific semantics based on actual trends.

I’d love feedback, more pointers to prior art, or alternative suggestions for how any of the above could be technically achieved using open technologies.