Adding richness to activity streams

This is a post I’ve wanted to do for a while but simply haven’t gotten around to. Following my panel with Dave Recordon (Six Apart), Dave Morin (Facebook), Adam Nash (LinkedIn), Kevin Chou (Watercooler, Inc) and Sean Ammirati (ReadWriteWeb) on Social Networks and the NEED for FEEDs, it only seems appropriate that I finally get this out.

The basic premise is this: lifestreams, alternatively known as “activity streams”, are great for discovering and exploring social media, as well as keeping up to date with friends (witness the main feature of Facebook and the rise of FriendFeed). I suggest that, with a little effort on the publishing side, activity streams could become much more valuable: easier for web services to consume and interpret, and easier to filter and weight, so that people get relevant information from the people they care about, as it happens.

By marking up social activities and social objects, delivered in standard feeds with microformats, I think we enable anyone to run a FriendFeed-like service that innovates and offers value based on how well it understands what’s going on and what’s relevant, rather than on its compatibility with any and every service.

Contemporary example activities

Here are the kinds of activities that I’m talking about (note that some services expand these with thumbnail previews):

Eddie updated his resume at LinkedIn.
Chris listened to “I Will Possess Your Heart” by Death Cab for Cutie on Pandora.
Brynn favorited a photo on Flickr.
Dave posted a message to Twitter via SMS.
Gary poked Kastner.
Leah bought The Matrix at Amazon.com.

Prior art

Both OpenSocial and Facebook provide APIs for creating new activities that will show up in someone’s activity stream or newsfeed.

Movable Type and the DiSo Project both have Action Stream plugins. And there are countless related efforts. Clearly there’s existing behavior out there… but how should we go about improving it, when the primary requirement today is a title for an action, with little, if any, guidance on how to provide more details on a given activity?

Components of an activity

Not surprisingly, a lot of activities provide what all good news stories provide: the who, what, when, where and sometimes, how.

Let’s take a look at an example, with these components called out:

e.g. Chris started listening to a station on Pandora 3 hours ago.

actor/subject (noun/pronoun)
action (verb)
social object (noun)
where (place)
when (time)
(how the object was created)
(expanded view of object)
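
As a rough illustration of how these components might be carried as data rather than as a bare title, here is a minimal Python sketch. The `Activity` class, its field names and the timestamp are assumptions made up for this example; they are not taken from OpenSocial, Facebook or the Action Streams plugins.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Activity:
    """Hypothetical container for the components listed above."""
    actor: str                        # who: "Chris"
    verb: str                         # action: "started listening to"
    obj: str                          # social object: "a station"
    where: Optional[str] = None       # service or place: "Pandora"
    when: Optional[datetime] = None   # time of the action
    via: Optional[str] = None         # how the object was created, e.g. "SMS"
    detail: Optional[str] = None      # expanded view, e.g. a thumbnail URL

    def title(self) -> str:
        """Render the plain-text title that streams already show today."""
        text = f"{self.actor} {self.verb} {self.obj}"
        if self.where:
            text += f" on {self.where}"
        if self.via:
            text += f" via {self.via}"
        return text + "."


listening = Activity(
    actor="Chris",
    verb="started listening to",
    obj="a station",
    where="Pandora",
    # Placeholder timestamp, invented for the example.
    when=datetime(2008, 6, 11, 12, 0, tzinfo=timezone.utc),
)
print(listening.title())
```

The point of the sketch is only that the title becomes something derived from the parts, so a consuming service no longer has to parse a sentence to learn who did what, where.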

Now, I’ll grant that not all activities follow this exact format, but the majority seem to.

I should point out one alternative: collective actions.

e.g. Chris and Dave Morin are now friends.

…but these might be better created as a post-processing step once we add the semantic salt to the original updates. Maybe.

Class actions

One of the assumptions I’m making is that there is some regularity and uniformity in activity streams. Moreover, there have emerged some basic classes of actions that appear routinely and that could be easily expressed with additional semantics.

To that end, I’ve started compiling such activities on the DiSo wiki.

Once we have settled on the base set of classes, we can start to develop common classnames and presentation templates. To start, we have: changed status or presence, posted messages or media, rated and favorited, friended/defriended, interacted with someone (e.g. “poking”), bookmarked, and consumed something (attended…, watched…, listened to…).

Combining activities with bundling

The concept of bundling is already present in OpenSocial and works for combining multiple activities of the same kind into a group:

This can also be used to bundle different kinds of activities for a single actor:

e.g. Chris watched The Matrix, uploaded five photos, attended an event and became friends with Dave.

From a technical perspective, bundling provides a mechanism for batching service-to-service operations, as defined in PaceBatch.

Bundling is also useful for presenting paged or “continued…” activities, as Facebook and FriendFeed do.
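
To make the single-actor case concrete, here is a small Python sketch of bundling. It assumes activities arrive as simple (actor, verb, object) records; the `Activity` tuple and both helper functions are hypothetical and are not the OpenSocial or PaceBatch mechanism.

```python
from collections import defaultdict
from typing import NamedTuple


class Activity(NamedTuple):
    """Hypothetical minimal activity record."""
    actor: str
    verb: str
    obj: str


def bundle_by_actor(activities):
    """Group activities per actor, keeping the order in which they arrived."""
    bundles = defaultdict(list)
    for activity in activities:
        bundles[activity.actor].append(activity)
    return dict(bundles)


def summarize(actor, bundle):
    """Join one actor's bundled activities into a single sentence."""
    phrases = [f"{a.verb} {a.obj}" for a in bundle]
    if len(phrases) == 1:
        body = phrases[0]
    else:
        body = ", ".join(phrases[:-1]) + " and " + phrases[-1]
    return f"{actor} {body}."


stream = [
    Activity("Chris", "watched", "The Matrix"),
    Activity("Chris", "uploaded", "five photos"),
    Activity("Chris", "attended", "an event"),
    Activity("Chris", "became friends with", "Dave"),
]
for actor, bundle in bundle_by_actor(stream).items():
    print(summarize(actor, bundle))
```

Grouping on the verb as well as the actor would give the same-kind bundling described first; the structure stays the same.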

Advanced uses

I’d like to describe two advanced uses that inherit from my initial proposal for Twitter Hashtags: filtering and creating a distributed track-like service.

In the DiSo model, we use (will use) AtomPub (and someday XMPP) to push new activities to people who have decided to follow different people. Because the model is push-based, activities are delivered as they happen, to anyone who has subscribed to receive them. On the receiving end, this means that we can filter based on any number of criteria, such as actor, activity type, content of the activity (as in keywords or tags), age of the action, location or how an activity was created (was this message auto-generated from Brightkite or sent in by SMS?) or any combination thereof.

This is useful if you want to follow certain activities of your friends more closely than others, or if you only care about, say, the screenshots I upload to Flickr but not the stuff I tweet about.
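
Receiver-side filtering of this kind could be sketched as composable rules. The following Python is an assumption-laden illustration: the `Activity` fields and the rule helpers (`by_actor`, `by_service`, `tagged`, `all_of`) are invented for the example and are not part of DiSo, AtomPub or XMPP.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Activity:
    """Hypothetical received activity with a few filterable fields."""
    actor: str
    verb: str
    service: str
    tags: frozenset = frozenset()


Rule = Callable[[Activity], bool]


def by_actor(name: str) -> Rule:
    return lambda a: a.actor == name


def by_service(name: str) -> Rule:
    return lambda a: a.service == name


def tagged(tag: str) -> Rule:
    return lambda a: tag in a.tags


def all_of(*rules: Rule) -> Rule:
    """Combine rules so that every one of them must match."""
    return lambda a: all(rule(a) for rule in rules)


def accept(incoming: Iterable[Activity], rule: Rule) -> Iterator[Activity]:
    """Yield only the pushed activities that satisfy the rule."""
    return (a for a in incoming if rule(a))


incoming = [
    Activity("Chris", "uploaded", "Flickr", frozenset({"screenshot"})),
    Activity("Chris", "uploaded", "Flickr", frozenset({"vacation"})),
    Activity("Chris", "posted", "Twitter"),
    Activity("Dave", "uploaded", "Flickr", frozenset({"screenshot"})),
]

# "Only the screenshots Chris uploads to Flickr, not the stuff he tweets."
screenshots_only = all_of(by_actor("Chris"), by_service("Flickr"), tagged("screenshot"))
kept = list(accept(incoming, screenshots_only))
print(len(kept))
```

Because the rules are plain functions, criteria such as age, location or how the activity was created could be added the same way, provided the publisher marks those fields up in the first place.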

Tracking can work two ways. In the first, your own self-hosted service knows how to elevate certain types of received activities, which are then passed to your messaging hub and routed appropriately: for example, when Mom checks in using Brightkite at the airport (or within some distance radius).

In the second, individuals could choose to publish their activities to some third-party aggregator (like Summize), which would do the tracking on your behalf, finding activities that match criteria that you set and forwarding them to your messaging hub.

It might not have the legs that a centralized service like Twitter has, especially to start, but if Technorati were looking for a new raison d’être, this might be it.

This is a 30,000 foot view

I was scant on code in this post, but given how long it was already, I’d rather just start throwing it into the output of the activity streams being generated from the Action Streams plugins and see how live code holds up in the wild.

I also don’t want to confuse too many implementation details with the broader concept and need, which again is to make activity streams richer by standardizing on some specific semantics based on actual trends.

I’d love feedback, more pointers to prior art, or alternative suggestions for how any of the above could be technically achieved using open technologies.

Author: Chris Messina

Inventor of the hashtag. #1 Product Hunter. Techmeme Ride Home podcaster. Ever-curious product designer and technologist. Previously: Google, Uber, Republic, YC W'18.

21 thoughts on “Adding richness to activity streams”

Ntino says:

Jun 11th at @ 9pm

Hi Chris, good post. To me it seems that what you are proposing can be done with a simple RDF schema that incorporates the semantics you posted on the wiki. The next step is for the social service developers to agree to follow it with the data they send out from their APIs / web services.

Currently, even without standardization, a life/activity stream service can deal with, say, 30-50 APIs (see the number of services aggregated by FriendFeed) using custom-made Python/Perl/Java code for each API. But this cannot scale as the number of APIs increases; think about where we would be if we didn’t have RSS or, to give an example from my own field, about trying to build bioinformatics web service workflows (essentially a mashup), where you try to glue the output of one API to the input of another. After all the mess was created, people are now trying to come up with ontologies in order to standardize the web services’ communication endpoints.

But I like your idea, since you propose standardization before we reach the point, within the next couple of years, of having too many and too different social APIs that we need to mash up and aggregate in activity-stream meta social services.
Paul Stamatiou says:

Jun 11th at @ 11pm

Well put Chris. I liked when you explained the push-based portion of DiSo and being able to selectively filter which parts of a person’s activity stream you follow and how closely you follow it.

This is useful if you want to follow certain activities of your friends more closely than others, or if you only care about, say, the screenshots I upload to Flickr but not the stuff I tweet about.

Would you say that activity streams are almost necessary for applications that expect there to be some interactivity between its users and try to establish/identify relationships between users and their activities/fringing on semantic web?
kael says:

Jun 12th at @ 12am

Interesting thoughts. FriendFeed also gave me some ideas regarding the publication and aggregation of social presence streams.

In this case, I assume everything is published in XMPP and that HTTP resources are polled by a backend service for a Pubsub publication.

I had the idea of a WordPress PEP aggregator which would be a mix of the PEP aggregator and the DiSo ActionStream plugin.

The WP PEP aggregator would publish XMPP PEP events (user-tune, user-mood, user-activity, user-location but also user-browsing – for social bookmarking – and the other PEP events) on a dedicated page or in a blog column.

PEP messages could be published in real-time with a BOSH-COMET-AJAX technology (I don’t know much about XMPP over HTTP), for example in a manner similar to this Flash page that displays BBC Radio tunes.

Users could decide which data to publish publicly thanks to the Pubsub access model. And contacts would select which streams they want to follow using the Pubsub auto-subscribe and filtered notifications capabilities or with manual subscription.

There’s also the great Pubsub content-based subscription system to consider, but mostly for other types of social presence streams.

It might then be possible to mix social network service streams with PEP ones by publishing them with new Pubsub/PEP message types, which could include specific namespaces according to services.

There’s also the proto-XEP Microblogging over XMPP to consider, but I haven’t thought much about its use.

I’m not sure all types of activity stream events are worth considering, or at least that they are worth a notification. One problem with all those streams is that they interrupt a lot, so it might be interesting to distinguish “monitoring” vs. “notification”; the first would use presence, the last, messages.

I’d also like to see a way to publish blog comments with XMPP but haven’t yet found how to do so in an easy manner.

Another interesting path to explore is the mapping of Pubsub nodes and AtomPub collections as experimented here, here, here and there. But this backend lacks some RDF or at least search capabilities, though I’m not sure.

Hopefully, new Jabber GUIs will be designed to display the streams in an ergonomic manner.

All of this is essentially XMPP-centric and lacks HTTP interfacing, but I’d need to think more about this, especially how to bootstrap fetched API data with Pubsub.

It’s also a 30,000-foot view but I hope it could give more ideas to others.

I had these ideas after I read about FriendFeed chat-rooms, which made me think of the pubsub#replyroom option.

And BTW, Jabber-Feed, an ATOM Pubsub WP plugin would probably be useful for DiSo.
Richard Marr says:

Jun 12th at @ 2am

Good post. It’d also be interesting (for me at least) to look at how to manage varying expectations of privacy for different types of activity.
Engin Erdogan says:

Jun 12th at @ 10am

Hi Chris,

It is interesting to think about a standardized way of capturing meta-data of use/activities. I believe that there might be a lot of interesting uses of this type of information.

About a month and a half ago, with a similar motivation in mind, we started User Labor. With User Labor, we propose an open data structure, User Labor Markup Language (ULML), to outline the metrics of user participation in social web services. You can check it out at http://userlabor.org.

Shoot me an email if you would like to talk more about this.
Dan Benyamin says:

Jun 12th at @ 12pm

Fantastic Chris, count me as a sponsor of this! 😉
Luigi Montanez says:

Jun 12th at @ 5pm

It’s interesting. Technological innovation has always ridden on the back of standards. AC electricity at 120V. 8 bits in a byte. TCP/IP. HTTP. XML. Then in the mid 90’s, the entirety of Silicon Valley seemed to have collectively thought, “We don’t need any more standards”. They proceeded to build services (some awesome, some not so much), and forgot about forging standards to provide some foundation for the innovation they were doing.

Since 1997, there really hasn’t been a truly game-changing, market-shifting standard to come about. The closest thing I can think of is RSS, but regular folks really don’t use it.

The work DiSo is doing by leveraging OpenID, OAuth, Atom, XMPP, et al. to create an infrastructure based on open standards really is quite revolutionary considering what the industry has been like since the 90’s.
Pingback: Recent Links on Ma.gnolia at Fast Wonder Blog
Stephen Paul Weber says:

Jun 16th at @ 8am

I’m going to suggest (as Chris has) an XHTML-compatible approach (since RSS, Atom, XMPP, and arbitrary XML can embed XHTML data). It seems to me we have a way to represent all of this except the verbs, i.e.:

Chris
started listening to
a station on
Pandora
3 hours ago.

If more data is needed (e.g., listened to a song with artist and title), then use the correct existing format (e.g., hAudio), which the parser knows is present based on the verb.
Stephen Paul Weber says:

Jun 16th at @ 8am

An example of this idea applied to Twitter by an experimental version of my actionstream plugin:

<li class="hentry service-icon service-twitter actionstream-group-42514be9c2c6fabf146d7bb55ba3cc242">
  <span class="author vcard" style="display:none;">
    <a class="url fn nickname" href="http://twitter.com/singpolyma">singpolyma</a>
  </span>
  <a rel="bookmark" href="http://twitter.com/singpolyma/statuses/835641714" class="verb" title="posted message">tweeted</a>,
  "<span class="entry-title entry-content">Anti C61 images: http://www.cmcarts.ca/DRM.htm</span>"
  <abbr class="published" title="2008-06-15T22:17:09-04:00">@ 2008-06-15 22:17</abbr>
</li>
ian kennedy says:

Jun 17th at @ 10am

Thanks for kicking this off Chris. We’ve done a lot of thinking about this as well at MyBlogLog and Yahoo and look forward to contributing to this discussion to include more services.

Don’t forget the translation of these verbs into additional languages as well as shorter representations for mobile interfaces.

One thing I would advise today is that the grouping you talk about needs a time-sensitive trigger. On MyBlogLog we group a user’s tweets into hourly clusters for the first 24 hours, but after that we group all tweets into daily clusters. Otherwise Twitter (last.fm also shows this trait) tends to dominate the lifestream.
Shahar Nechmad says:

Jun 17th at @ 10am

Great post.
I think this standard should also include the notion of importance.
Today, when you subscribe to so many activity streams, they start to have the same problem as RSS: there are too many things to read.

If you tweet about a new job you got, I’m sure you see this as something important enough to push to all your friends.
Whereas if I just tweet a remark about the game I’m watching now, I understand that it’s not important enough to interrupt my friends’ activity at the moment.
Pingback: Defining information relevance for location based services - part 1 « Geographically challenged
kael says:

Jun 18th at @ 1pm

FYI, there exists a new Ejabberd-OpenID module and an Ejabberd-AtomPub module is being written.

They might be of interest, IMHO.

BTW, the OpenID form considers that the length of my base64 OpenID URL is too long, hence the truncating on my first comment.
Siddey says:

Jun 18th at @ 6pm

A very timely post Chris. We’re soon going to see an explosion in location based services take place and players in that space would do well to heed your advice and ensure they can effectively associate stream data with real-world objects where appropriate. I think we’ll see a slight shift in focus from being purely people based, i.e. the “it’s all about me” social factor; to us becoming obsessed with, “tell me about…” exploration as a result of attaching contextualised geo-coded feeds to real-world objects.
kael says:

Jun 19th at @ 12am

The Ejabberd-AtomPub module has been released.
Pingback: Lifestream Posts & Pages for July 8th 2008 | Lifestream Blog
nippur says:

Nov 10th at @ 4am

Great post, Chris!

The examples on your wiki helped me to implement my own version of a primitive activity stream.

I’d like to know if there are any efforts to implement this kind of plugin for Joomla. So far I have seen some plugins for WordPress.
Do you know of any development effort being made for Joomla?
Chris Messina says:

Nov 10th at @ 7am

@nippur: I don’t know of an activity streams plugin for Joomla and a quick Google search turned up zero related results. Perhaps an opportunity for you? 😉
Pingback: The End of Email «