Data portability – Factory Joe

After Social Graph FOO Camp — and a challenge for the Data Portability Group

This past weekend I attended a topic-specific FOO Camp called Social Graph FOO Camp (otherwise known as SGFOO) organized by Scott Kveton and David Recordon (or ray-chor-dohn according to Larry).

Scott’s write up is pretty complete, but I wanted to call out one specific outcome that I think is worth noting.

On Sunday, we had a significant discussion on data portability and about the activities, responsibilities and opportunities of and for the eponymous group, which has recently generated much hype and buzz but little (as far as I’ve seen) clarity and/or cogent strategy for advancing its expansive charter:

The purpose of this project is to put all existing data portability technologies and initiatives in context and to promote viable reference implementations (blueprints) to the developer, vendor, and end-user communities.

The frustration over the minimal barrier to “becoming a member” of the group (you simply have to sign up for a mailing list) and the focus on large vendors without advancing an agenda with teeth and clearly defined metrics for success was palpable. But so was the desire to make some progress, and if not come to complete agreement, to at least identify concerns shared by the majority of us and perhaps develop a strategy to deflate the hype to date and get the group moving in a productive direction.

My suggestion was to emulate the work that Tara and I have been doing on the Open Media Web project, which developed out of our work with Songbird where we could sense that there was a real opportunity to explore, but didn’t yet have a clear picture either of the space as it was understood by lead users and experts or of the outcomes that needed to be advocated. So rather than diving in and promoting technologies or tactics before we had identified the opportunities, challenges and boundaries of the problem domain, we decided to pursue an investigatory strategy, starting with a series of meetups, blog posts and interviews (vlogged on Viddler) that might help us flesh out the actors, ideas and conversations that were already ongoing in the space.

The result of my proposal is captured in this post by Chris Saad to the Data Portability mailing list. I think this is a positive step, and one that I hope will give Data Portability some direction and good work to do over the next several weeks and months. I’d like to go a step further and flesh out my thinking, however, before this project gets underway.

These interviews should really be conducted assassin-style (as I like to say) where someone (probably Chris Saad) goes to each major vendor represented (and pimped) by the group (e.g. Google and Facebook, Plaxo, Microsoft, LinkedIn, Flickr, Six Apart, MyStrands, et al.) and solicits written (or video) answers to the same five or six questions. Each of these interviews should subsequently be posted to the data portability blog over a series of months.
The goal of these ongoing interviews should be to discover primarily: 1) why these companies joined the group and what their goals are; 2) what they think of when they say “data portability”; 3) what challenges they are facing when it comes to offering their vision of data portability at their company; 4) what they see as the greatest benefits of data portability; 5) what they are doing (if anything) to promote and advance data portability within their organization; and 6) what technologies they have implemented (or plan to implement in the next six months) in support of data portability. From these answers, I think we can start to recognize trends in the headspace of large social networking sites as well as begin to call out certain technologies that might be worth picking up and evangelizing, especially in the interest of interop between multiple parties’ sites.
As such, the advocacy of any particular technological solution by the data portability group today should be immediately abandoned until further research and exploration has occurred. While I was happy to see my favorite stable of technologies listed on the group’s homepage in the early days, I now realize that technology is not the hard part; it’s actually the politics, the policies, the usability and impact on and perception of the individual data owners that are really the first order priorities. Without beginning to address issues in those areas first, the technology conversation will never occur.
In terms of timing, I think that the data portability group has come along more or less at the right time, but that it’s actually walking into the problem ass-backwards. What we don’t need right now is a lot of hype and glorification of an abstruse notion of data portability. In fact, data portability by itself is currently meaningless and intangible; without good examples of how it can be applied to make things better for companies’ customers, there will never be an economic imperative to move in this direction (I should point out that data portability is interesting to me because increased customer choice is interesting to me, and thereby competition in the space is beneficial to the customers of such services). For a timely example of a positive case where data portability is making a difference, consider the ability to move your bookmarks from del.icio.us to Ma.gnolia in light of Microsoft’s looming acquisition bid for Yahoo!. Surely there are other equally beneficial applications of data portability, and building out these use cases in terms of end-user benefit is critical to continuing to make the case for data portability with credibility.

So anyway, I do believe that there is an opportunity here and Chris Saad is correct that getting a number of the prominent players in this arena to come to the table on this topic is a feat; however, simply bringing them together without engaging with the gnarly problems and policies that have kept data portability from becoming a reality could bring more confusion and angst than benefit. Deflating the hype and going back to humble beginnings and simple questions is, in my not-so-humble opinion, the appropriate and most effective way forward. Data portability is still not obvious for most people or most companies — heck, the technologies that enable it are barely out of their 1.0 and 2.0 phases yet — and still this topic is one that captures people’s imaginations and lets them imagine countless “what if” scenarios that seem, somehow, just around the corner. Data portability is a critical topic, and with the advances in the state of the conversation we had over the weekend, I’m eager to see the members of the data portability group pick up the ball and keep moving it forward.

So, if this topic is something that interests you, I recommend you blog about it, talk about it, interpret it and really take some time to consider what data portability means to you, and why it matters (or doesn’t) to you. Me, Larry and Matt Biddulph of Dopplr rapped about this stuff some more on our Citizen Garden podcast today, so if you’re looking for more information, ideas or fodder, you might go ahead and give it a listen.

The inside-out social network

Anne Zelenka of Web Worker Daily and GigaOM fame wrote me to ask what I meant by “building a social network with its skin inside out” when I was describing DiSo, the project that Steve Ivy and I (and now Will Norris) are working on.

Since understanding this change that I envision is crucial to the potential wider success of DiSo, I thought I’d take a moment and quote my reply about what I see as the benefits of a social network built inside-out:

The analogy might sound a little gruesome I suppose, but I’m basically making the case for more open systems in an ecosystem, rather than investing in or producing more closed off or siloed systems.

There are a number of reasons for this, many of which I’ve been blogging about lately.

For starters, “citizen centric web services” will arguably be better for people over the long term. We’re in the toddler days of that situation now, but think about passports and credit cards:

your passport provides proof of provenance and allows you to leave home without permanently giving up your port of origin (equivalent: logging in to Facebook with your MySpace account to “poke” a friend — why do you need a full Facebook account for that if you’re only “visiting”?);

your credit/ATM cards are stored value instruments, making it possible for you to make transactions without cash, and with great convenience. In addition, while you should choose your bank wisely, you’re always able to withdraw your funds and move to a new bank if you want. This portability creates choice and competition in the marketplace and benefits consumers.

It’s my contention that, over a long enough time horizon, a similar situation in social networks will be better for the users of those networks, and that as reputation becomes portable and discoverable, who you choose to be your identity provider will matter. This is a significant change from the kind of temporariness ascribed by some social network users to their accounts today (see danah boyd).

Anyway, I’m starting with WordPress because it already has some of the building blocks in place. I also recognize that, as a white male with privilege, I can be less concerned about my privacy in the short term to prove out this model, and then, if it works, build in strong cross-silo privacy controls later on. (Why do I make this point? Well, because the network that might work for me isn’t one that will necessarily work for everyone, and so identifying this fact right now will hopefully help to reveal and prevent any assumptions from being embedded in the privacy and relationships model early on.)

Again, we’re in the beginning of all this now and there’ll be plenty of ill-informed people crying wolf about not wanting to join their accounts, or have unified reputation and so on, but that’s normal during the course of an inversion of norms. For some time to come, it’ll be optional whether you want to play along of course, but once people witness and come to realize the benefits and power of portable social capital, their tune might change.

But, as Tara pointed out to me today, the arguments for data portability thus far seem predicated on the wrong value statement. Data portability in and of itself is simply not interesting; keeping track of stuff in one place is hard enough as it is, let alone trying to pass it between services or manage it all ourselves, on our own meager hard drives. We need instead to frame the discussion in terms of real-world benefits for regular people over the situation that we have today and in terms of economics that people in companies who might invest in these technologies can understand, and can translate into benefits for both their customers and for their bottom lines.

I hate to put it in such bleak terms, but I’ve learned a bit since I embarked on a larger personal campaign to build technology that is firmly in the service of people (it’s a long process, believe me). What developers and technologists seem to want at this point in time is the ability to own and extract their data from web services to the end of achieving ultimate libertarian nirvana. While I am sympathetic to these goals and see them as the way to arriving at a better future, I also think that we must account for those folks for whom Facebook represents a clean and orderly experience worth the exchange of their personal data for an experience that isn’t confounding or alienating and gives them (at least the perception of) strong privacy controls. And so whatever solutions we develop, I think the objective should not be to obviate Facebook or MySpace, but to build systems and to craft technologies that will benefit such sites and make them more sustainable and profitable, but only if they adopt the best practices and ideals of openness, individual choice and freedom of mobility.

As we architect this technology — keeping in mind that we are writing in code what we believe should be the rights of autonomous citizens of the web — we must also keep in mind the wide diversity of the constituents of the web, that much of this has been debated and discussed by generations before us, and that our opportunity and ability to impose our desires and aspirations on the future only grows with our successes in freeing from the restraints that bind them, the current generation of wayward web citizens who have yet to be convinced that the vision we share will actually be an improvement over the way they experience “social networking” today.

Data banks, data brokers and citizen bargaining power

I wrote this this morning in a notebook as a follow up to my post yesterday… and since I don’t have time to clean it up now, I thought I’d present it in raw, non-sensible form. Maybe there’s some value in a rough draft:

It’s like giving our money to a bank and having them turn around and sell our data to try to upsell us on loans and all kinds of … oh wait, but the key difference is if we do get fed up, we can take our money out and go elsewhere, depriving the bank of the ability to both target us with their partners’ ads and the ability to compound interest on our savings.

We need data brokers introduced into the system — organizations who are like safety deposit receptacles for our data — and who speak all APIs and actually advocate on our behalves for better service based on how “valuable” we are — this is necessary to tip the scales in our favor — to reintroduce a balancing force into the marketplace because right now the choice to leave means dissing our friends — but if I’m not satisfied but still want to talk to my friends, why can’t I be on the outside, but sending messages in? hell I’m willing to pay — in momentary access to my brokered personal profile — for access to my friends inside the silo. This is what Facebook is doing by shutting down so many accounts — it’s not personal — it’s protecting its business. They don’t want to become a myspace cesspool, succumb to empty profiles and Gresham’s Law — overrun with spam profiles and leeches and worthless profile data — a barren wasteland for advertisers who want to connect with that 8% of their customers who make up 32% of their revenue.

No it’s in data fidelity, richness, ironically FB took it upon themselves to weed out the bad from the good in their system-wide sweeps. Unfortunately they got it wrong a bunch of times. If Facebook allowed the export of data and became a data broker for its users — provided some citizen agency to its customers — there would be economic — as well as social — benefits to maintaining a clean and rich profile — beyond just expressiveness to one’s friends. For better or worse, FB users have a lot of benefit through the siloed apps of that F8 platform — but the grand vision should be closer to what Google’s marketing department christened “OpenSocial”… still though, the roles of banker and broker have yet to be made explicit and so we’ve leapt to “data portability” for nerds, forgetting that most people 1) don’t care about this stuff 2) are happy to exchange their data for services as long as their friends are doing it too 3) don’t want to be burdened with becoming their own libertarian banker! Dave Winer might want to keep everything in an XML file on his desktop, but I know few others who, IRL, feel the same way.

Thus concludes my rough notes.

So, if Facebook were perceived as a big Data Bank in the sky, how would that change things? Would people demand the ability to “withdraw” their data? Does the metaphor confuse or clarify? In any case, what is the role of data banks and data brokers? Is there a difference if the data container leverages the data for their own benefit? If they sell advertising and don’t provide a clear or universal means to opt-out? And what’s in the way of making more “benevolent” data vaults a reality — or how do we at least bring the concept into the discussion?

I have no personal interest in the concept, only that a viable alternative to the siloed approach is missing from the discussion. And going back to the business models of OpenID and other identity providers… well, if any, that’s it. It’s like having a credit card with access to no credit — what’s the point? And OpenID becomes more valuable the more data capital it has access to. Or something like that.

Oh, and I’d like to quote something poignant that Anders Conbere said to me today in chat:

I was talking with my friend the other day and I tried to explain to him, that what I fear about facebook that I don’t fear about pretty much any other vendor is its continued development as a competing platform to the web. a locked in, proprietary version and what I see, is just like Microsoft leveraged Windows as a “platform for application development” facebook is doing that for web development. what it offers developers is the simplicity and security of a stable development environment at the cost of innovation because as we’ve seen, as market share grows, the ability to innovate decreases (since your success is tied to the backwards compatibility of your platform) and I see the possibility of facebook becoming a dominant platform for web application development which will in turn lead to two decades of stagnation

So yeah, put that in your bonnet and smoke it. Or whatever.

Data portability and thinking ahead to 2008

So-called data portability and data ownership is a hot topic of late, and with good reason: with all the talk of the opening of social networking sites and the loss of presumed privacy, there’s been a commensurate acknowledgment that the value is not in the portability of widgets (via OpenSocial et al.) but instead (as Tim O’Reilly eloquently put it), it’s the data, stupid!

Now, Doc’s call for action is well timed, as we near the close of 2007 and set our sights on 2008.

Earlier this year, ZDNet predicted that 2007 would be the year of OpenID, and for all intents and purposes, it has been, if only in that it put the concept of non-siloed user accounts on the map. We have a long way to go, to be sure, but with OpenID 2.0 around the corner, it’s only a matter of time before building user prisons goes out of fashion and building OpenID-based citizen-centric services becomes the norm.

Inspired by the fact that even Mitchell Baker of Mozilla is talking about Firefox’s role in the issue of data ownership (In 2008 … We find new ways to give people greater control over their online lives — access to data, control of data…), I think this is going to be the issue that most defines 2008 — or at least the early part of the year. And frankly, we’re already off to a good start. So here are the things that I think fit into this picture and what needs to happen to push progress on this central issue:

Economic incentives and VRM: Doc is right to phrase the debate in terms of VRM. When it comes down to it, nothing’s going to change unless 1) customers refuse to play along anymore and demand change and 2) there’s increased economic benefit to companies that give back control to their customers versus those companies that continue to either restrict or abuse/sell out their customers’ data. Currently, this is a consumer rights battle, but since it’s being fought largely in Silicon Valley where the issues are understood technically while valuations are tied to the attractiveness a platform has to advertisers, consumers are at a great disadvantage since they can’t make a compelling economic case. And given that the government and most bureaucracy is filled with stakeholders who are hungry for more and more accurate and technologically-distilled demographic data, it’s unlikely that we could force the issue through the legal system, as has been approximated in places like Germany and the UK.
Reframing of privacy and access permissions: I’ve harped on this for a while, but historic notions of privacy have been outmoded by modern realities. Those who do expect complete and utter control need to take a look at the up and coming generation and realize that, while it’s true that they, on the whole, don’t appreciate the value and sacredness of their privacy, and that they’re certainly more willing to exchange it for access to services or to simply dispense with it altogether and face the consequences later (eavesdroppers be damned!), their apathy indicates the uphill struggle we face in credibly making our case.
Times have changed. Privacy and our notions of it must adapt too. And that starts by developing the language to discuss these matters in a way that’s obvious and salient to those who are concerned about these issues. Simply demanding the protection of one’s privacy is now a hollow and unrealistic demand; now we should be talking about access, about permissions, about provenance, about review and about federation and delegation. It’s not until we deepen our understanding of the facets of identity, and of personal data and of personal profiles, tastestreams and newsfeeds that we can begin to make headway on exploring the economic aspects of customer data and who should control it, have access to it, and can create, read, update, and delete it.
Data portability and open/non-proprietary web standards and protocols: Since this is an area I’ve been involved in and am passionate about, I have some specific thoughts on this. For one thing, the technologies that I obsess over have data portability at their center: OpenID for identification and “hanging” data, microformats for marking it up, and OAuth for provisioning controlled access to said data… The development, adoption and implementation of this breed of technologies is paramount to demonstrating both the potential and need for a re-orientation of the way web services are built and deployed today. Without the deployment of these technologies and their cousins, we risk web-wide lock-in to vendor-specific solutions like Facebook’s FBML or Google’s OpenSocial, greatly inhibiting the potential for market growth and innovation. And it’s not so much that these technologies are necessarily bad in and of themselves, but that they represent a grave shift away from the slower but less commercially-driven development of open and public-domained web standards. Consider me the frog in the lukewarm water recognizing that things are starting to get warm in here.
Citizen-centric web services: The result of progress in these three topics is what I’m calling the “citizen-centric web”, where a citizen is anyone who inhabits the web, in some form or another. Citizen-centric web services are, of course, services provided to those inhabitants. This notion is what I think is going to, and should, drive much of the thinking in 2008 about how to build better citizen-centric web services, where individuals identify themselves to services, rather than recreating themselves and their so-called social-graph; where they can push and pull their data at their whim and fancy, and where such data is essentially “leased” out to various service providers on an as-needed basis, using OAuth tokens and proxied delegation to trusted data providers, rather than on a once-and-for-all basis; where citizens control not only who can contact them, but are able to express, in portable terms, a list of people or companies who cannot contact them or pitch ads to them, anywhere; where citizens are able to audit a comprehensive list of profile and behavior data that any company has on file about them and to be able to correct, edit or revoke that data; where “permission” has a universal, citizen-positive definition; where companies have to agree to a Creative Commons-style Terms of Access and Stewardship before being able to even look at a customer’s personal data; and that, perhaps most important to making all this happen, sound business models are developed that actually work with this new orientation, rather than in spite of it.
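
To make the idea of "leased" data, a portable do-not-contact list and an auditable access record a little more concrete, here is a rough Python sketch. It is only an illustration under my own assumptions: the names `Lease`, `DataVault`, `grant`, `read` and `revoke` are hypothetical, the token is a random string rather than a real OAuth token, and nothing here implements OAuth, OpenID or any existing service's API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
import secrets


@dataclass
class Lease:
    """A time-limited grant of access to some fields of a profile (hypothetical)."""
    token: str
    service: str
    fields: frozenset
    expires_at: datetime
    revoked: bool = False

    def is_active(self, now: datetime) -> bool:
        # A lease is usable only until it expires or the citizen revokes it.
        return not self.revoked and now < self.expires_at


@dataclass
class DataVault:
    """Hypothetical citizen-controlled data store; not a real OAuth provider."""
    profile: dict
    blocked_services: set = field(default_factory=set)  # do-not-contact list
    leases: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def grant(self, service: str, fields: set, ttl: timedelta, now: datetime) -> Lease:
        # Services on the do-not-contact list can never obtain a lease.
        if service in self.blocked_services:
            raise PermissionError(f"{service} is on the do-not-contact list")
        lease = Lease(secrets.token_hex(8), service, frozenset(fields), now + ttl)
        self.leases[lease.token] = lease
        return lease

    def read(self, token: str, field_name: str, now: datetime):
        lease = self.leases.get(token)
        allowed = (
            lease is not None
            and lease.is_active(now)
            and field_name in lease.fields
        )
        # Every attempt is recorded so the citizen can audit who asked for what.
        service = lease.service if lease else None
        self.audit_log.append((now, service, field_name, allowed))
        if not allowed:
            raise PermissionError("no active lease covers this field")
        return self.profile[field_name]

    def revoke(self, token: str) -> None:
        self.leases[token].revoked = True


now = datetime(2008, 1, 1, tzinfo=timezone.utc)
vault = DataVault(profile={"name": "Ada", "email": "ada@example.org"})
lease = vault.grant("photo-site", {"name"}, timedelta(hours=1), now)
print(vault.read(lease.token, "name", now))  # prints Ada
```

The point of the sketch is the shape of the relationship rather than the mechanics: access is scoped to named fields, it lapses on its own, it can be withdrawn at any time, and every request leaves a trace the citizen can review.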

So, in grandiose terms I suppose, these are the issues that I’m pondering as 2008 approaches and as I ready myself for the challenges and battles that lie ahead. I think we’re making considerable progress on the technology side of things, though there’s always more to do. I think we need to make more progress on the language, economic, business and framing fronts, though. But, we’re making progress, and thankfully we’re having these conversations now and developing real solutions that will result in a more citizen-centric reality in the not too distant future.

If you’re interested in discussing these topics in depth, make it to the Internet Identity Workshop next week, where these topics are going to be front and center in what should be a pretty excellent meeting of the minds on these and related topics.