I’d never received an Open Letter until Alan Morrison posted one earlier today in response to an interview I gave to Straight.com about microformats and the (lowercase) semantic web. For the sake of completeness, here’s what he wrote:
Chris, judging from your interview in Straight.com, you seem like a thoughtful guy. But you don’t seem to understand that the Microformat and Semantic Web folks aren’t that far apart. You cite the prevalence of non-standard HTML to support your contention that we’ll never use ontologies. But in the same article, you say the comic book store you frequent has its own iPhone app. So people can write their own iPhone apps (or at least have friends write apps for them), but they can’t put together their own ontologies?
Simple tagging has obvious benefits–just look at the popularity of folksonomies. I don’t disagree with you at all there. But one of the advantages of the RDF/RDFS/OWL family of standards is that it’s a metadata umbrella–it can make use of various kinds of metadata, and then add to these. But it certainly helps if the metadata are consistent.
The big advantage of RDF, which you seem to miss entirely, is that it’s a data model that improves on RDBMSes from a data integration standpoint. It’s a data model truly designed for the Web. Have you thought about this at all from the data model level?
I’m not a religious zealot when it comes to standards. Microformats sounds as reasonable as RDFa to me, except that the former have no infrastructure underneath them and aren’t consistent.
PwC devoted an entire issue of its Tech Forecast to describing the necessity for this infrastructure and how companies are now using the one the W3C’s developed. If you read this, it might fill in some knowledge gaps for you. It does seem to make good sense for you to build on what others have started, even if you quibble with bits and pieces of it.
I responded to his post with the following comment:
Thanks Alan. I’m happy to take all criticism, corrections and feedback on my perspective. I certainly don’t think that I have all the answers, but I do try to be pragmatic.
I think that I do understand the value of RDF — in theory — but in my world — the social web — I’ve seen very few success stories, or examples in the wild, where RDF and its sibling technologies have made anything demonstrably easier or more ubiquitous. I’ve had the praises of RDF et al sung to me for many years, and yet I consistently see companies large and small run for the hills when it’s mentioned.
Meanwhile, microformats have seen much wider adoption in the wild on the open web — not least of which came in recent successes as Microsoft, Google and Yahoo all have shipped products that leverage various microformats (imperfect though they are, they work with the HTML-based web that people know how to develop for).
Now, I do think that there are success stories out there for RDF et al… namely in the medical and pharmaceutical industries. But what I’ve heard is that those companies are loath to share the fruits of their labor with the wider community, resulting in non-interoperable ontologies. I thought interoperability was the whole point!
As with most of the things I work on, I can be convinced of most anything if you can demonstrate successes that make sense to me and that resonate beyond me, to unfamiliar audiences. Part of the work of a designer-slash-web-evangelist is listening to the problems that people are experiencing, synthesizing what they’re saying, and then putting together the people who are all having the same issues [so that they can collaborate on solutions].
Outside of academic circles, I’ve just not seen the kind of human-scale successes that convince me that the world at large is ready to contemplate the intricacies of getting involved with the semantic web. I’d love to be proven wrong here, so if you have examples to the contrary (besides arguments), I’d be happy to check them out!
So, am I wrong or misguided? I’m waiting for the social network that’s built on RDF that my mom will use, but I’ve just not seen it yet! (And yes, she is on Facebook now!).
Also, by “human-scale”, what I mean is technology that can be authored at the level of the individual — with little depth of learning. HTML is what I would consider “human-scale”, since a lot of people figure out how to write it without formal computer science training. Microformats nestle nicely into HTML writing skills, and so I consider them human-scale.
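To make the “human-scale” point a little more concrete, here is a minimal sketch of what consuming a microformat can look like. It assumes a made-up HTML snippet (the name, organization, and example.com URL are all hypothetical) marked up with the hCard class names `vcard`, `fn`, and `org`, and it reads them back using only Python’s standard library. It is deliberately simplified and does not implement the full hCard parsing rules.

```python
from html.parser import HTMLParser

# Hypothetical snippet: ordinary HTML with hCard class names sprinkled in.
SAMPLE = """
<div class="vcard">
  <a class="fn url" href="http://example.com/">Jane Example</a>
  <span class="org">Example Comics</span>
</div>
"""


class HCardSketch(HTMLParser):
    """Collect the text of elements whose class includes a wanted property.

    Simplified on purpose: no nesting rules, no value excerpting, and it
    assumes every start tag has a matching end tag (no void elements).
    """

    WANTED = {"fn", "org"}

    def __init__(self):
        super().__init__()
        self.card = {}
        self._stack = []  # one entry per open element: its property, or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        prop = next((c for c in classes if c in self.WANTED), None)
        self._stack.append(prop)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Attribute text to the innermost open element if it carries a property.
        if self._stack and self._stack[-1] and data.strip():
            self.card[self._stack[-1]] = data.strip()


parser = HCardSketch()
parser.feed(SAMPLE)
print(parser.card)
```

The author of the page only had to add a few class names to markup they were already writing; everything else in the sketch is the consumer’s problem, not theirs.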
18 Comments
Chris, you haven’t missed anything, if anything you’ve been very gentle about how you’ve made the case for microformats.
This isn’t the first (and I doubt it will be the last) time that the mistaken claim of “have no infrastructure underneath them and aren’t consistent.” has been made. Usually the folks that have made this claim haven’t even bothered to take a look at XMDP profiles.
http://gmpg.org/xmdp/description
through the use of which, including ids on each property, every microformats property has or can have a URI.
Basically, for those that care, XMDPs for microformats are kind of like DTDs for HTML. If you care about such things, you can include them, and the infrastructure that comes with them.
The claims of “aren’t consistent” are rarely backed up either. Anyone who thinks they’ve found inconsistencies is welcome to add them to the respective “issues” page on the microformats wiki, e.g. the page for hCard issues is:
http://microformats.org/wiki/hcard-issues
Many issues have been added and resolved ( http://microformats.org/wiki/hcard-issues-resolved ) and I’m working on updating hCard accordingly as just one example.
Your point of microformats being “human scale” is precisely correct.
With microformats we’re focusing on helping *humans* represent information for and by *humans*, rather than “a data model that improves on RDBMSes” – for all practical purposes, the vast majority of those who author and develop for the web don’t care (and shouldn’t have to care) about improving a “data model”. For example, the information architecture (IA) of a website, which affects how users fundamentally perceive and understand the information conveyed, is far more important than any internal data model.
A generic data model truly designed for the Web would be designed for the *humans* who author and develop for the web, for representing information about and by *humans*, and would model the ways such information is already being published and presented on the web, again, by *humans*.
I’ve yet to see such a “generic data model” (note: anything with namespaces = non-starter for humans who author for the web).
microformats (as currently specified) don’t seek to be such a generic data model either.
However, perhaps with microformats we can represent some of the common forms of data on the Web that humans are frequently publishing, presenting, and sharing (such as people, relationships, events, reviews, etc.) and therefore provide immediate usefulness by solving real problems and use cases.
And perhaps with the real world experience and understanding gained by solving such real world problems, we will gain the necessary wisdom to abstract from those solutions and develop a data model designed in a human-friendly way for the Web.
In the history of science, many fields benefited a great deal when a new generation discarded “what others have started” and began at first principles, rebuilding along the way. Lately (and especially after hearing @hober rant about many of the same things you mention) I wonder if that’s something the web might benefit from.
I wonder what markup would look like if it were invented today…from scratch. It’s a great thought experiment.
Chris,
I guess in fairness, the recent Google success for microformats is also a success for RDF(a) – but I do very much agree that the baby-steps approach of microformats means adoption is far more likely, and definitely an easier sell!
David,
from a pragmatic perspective, technologies have considerably greater vested interests than scientific theories. With science, you have reputations. And on the other side, if anything, textbook publishers love a good paradigm shift – more books to sell.
With technologies, you have reputations, *plus* the investment of tool developers, browser developers, the implementors themselves (in this case web developers etc), and many others. That makes technological paradigm shifts considerably harder than scientific ones.
Successful standards start simple… and microformats are just that – I can add a sprinkle of microformats to my website without changing any infrastructure and provide immediate, often significant value. RDF on the other hand quickly gets fugly and unusable.
A particularly interesting application I recently saw of microformats was the blending of the human and programmable webs in “RESTful Web Services”, where both people and computers used HTML marked up with microformats. I’m not entirely convinced by this approach, but it’s certainly innovative.
Anyway, it could be that you’re both right… microformats scratch an itch now and RDF provides a heavier, long term solution for more complex problems.
Sam
Hi Chris,
You may be in the unique position to lead us to drop this false dichotomy. Microformats and RDF are different tools for different jobs. It is irrelevant now that microformats were born in opposition to RDF and positioned as the scrappy contender. No one needs to convince you that RDF has a role nor vice versa. Every application designer can do the research and use the applicable technologies.
So, please, declare the “debate” over.
Thanks,
Rick
For the last few years I have enthusiastically followed the microformat project of Tantek and yourself. Like you I have evangelized the simplicity and value of microformats, especially given that I use the Operator plugin for Firefox to detect and occasionally reuse this microformat metadata. For example, Twitter supports microformats and Operator lets me reuse some of the data in other apps. But the activity and noise around microformats appears to have diminished of late.
Like you I have tried to follow the W3C work on RDF, OWL and SPARQL and I have to admit I have found it difficult to grasp and without examples in the wild even harder to justify the time to conquer the steep learning curve. That said given the recent announcements from Google (RDFa support) and Yahoo (SearchMonkey), I am once again trying to grasp the complexity in order to better understand the future value of LinkedData.
The good news is real world semantic web examples are beginning to appear in products like Twine, Glue, Zemanta, OpenCalais and FreeBase, which make it easier to understand how the semantic web might work for me in the future and give me hope that the time I might invest now in learning semantic technologies may not be totally wasted or supplanted by some new semantic technology/syntax.
That said I too recently came across a conundrum in regard to RDF(a) and microformats proving the lack of knowledge I currently have because I still have no answer.
Like you and many others I am a user of Twitter. I am also an advocate of XMPP (push real-time) and Atom XML datastreams. So I live in hope that Twitter will change its model back to an Atom Data Stream pushed over XMPP in order to deliver genuine realtime data. Not the pseudo HTTP-based pull model they use today.
If that happened I then began wondering what the metadata model was for this Atomstream. After some research it was clear that it certainly wasn’t going to be RDF (this example here shows how to turn Twitter Atom feeds into an RDF Store along with triples http://www.devx.com/semantic/Article/40869) or microformats. (Even if the twitter client already supports hAtom, hCard and XFN.)
I then began to think that the way you were developing ActivityStreams with Nouns, Verbs and Objects was very similar to RDF triples and could possibly be the metadata layer to Atom inside an activitystream, enabling apps to extract metadata about the activitystream. I hope this is how you are thinking about the way forward, as a new approach to metadata.
Additionally I think the ActivityStreams Atom extension can also be developed to meet your goals for picoformats. Given the limitations of 140-character microblogging, adding extra syntax to the tweet msg could prove a constraint unless Twitter enabled the extra 20 characters they hold in reserve (160 characters) for the use of picoformat syntax.
Instead I would prefer to see the ActivityStream format develop further to provide the metadata layer, especially given the way Google gWave and Laconica are developing (Federated XMPP and Atom XML).
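As a rough illustration of the similarity described above, the sketch below writes one activity (actor, verb, object) in the same shape as an RDF-style triple (subject, predicate, object). The URIs, the `Triple` type, and the `activity_to_triple` helper are all hypothetical names made up for this example; they are not taken from the ActivityStreams drafts or from any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A bare subject/predicate/object statement, RDF-style."""
    subject: str
    predicate: str
    object: str


def activity_to_triple(actor: str, verb: str, obj: str) -> Triple:
    """Map an activity's actor/verb/object onto subject/predicate/object."""
    return Triple(subject=actor, predicate=verb, object=obj)


# Hypothetical identifiers, for illustration only.
activity = activity_to_triple(
    actor="http://example.com/users/alice",
    verb="http://example.org/verbs/post",
    obj="http://example.com/notes/1",
)
print(activity.predicate)
```

The mapping is one-to-one only at this very coarse level; whether a real activity stream format would carry more (timestamps, targets, context) than a single triple can hold is a separate question.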
The question is will ActivityStreams borrow more from the RDF syntax or from the microformat syntax?
David, I wonder what examples you have in mind of those starting over points in the history of science? My understanding is that where one paradigm has replaced another, it’s done so with gradual generational shift rather than a collective agreement to set aside one way of thinking for another.
The web’s nature doesn’t seem to afford clean breaks, and it may be that that nature is required for the kind of success it’s seen. In order for there to be uptake and innovation within it, a framework seems to need a certain amount of flexibility. At the risk of misreading the history, comparing SGML with HTML with XML there’s a trend of relaxing ontological rigidity and a corresponding trend of uptake and innovation. It may be that to get total buy-in to one way of describing resources is to reach a point of stagnation.
You say that there are no social networking sites “built on RDF”, but it could also be said that there are none “built on microformats” either.
Sure, quite a few publish hCard or XFN, but none use it as the foundation for their core offerings. If the hCard and XFN class/rel values were stripped from their markup, the sites would not cease to function. hCard and XFN is just part of their HTML output template and could be removed on a whim.
Even for those few social sites that do support hCard/XFN import, it’s going to be a second tier input mechanism compared to manual entry of contact details / friends.
Chris,
I think you made your case well. The weakest point of semantic web technology is still its difficulty for ordinary web developers. That’s one of the reasons they prefer REST or JavaScript over SOAP for Web APIs. I think the same rule applies to the competition between microformats and RDFa: microformats are still the easier of the two.
Front-end developers who deal with the “view” layer are not familiar with the complexity of logic and metadata in the real world. Microformats also harmonize with HTML itself in various respects. BTW, microformats were one answer to the question of how to name things in “class” or “id” attributes.
Chris, did you see the news that Tim Berners-Lee’s been appointed by UK PM Gordon Brown to advise on open data? http://www.vnunet.com/computing/news/2243867/berners-lee-open-government. He’s already tapping W3C folks to help him: http://twitter.com/NovakKevin. Agree @Rick Thomas that the debate is over. From a big picture POV, this is not an either-or issue. @Sean McBride could be making a good point, but the main W3C Sem Web standards have momentum, and they do for a reason. The issue is simply bigger than the territory Microformats have staked a claim to. I’ve taken my post down.
Sam Johnston: pointing to an OWL file generated by Protégé in RDF/XML as an argument that RDF *as a whole* is ugly is a bit like pointing to a minified JavaScript library in order to say that JavaScript as a whole is ugly. Almost everyone involved in RDF these days sees RDF/XML as just a transmission format – nobody is supposed to read that by hand. You’d convert it to Turtle or – in the case of an OWL ontology – view it in something like Protégé. And I don’t need telling how ugly RDF/XML is – I’ve written an RDF/XML parser (and am trying to find the time to write another).
And even if that RDF/XML is ugly, there’s one thing which should be very pleasing to the eye when reading that: that it contains a ton of URIs. URIs mean extensibility. Extensibility means not having to live with broken design decisions of yester-year. Extensibility means being able to replace only the broken bits while keeping the good bits. And it means you can take good designs and add new stuff experimentally. Vocabularies for expressing data need to be able to be designed by individuals and communities on their own, in much the same way that websites can be.
The URI is the thing. What makes URIs different is that unlike any other data format, you can follow them to find out more. You can’t do that with ISBNs or DOIs, usernames, foreign keys in relational databases. RDF is really the intuition that foreign keys need to be URIs, not integers. You can’t stick an integer or a username into Firefox or Safari. If I get some weird bunch of XML or JSON come down the pipe, I need to know in advance what makes it tick. Sometimes that’s obvious, but a lot of the time it isn’t. Separating the syntax and the semantics is a good first step to solving that problem.
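A small sketch of the “foreign keys need to be URIs, not integers” idea, under stated assumptions: the two records, the example.org address, and the `is_followable` helper are all hypothetical, and the check only looks at whether a key has the shape of an http(s) URI. It does not fetch anything or prove the URI resolves.

```python
from urllib.parse import urlparse


def is_followable(key) -> bool:
    """True if the key looks like an http(s) URI a client could dereference."""
    if not isinstance(key, str):
        return False
    parts = urlparse(key)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# The same record twice: once with a database-style integer key,
# once with a URI standing in for that key (both made up).
row_with_int_key = {"title": "Some Book", "author": 42}
row_with_uri_key = {"title": "Some Book", "author": "http://example.org/people/42"}

print(is_followable(row_with_int_key["author"]))
print(is_followable(row_with_uri_key["author"]))
```

With the integer, a consumer has to know in advance which table or API the 42 belongs to; with the URI, the key itself says where to look next.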
Tom: I’m just saying it how I see it as a Semantic (big “S”) outsider… sure it’s probably the long term solution but microformats et al are far more approachable today.
Sam
Being on the “outside” of the Microformats group, “Microformats” seems to define a project / group of people more than a platform that anyone can build on in a wholesale manner.
An individual format, like hCard is something that people can build on–and hCard is a good thing. But it seems uncritical to say that it’s a Microformat and therefore more / all Microformats are a good thing. Unless you are suggesting that “Microformats” define something independent of any group of people–e.g., it’s just the common idea of using class names to make it easier to find data embedded in HTML, in which case I am misunderstanding your point.
In contrast, if RDFa ever proves itself to be a good thing, it really is a “thing” in that sense of being a platform technology or technique for any kind of content / data, that any individuals can choose to build on independently. RDFa’s semantic is another generic one like HTML, rather than content specific ones like hCard or hAtom.
I do think it’s fair to say that hCard is useful and “so what” if you can do RDFa-vCard since people are already using hCard. Those are content specific arguments. And most of the important “data” needs people have can be defined in content-specific terms.
So, it’s great that the Microformat group is focused on providing content-specific semantics, as opposed to general ones.
But, it’s also a good thing that RDFa is focused on a generic semantic, rather than content-specific ones. It offers other opportunities / different ways to approach the unending challenges of managing content / data.
With both RDFa and Microformats, the real measure isn’t the number of pages that have those formats built-in (especially when many of those pages are really just a single template being used over and over): the real measure is the number of interesting consumers of those formats. And, generic consumers (e.g., for RDFa) have totally different strengths and weaknesses against content-specific consumers (e.g. hCard, RDFa-vCard). People, in the general case, likewise need the flexibility to choose between those approaches.
Neither wrong nor misguided.
This comment was originally posted on FriendFeed
I really love the term "human-scale", totally agree with the possible adoption rate. And that drives the future.
This comment was originally posted on FriendFeed
The belief that ultra-minimal and human-friendly microformats can’t have a consistent underlying infrastructure is misguided. One can be terse and logically consistent in designing semantic markup languages. RDFa is much too cluttered. Rebuild from the ground up.
This comment was originally posted on FriendFeed
Enjoyed Chris’s thoughtful response & defense of microformats: http://tr.im/fj_rdf_mf via @chrismessina #rdf #semanticweb #debate
This comment was originally posted on Twitter
While the #html5 vs microdata fight rages, there’s a gentlemanly (for now) #microformats vs RDF punch-up going on http://bit.ly/IrnLo
This comment was originally posted on Twitter