
PimpMyHTML

The Multipack are throwing a Zen Garden-esque contest called PimpMyHTML. They really should have called it PimpMyMicroformats given that they’re using the following microformats:

The rules are pretty straightforward, and as I’m a fan of constraints and AJAX+CSS+XHTML, I’m looking forward to the entries:

You can’t change the HTML at all.
Use Best Practices of both CSS and JavaScript.
Limited to a maximum of 20kb of JavaScript.
No more than 5 image files can be used. (If you use your noggin, this isn’t much of a problem.)

And as you’re limited in filesize, it might be worth your while to go check out the forthcoming Mootools and read up on some ideas on exploiting CSS instead of using JavaScript. And while you’re at it, Molly has two articles on microformats definitely worth a gander.

OmniWeb 5.5 out, based on WebKit

Moving off of outmoded WebCore, The Omni Group has come out with OmniWeb 5.5, based on WebKit. In my tests, it still has some bugginess loading certain websites, but on the whole, it’s a solid browser that I find myself using as my trusty secondary (after Camino, of course).

This release is important because it adds yet another to the growing stable of WebKit browsers in the wild.

It also sets a number of precedents with regard to visual tabs and customizing your surfing experience web-wide or at the individual site level, and it adds RSS subscriptions to its standard feature set. It’s still not the perfect browser, but it’s certainly a contender (though with a $30 price tag, I’m not sure anyone but Omni fans will be willing to ante up with so many decent competitors out there).

—

Have you ever danced with your software?

If Windows, Linux, Ubuntu or Mac OSX were dance partners, how would they dance? Would they lead, or would you?

More importantly, would you accept a second dance if any one of them offered?

Hyperscope and the future of the past

Photo by John Lester.

I can’t quite tell how significant this is, but I know that it’s been a long time coming and that, only over time, will we begin to understand what this system will really mean for information systems.

In classic understated flair, Doug, Eugene and Brad will be releasing the Web 2.0 version of Doug Engelbart’s Hyperscope to the world tonight.

It’s hard to describe succinctly, but basically it’s taking hypertext and adding the “hyper” to it (today’s web linking is kind of like the Model-T compared to Engelbart’s space age original 1968 vision). You’ve really got to try it for yourself to see what I mean; what at first seems like a big outline (it’s cleverly built on top of OPML) quickly becomes an immersive experience that other systems can’t match in depth and flexibility.

In some respects, this kind of learnable system is what I was talking about in my post on learning from game design. The only presumption, or goal, of the Hyperscope system is that you’re interested in working with knowledge and information — how you go about finding, linking to, appending or operating on that information is up to you.

All that, plus the fact that it’s built on Alex Russell’s Dojo Toolkit, is an achievement in open source cross-pollination that should also be duly recognized.

Congrats guys.

Patents: the tar pits of modernity

Photo © copyright Adam Loeffler.
I don’t understand why someone hasn’t patented the patent process and shut down the whole racket. Nothing has inspired more fear, created more anger and resentment, or held back innovation in the POMO world (thanks Dave!) more than the US intellectual property system, most notoriously copyright and patents.

Now, I’m not an intellectual property communist — far from it. In fact, I’m very much about people getting credit for their work, for their inventions, their ideas and in due time, compensation — both economic and social.

But the system is effed. And as there are alternatives to copyright and trademark, there similarly needs to be an open source alternative to patents, that allows the creative and ingenious to receive credit and kudos without creating a chilling effect on future and subsequent derivative innovation — innovation that has historically been built on borrowed and hacked ideas. Innovation necessary for human progress to continue at the pace it’s at today.

It’s bad enough that sentient creatures will look back centuries, if not decades, from now and laugh at how we humans smogged ourselves to death. Oh no, they’ll also double over in hysterics at how we held back our creatives by denying them the freedom to dabble freely and openly without the fear of both being blatantly ripped off and being slapped with a lawsuit for violating someone else’s property rights. “What a bunch of cheap trust they had back then,” they’ll quip. “It’s a wonder that the little guys continued to play along even after the whole balance had shifted away from protecting them to protecting the overbloated incumbents!”

I mean, how else can you explain this quote from Christopher Lunt, formerly Friendster’s senior director of engineering (recently made refamous for their social networking patent)?

“My approach was defensive,” he said. “We were not looking to stifle creativity by competitors, nor to make money by licensing. We were making sure that things material to our business were protected, so someone else couldn’t claim the idea.”

“I dislike the current patent process,” Lunt added. “I feel it’s a little too permissive in terms of what is granted as a patent. But that doesn’t mean I can ignore it.”

Gah!! What a waste! Of money! Of talent! Of time! To have to register defensive patents is bullshit. The answer, quite simply, is something more proactive… more positive… duh! It’s open patent licensing! And why our legal system hasn’t codified this yet, well, that’s because you don’t make money off of open systems; you make money because of openness. And that’s something that our legal system, at least the purveyors of the modern legal system, couldn’t give a rat’s patootie about. It’s far too subtle. Kind of like that boiling frog in Al’s movie… or the dinosaurs paying no heed as the weather was getting colder… before the Ice Age. Or as the black ground started rising above their shins… drowning in the refuse of their own ancestors’ remains.

Fighting spam: Call in Akismet!

Original photo courtesy of Rich Legg. Used with permission.

It’s painful to watch the many approximate pattern-based spam-fighting attempts that come up from time to time that we all know will eventually be made obsolete. Ultimately such tricks will only end up leading to more time spent weeding out false positives while the spammers stay ahead of the curve (it is their business, after all).

So not long ago, I started dumping an external catch-all account into Gmail. Since I use a new email address with every account and new beta that I sign up for (in order to catch offenders who leak my data — GoDaddy being the worst as domain registration records are public unless you pay), I started getting blasted with spam sent to randomly generated addresses.
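To make the one-address-per-signup trick concrete, here is a minimal sketch in Python of how mail arriving at a catch-all domain could be traced back to the service that leaked the address. The domain, the local parts, and the `ISSUED` table are all hypothetical placeholders, not my actual setup.

```python
from collections import Counter

# Hypothetical catch-all domain: any address at this domain lands in one inbox.
CATCH_ALL_DOMAIN = "example.com"

# Hypothetical record of which local part was handed to which service.
ISSUED = {
    "godaddy": "GoDaddy",
    "somebeta": "Some Beta",
}


def classify_recipient(address):
    """Return the service an address was issued to, or None if it was never issued."""
    local, _, domain = address.lower().partition("@")
    if domain != CATCH_ALL_DOMAIN:
        return None
    return ISSUED.get(local)


def tally_spam(recipients):
    """Count spam per issuing service; anything unrecognized is grouped as 'random'."""
    counts = Counter()
    for address in recipients:
        counts[classify_recipient(address) or "random"] += 1
    return counts
```

A high count for one service suggests that service leaked (or published) the address, while a high "random" count points to spammers guessing addresses at the catch-all domain.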

Initially Gmail did an incredible job catching the spam; since I’ve been using this technique for the past two months, Gmail has easily caught over 250,000 spam messages.

Now, that’s not to say it’s perfect. In fact, especially lately, far from it. Though Gmail is in the unique position to harvest email from across its entire user-base and adapt its algorithm instantly the moment one of its accounts gets hit, it still can’t catch everything. So, even as this is one of the biggest advantages of using a hosted email service like Gmail, it still lets more spam through than I’d like.

As far as I know, Google does not exchange spam data with other email providers (though maybe it does, I’m not sure). Whatever the case, I’m always interested in diverse tactics for dealing with spam. And given the success I’ve found with Automattic’s spam-squashing Akismet plugin on my blog, I wonder if this technology couldn’t be adapted for email?

In particular, I think that early adopters suffer from a different kind of spam abuse than most. That’s only a hunch, but I think that we make ourselves more vulnerable, especially when using catch-all accounts (a cardinal sin of spam management, from what I hear).

Perhaps the application of Akismet to the early-adopter spam problem could act as an additional networked preventative measure, leveraging spam trends across all email platforms, just as Akismet is starting to do for blogging platforms.

I dunno, I’m not an expert in this domain, but Akismet is one of the most promising instances of spam fighting and prevention that I’ve found and I’d love to have the same peace of mind in my email that it affords me on my blog. Could we give an Akismet-bot POP3 access to Gmail and let it loose? Better yet, could we run Akismet client-side as a Greasemonkey script or Firefox extension? Again, the details probably aren’t as important as the results.

So, Matt, what’d it take to sic Akismet on my email?

An open letter to Blogger

With Blogger in the throes of a new beta cycle, it seems the ideal moment to get microformats support into one of the more popular blogging platforms on the web.

With that goal in mind, I sent an open request letter to the Blogger-Help discussion group. No responses yet, but if you’re interested in seeing this happen, please follow up in whatever way you think might be most effective… thanks!

Hello,

Not sure to whom I should address this request, but I’m very excited about the Blogger Beta and that it represents an open opportunity to add support for microformatted content.

You can read more about microformats at microformats.org, but to summarize, microformats are community-developed standards for identifying certain kinds of information in webpages using your typical HTML tags and classes.

In particular, this is my wishlist of microformats that I would love to see Blogger support:

rel-tag: okay, you already took care of this one, so kudos!

XFN: WordPress already supports this, and it’s especially useful for representing lists of friends in blogrolls.

rel-me: from the XFN family, being able to link to other pages on the web using rel="me" creates an informal means of “claiming” other places where I publish online. Read about Ma.gnolia’s addition of rel-me.

hCard: marking up personal profiles in hCard means that if I add personal contact details, people can click a link to add me to their address book without any extra typing. I’ve done this on my main blog. Clicking the “Add me to your address book” link will convert the HTML content in that page into a .vcf file that most address book programs can recognize.

hCalendar: In order to make it easy for my readers to add events that I’ve blogged about to their calendars (Google Calendar or others, like iCal), I can use hCalendar to mark up this information with a link to add the events to their calendar. Here’s an example.

hAtom: This one is fairly simple to implement since you’re already classing most of this information. hAtom uses element names from Atom as class names. This allows people to subscribe to blogs directly, without the need to subscribe to RSS. You can read more about this.

Though the benefits of supporting microformats may not seem immediately obvious, the amount of effort required to add support is fairly minimal compared with other, more substantial features that you’re probably already working on. Furthermore, our community would be happy to help with the process of adding support to Blogger, validating your work and providing guidance along the way. This initiative is also not a commercial effort; rather, it represents the work of a large, distributed, worldwide community that wants to build out the value of the “lowercase semantic web” and to make data storage in web pages a reality.

In some respects, we are at a chicken-and-egg crossroads but the more support that we see for microformats in the wild, the more tool makers, publishers, browsers and other applications will reap the benefits of this effort to essentially modernize the web, incrementally building upon the existing infrastructure.

Thanks for your consideration and please let me know if there is any way that I can be of service.

Chris
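To illustrate the hCard-to-.vcf conversion mentioned in the letter, here is a rough sketch in Python using only the standard library. It is not how the “Add me to your address book” link is actually implemented; the `HCardParser` class and `hcard_to_vcard` function are hypothetical names, and the sketch only handles a few text-valued properties (`fn`, `org`, `tel`) inside an element classed `vcard`. Properties whose value lives in an attribute (such as `url` or a `mailto:` link) are left out, as are value escaping and the `N` property that a complete vCard 3.0 file would need.

```python
from html.parser import HTMLParser

# Small subset of hCard class names mapped to vCard property names.
HCARD_TO_VCARD = {"fn": "FN", "org": "ORG", "tel": "TEL"}

# Elements that never get an end tag, so they must not go on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class HCardParser(HTMLParser):
    """Collects text from known hCard properties found inside class="vcard"."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._open = []  # class list of every currently open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self._open.append((dict(attrs).get("class") or "").split())

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._open:
            self._open.pop()

    def handle_data(self, data):
        # Ignore text that is not inside an hCard root element.
        if not any("vcard" in classes for classes in self._open):
            return
        # Attribute the text to the innermost element carrying a known property.
        for classes in reversed(self._open):
            props = [c for c in classes if c in HCARD_TO_VCARD]
            if props:
                for prop in props:
                    self.fields[prop] = self.fields.get(prop, "") + data
                return


def hcard_to_vcard(html):
    """Return vCard text for the hCard properties found in the given HTML."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    lines = ["BEGIN:VCARD", "VERSION:3.0"]
    for prop, vcard_name in HCARD_TO_VCARD.items():
        value = " ".join(parser.fields.get(prop, "").split())
        if value:
            lines.append(f"{vcard_name}:{value}")
    lines.append("END:VCARD")
    return "\r\n".join(lines) + "\r\n"
```

The point is that the contact data is already sitting in the page’s ordinary HTML classes, so the conversion is just a walk over the markup. The sketch assumes well-formed markup with matching end tags.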

Google Image Labeler relies on crowdshop labor

Folks are buzzing about Google’s new ~~time wasting~~ playable Image Labeler. Philipp Lenssen says:

More than a game, for Google this is a way to tag images using human brain power… to improve their image search results. Two people finding the same tag can serve as validation the tag makes sense. I suppose for Google it’s not important that two people find the same keywords at the same time – they can simply let people tag the images and then add any threshold they want (like “4 people must have chosen this tag for it to become a confirmed tag”).
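The threshold idea Philipp describes can be sketched in a few lines of Python. This is only an illustration of the “N people must have chosen this tag” rule, not Google’s actual implementation; the `confirmed_tags` function and the shape of the submission data are assumptions.

```python
from collections import defaultdict


def confirmed_tags(submissions, threshold=4):
    """Return {image_id: sorted tags} for tags chosen by at least `threshold` distinct players.

    `submissions` is an iterable of (player_id, image_id, tag) tuples.
    """
    voters = defaultdict(set)
    for player_id, image_id, tag in submissions:
        # Normalize case and whitespace so "Dog" and "dog " count as the same tag.
        voters[(image_id, tag.strip().lower())].add(player_id)

    result = defaultdict(list)
    for (image_id, tag), players in voters.items():
        if len(players) >= threshold:
            result[image_id].append(tag)
    return {image_id: sorted(tags) for image_id, tags in result.items()}
```

Counting distinct players rather than raw submissions keeps one enthusiastic player from confirming a tag alone, which is the same validation the two-player matching provides in the live game.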

Both Search Engine Watch and TechCrunch made the connection to research conducted by Luis von Ahn at my alma mater that was first blogged about as early as December last year (written up in the Pittsburgh Post-Gazette in August 2005).

According to Danny Sullivan at Search Engine Watch, the Google technology is indeed based on von Ahn’s work:

Yes, Image Labeler is based on my ESP Game, which Google licensed. I’m not employed by Google, however, since I’m a full-time faculty member at Carnegie Mellon.

In my experience, I found the images were often too small to make out clearly, whereas in similar systems like Amazon’s Mechanical Turk, you get much higher resolution photos.

Interestingly, Riya uses a similar but closed system of human tagging to populate its object search. It’s unclear how such a system scales to web-wide results unless something like Google’s or Amazon’s tool finds enough widespread pick-up and opens up an API to the tagged images.

CrossOver Beta brings PC apps to the Dock

My open source buddy-slash-analyst Raven Zachary (who also brought me news of the Green Phone) pinged me to tell me that CodeWeavers have launched the beta of CrossOver for the Mac, a full WINE environment port to OSX that lets you run Windows apps without… Windows! (…unlike Parallels which is a virtual machine.)

I wrote about this idea in July and it appears that the reality of OSX subsuming Windows is coming ever-closer.

Though many of the folks who are most excited about this are gamers, Raven’s screenshot proves how valuable and convenient this will be for Mac web developers who have been locked out of native Internet Explorer testing.

Oh, and pre-ordering saves you $20, gets you 3 months of extra service and a free upgrade to CrossOver Mac 6.0 (just an FYI).

Open source OCR in the wild

About as sexy as an eye exam, but damn, this technology is difficult to get right. So yesterday Google announced the open sourcing of Tesseract OCR, character/text-recognition software originally developed at HP back in the ’80s and ’90s that Google claims is better than most of the open source alternatives (I’d believe that) but not quite as good as some of the commercially available technologies (I’d buy that too).

But hmm, isn’t there a lot that could be done with this? Personally, I can’t wait until we see this make its way into OpenOffice among other places.