The human information filter

Sites like del.icio.us lead the way in the Internet’s grand experiment in information routing

In last week’s column, I mentioned del.icio.us, Joshua Schachter’s “social bookmarking” service. Since then, I’ve explored the service more deeply in a series of blog entries. Using del.icio.us, I’m now able to process information in dramatically more efficient ways. Let’s look at some of the reasons why.

For starters, del.icio.us is a machine-independent way to store bookmarks. From any Web page, you can use a bookmarklet to post the page’s URL, title, description, and a set of keywords or tags. From any computer, you can then recover the page by searching for text in the title or description or by navigating to it using one of its tags.

Dumping your own information into a service is always a concern. What if the service goes belly-up? You need an exit strategy, and del.icio.us provides exactly the right kind. A simple URL retrieves all your posts as an XML file. I now run a scheduled daily fetch of that URL, so that everything I add to del.icio.us is backed up locally.
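A minimal sketch of that daily backup, in Python, might look like this. The endpoint follows the del.icio.us v1 API convention of the time, and the credentials and file name are illustrative assumptions:

```python
# Sketch of a daily backup fetch for del.icio.us bookmarks.
# The posts/all endpoint and HTTP Basic auth follow the v1 API
# convention; the URL and credentials here are assumptions.
import base64
import urllib.request

API_URL = "https://api.del.icio.us/v1/posts/all"

def build_request(user, password, url=API_URL):
    """Return a GET request with an HTTP Basic auth header attached."""
    req = urllib.request.Request(url)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    return req

def backup(user, password, path="delicious-backup.xml"):
    """Fetch every post as one XML document and save it locally."""
    with urllib.request.urlopen(build_request(user, password)) as resp:
        data = resp.read()
    with open(path, "wb") as f:
        f.write(data)
    return path
```

Run from cron (or any scheduler), this is the whole exit strategy: one URL, one file.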

A clean exit strategy is obviously desirable. Less obvious but equally crucial is a robust entry strategy. How easily can you import your own data into the service? The test case here was an XML file with hundreds of my blog entries. Thanks to the simplicity of del.icio.us’ API, which follows the REST (representational state transfer) style, it passed the test with flying colors. After tagging the entries with keywords, I transformed the file into the set of URLs needed to populate my slice of the del.icio.us namespace. Suddenly, my blog entries and InfoWorld columns became navigable in a new and powerful way.
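The transformation is simple enough to sketch. Each tagged entry becomes one REST-style URL; the posts/add endpoint name follows the v1 API convention, and the sample entry is an illustrative assumption:

```python
# Sketch of an entry strategy: turn tagged items into del.icio.us
# posts/add API URLs. The endpoint follows the v1 API convention;
# the item structure and sample entry are assumptions.
from urllib.parse import urlencode

ADD_URL = "https://api.del.icio.us/v1/posts/add"

def add_post_url(item):
    """Build the GET URL that would add one bookmark via the API."""
    params = {
        "url": item["url"],
        "description": item["title"],
        "tags": " ".join(item["tags"]),  # tags are space-separated
    }
    return ADD_URL + "?" + urlencode(params)

entry = {"url": "http://example.com/2004/01/web-services",
         "title": "Notes on lightweight web services",
         "tags": ["web-services", "rest"]}
print(add_post_url(entry))
```

Mapping a few hundred entries through a function like this yields the batch of URLs that populates your namespace.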

Of course, most blogging systems support categorized browsing. But I quit using my blog that way because I wasn’t interested in building a private taxonomy. A tag in del.icio.us is really a topic in a publish/subscribe network. When I assign a tag to an item, I’m routing the item to a topic. Anyone who subscribes to that topic using its RSS feed can monitor the items flowing to it.

If anyone can publish to a topic, won’t the signal-to-noise ratio degrade? Yes, but del.icio.us has another ace up its sleeve. For a given topic, you could subscribe to all items, but you might rather subscribe to postings only from people whose views on that topic you trust. On the topic of social software, for example, Clay Shirky and Sébastien Paquet are two observers who would make excellent filters.
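The two subscription styles differ only in the shape of the feed URL. This sketch assumes the del.icio.us RSS URL patterns of the time, and the username shown is hypothetical:

```python
# Sketch of the two subscription styles: a whole topic's feed vs. one
# trusted person's slice of it. The RSS URL patterns follow the
# del.icio.us conventions of the time and are assumptions here.
BASE = "http://del.icio.us/rss"

def topic_feed(tag):
    """Everything anyone posts to a tag: complete, but noisy."""
    return f"{BASE}/tag/{tag}"

def filtered_feed(user, tag):
    """Only one trusted person's postings to that tag."""
    return f"{BASE}/{user}/{tag}"
```

Swapping the topic feed for a handful of filtered feeds is how you trade completeness for signal.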

In a March 2003 column, I wrote about the challenges of doing publish/subscribe at Internet scale. David Rosenblum, who was then CTO of messaging startup PreCache, had described to me an optimization procedure he called “filter merging.” The architecture of del.icio.us lends itself to just that kind of optimization. The combination of several trusted human filters, with respect to some topic of interest, yields a powerful merged filter.
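The merged filter amounts to a union of the trusted feeds, with duplicates collapsed. A minimal sketch, assuming each feed has already been parsed into (URL, title) pairs:

```python
# Sketch of "filter merging": combine bookmarks from several trusted
# users on one topic, keeping each URL once. The input format, lists
# of (url, title) pairs per user, is an illustrative assumption.
def merge_filters(feeds):
    """Union the feeds, deduplicating by URL in first-seen order."""
    seen = set()
    merged = []
    for feed in feeds:
        for url, title in feed:
            if url not in seen:
                seen.add(url)
                merged.append((url, title))
    return merged
```

When two trusted people bookmark the same item, the overlap is itself a signal; a refinement could rank items by how many filters they passed through.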

Nothing about del.icio.us is rocket science. A competent developer could re-create the service in short order. And that’s one of its greatest strengths. We’re all becoming information routers, but we’re still discovering how the process needs to work. To do the experiment, we’ll need flexible and lightweight systems that are easy to implement, join, use, and build on. Joshua Schachter has shown how to build the right kind of laboratory.