InfoWorld
STRATEGIC DEVELOPER  

Paving the information footpaths

Information systems should adapt to our usage patterns, but making that happen is no easy task

By Jon Udell  
May 04, 2005
 

I’m sure there are dozens of versions of this story, but I heard it from Larry Wall, the father of Perl, and it goes like this: Instead of laying down sidewalks, the builders of a new university campus waited for footpaths to emerge on the lawns. Then they paved the footpaths. Larry designed Perl around this idea of structure emerging from use, but that was an unusual case. We typically lay down the sidewalks first, and when footpaths emerge we profess surprise or try to ignore them.

I recently learned, for example, that developers have found an unexpected use for the new XML data type in SQL Server 2005 (code-named Yukon). Although Yukon embeds the .Net CLR (Common Language Runtime) and stores CLR objects in database columns, people have also been serializing CLR object data and storing it as XML. It’s true that there’s an 8KB limit on stored CLR objects, but that’s not the only reason folks are coloring outside the lines. They’re also escaping constraints that make it hard to evolve data structures in response to patterns of use.

Another example comes from a recent conversation with John Schneider of AgileDelta, who analyzed message traffic flowing through military systems during a four-year period. Although the messages were in theory governed by schemas, in practice nearly all of them extended or deviated from those schemas. Of course that didn’t prevent soldiers from calling in air strikes. A system that simply failed on receipt of an invalid message could not survive in the fog of war.
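One way to picture a receiver that survives deviation from its schema is sketched below. This is only an illustration, not a description of the systems Schneider studied: the field names and the `read_message` helper are hypothetical, and the "schema" is reduced to a set of expected element names. The point is that unknown fields are set aside and missing ones are reported, rather than the whole message being rejected.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: the element names a status message is expected to carry.
EXPECTED_FIELDS = {"sender", "position", "time"}

def read_message(xml_text):
    """Parse a message, keeping known fields and setting aside unknown ones.

    The input must still be well-formed XML; only schema deviations
    (extra or missing fields) are tolerated.
    """
    root = ET.fromstring(xml_text)
    known, extra = {}, {}
    for child in root:
        bucket = known if child.tag in EXPECTED_FIELDS else extra
        bucket[child.tag] = (child.text or "").strip()
    missing = sorted(EXPECTED_FIELDS - set(known))
    return known, extra, missing

# A message that extends the schema (priority) and omits a field (time).
message = """
<status>
  <sender>unit-7</sender>
  <position>grid 12-34</position>
  <priority>urgent</priority>
</status>
"""

known, extra, missing = read_message(message)
print("known:", known)
print("extra:", extra)
print("missing:", missing)
```

The extras collected this way are also the raw material for paving footpaths: fields that keep showing up outside the schema are candidates for the next version of it.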

We see this kind of example everywhere on the Web. Many, if not most, Web pages are malformed. If browsers had required correct HTML, the Web would have been stillborn. Similarly, RSS, arguably the most popular application of XML, has no schema. If XML parsers had required schema validation in addition to well-formedness, the blogosphere never would have emerged.

We’re learning to tolerate these sloppy practices, and even to appreciate them, but we haven’t really begun to work shoulder-to-shoulder with them. Here are two strategies for doing that: opportunistic enhancement and statistical classification.

My recent XQuery experiments illustrate the idea of opportunistic enhancement. Virtually all the blogs I read can be converted from HTML to XHTML. In that form they can be pumped into an XML database and queried in a structured way. If the structure implicit in content is revealed and made useful, we might kick off a virtuous cycle.
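As a rough sketch of that pipeline, the code below turns sloppy HTML into a well-formed tree and then queries it. It is not the XQuery setup described above: it assumes Python's standard `html.parser` and `xml.etree.ElementTree` in place of an HTML-to-XHTML converter and an XML database, and the `to_tree` helper, the `VOID` set, and the sample markup are all made up for illustration. Unclosed tags are simply nested where they fall, so the tree is well-formed but not necessarily what a browser would build.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never take a closing tag (a partial list, for illustration).
VOID = {"br", "img", "hr", "meta", "link", "input"}

class SloppyHTMLToTree(HTMLParser):
    """Build a well-formed element tree from possibly malformed HTML."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        attrib = {name: value or "" for name, value in attrs}
        elem = ET.SubElement(self.stack[-1], tag, attrib)
        if tag not in VOID:
            self.stack.append(elem)

    def handle_endtag(self, tag):
        # Close back to the nearest matching open tag; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        top = self.stack[-1]
        if len(top):  # text that follows a child element is that child's tail
            top[-1].tail = (top[-1].tail or "") + data
        else:
            top.text = (top.text or "") + data

def to_tree(html):
    parser = SloppyHTMLToTree()
    parser.feed(html)
    parser.close()
    return parser.root

# Unquoted attributes, an unclosed <a>, an unclosed <p>, and a bare <br>.
sloppy = ('<div class=post><h2>Paving footpaths</h2>'
          '<p>See <a href="http://example.org/a">this<br> and '
          '<a href=http://example.org/b>that</a></div>')

root = to_tree(sloppy)
title = root.find(".//div[@class='post']/h2").text
links = [a.get("href") for a in root.iter("a")]
xhtml = ET.tostring(root, encoding="unicode")  # serializes as well-formed XML
print(title)
print(links)
```

Once the markup is in this form, the structure that was implicit in it (post titles, outbound links) can be pulled out with path expressions instead of string matching.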

The social tagging systems are a laboratory in which techniques of statistical classification will be explored. As Clay Shirky has pointed out, the terms “movies,” “film,” and “cinema” are not just synonyms; they encode real cultural differences. A taxonomy that stamps out those differences won’t serve the various constituencies. We can still build systems around taxonomies, but we have to let the footpaths emerge, and in this realm they’re just fuzzy statistical traces.
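To make "fuzzy statistical traces" a little more concrete, here is a toy sketch that measures how much two tags overlap in use. It assumes one simple measure, Jaccard similarity over the sets of items each tag was applied to. The bookmark data is invented, and nothing here reflects how any particular tagging service works.

```python
from collections import defaultdict

# Invented toy data: (item, tags that users applied to it).
bookmarks = [
    ("item1", {"movies", "reviews"}),
    ("item2", {"movies", "film"}),
    ("item3", {"film", "cinema", "criticism"}),
    ("item4", {"cinema", "criticism"}),
    ("item5", {"movies", "showtimes"}),
]

def items_by_tag(bookmarks):
    """Map each tag to the set of items it was applied to."""
    index = defaultdict(set)
    for item, tags in bookmarks:
        for tag in tags:
            index[tag].add(item)
    return index

def jaccard(a, b):
    """Shared items divided by all items carrying either tag."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

index = items_by_tag(bookmarks)
for left, right in [("movies", "film"), ("film", "cinema"), ("movies", "cinema")]:
    print(left, right, round(jaccard(index[left], index[right]), 2))
```

In this made-up sample the three tags overlap only partially, which is the kind of signal a fixed taxonomy would erase by treating them as one term.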

It’s easy to criticize information systems that fail to embrace sloppiness. It’s much harder to explain how they should embrace it. Sloppiness is only a means to an end. In order to make things work and get things done, we need to codify patterns of use. It’s a catch-22, though. The right patterns don’t emerge from systems that people won’t use. How we reconcile specification with emergence isn’t an engineering discipline, but it probably should be.

Jon Udell is lead analyst and blogger in chief at the InfoWorld Test Center.
