March 24, 2009

Slacker databases break all the old rules

Amazon SimpleDB, CouchDB, Google App Engine, and Persevere may have a better way of storing data for your Web app

So you've got some data to store. In the past, the answer was simple: Hook up an official database, pour the data into it, and let the machine sort everything out for you while you spend your time writing big checks to the database manufacturer. Now things aren't so cut and dry. A fresh round of exciting new tools is tacking the two letters "db" onto a pile of code that breaks with the traditional relational model. Old database administrators call them "toys" and hint at terrible dangers to come from the follies of these young whippersnappers. The whippersnappers just tune out the warnings because the new tools are good enough and fast enough for what they need.

The non-relational upstarts are grabbing attention because they're willfully ignoring many of the rules that codify the hard lessons learned by the old database masters. The problem is that these belts-and-suspenders strictures often make it hard to create really, really big databases that suck up all of the cycles of a room full of machines. Because all Web application designers dream of building a startup that needs a really big room filled with machines to hold all of the data of all of the users, the rules need to be bent or even broken.

[ For a brief look at more alternative databases, see Open source and SaaS offerings rethink the database. Catch InfoWorld's cloud computing reviews and analysis: Cloud versus cloud: Amazon, Google, AppNexus, and GoGrid | Inside Amazon Web Services | App builders in the sky | Windows Azure Services Platform gives wings to .Net | What cloud computing really means. ]

The first thing to go is the venerable old JOIN. College students used to dutifully work through exercises that taught them how to normalize the data by breaking the tables up into as many parts as practical. Disk space was expensive then, and a good normalization expert could really pack in the data. The problem is that JOINs are really, really slow when the data is spread out over several machines. Now that disk space is so cheap and many of the data models don't benefit as much from normalization, JOINs are easy to leave behind.

The next trick is to start using phrases like "eventual consistency." Amazon's documentation for SimpleDB includes this inexact promise: "Consistency is usually reached within seconds, but a high system load or network partition might increase this time." The new twerps really get those codgers steamed when they talk about how all of the computers in the cluster will get around to replicating the data and giving consistent answers when the machines are good and ready. For the kids, consistency is akin to cod liver oil or making your bed in the morning.

Test Center Scorecard
25%25%20%20%10%
Amazon SimpleDB88898
8.2
Very Good
25%25%20%20%10%
Apache CouchDB77879
7.4
Good
25%25%20%20%10%
Google App Engine88898
8.2
Very Good
25%25%20%20%10%
Persevere Server87879
7.7
Good
Close

On Twitter now

Database management systems

Powered by Twitter
additional resources
White Paper - How to Improve Delivery of Advanced Web Applications

White Paper

Virtual Workforce: The Key to Expanding The Business While Cutting Costs

Get the independent advice and expertise you need to support a virtual workforce.

Go inside:
The three-step approach to making a virtual workforce a reality.
The four flavors of client virtualization technologies.
The three key initiatives that solve IT challenges.
Download now »
White Paper: Successfully Secure Your Wireless LAN With Wi-Fi firewalls.

White Paper

Addressing Linux Threats Leveraging Fewer Resources

The increase in Linux popularity has increased the frequency and sophistication of malware attacks. Read this 2 page white paper now to learn how you can protect your Linux environment with real-time protection that is certified by all major Linux vendors.

Download now »
White Paper - The 2009 Handbook of Application Delivery

White Paper

The 2009 Handbook of Application Delivery

Ensuring acceptable application delivery will become even more difficult over the next few years. As a result, IT organizations need to ensure that the approach that they take to resolving the current application delivery challenges can scale to support the emerging challenges. This handbook elaborates on the key tasks associated with planning, optimization, management and control and provides decision criteria to help IT organizations choose appropriate solutions.

Download now »
White Paper - Is Your Backup System Outdated?

White Paper

Mid-range Storage Considerations

A common misconception is that mid-range storage requirements are dramatically different than that of a larger enterprise. Mid-range storage users may require less capacity, but they have similar functionality and management requirements. This ESG paper examines mid-range storage needs and reviews a new solution that adjusts size while retaining value, performance and functionality.

Download now »
bosley 4-Apr-09 1:22pm
Take a look at DovetailDB. http://www.millstonecw.com/dovetaildb/ I found DovetailDB today and I was up and running very quickly. It's simple, straightforward, robust, very friendly, and not overly ambitious because it's easy to extend. My search is over.

Sign up to receive InfoWorld Resource Alerts

Subscribe to the Today's Headlines: First Look Newsletter

Find out what will be news for the day, with our first-thing-in-the-morning briefing.

©1994-2010 Infoworld, Inc.