I'm at the Cloud Connect 2010 conference in Santa Clara, Calif., one of the first major gatherings of the year on cloud computing. One of the larger topics to come up thus far is moving away from relational databases for data persistence. Called the "NoSQL" movement, it is about leveraging databases that may be able to handle larger data sets more efficiently. I've already written about the "big data" efforts that are emerging around cloud, but this is a more fundamental movement to drive data back to more primitive, but perhaps more efficient, models and physical storage approaches.
NoSQL systems typically work with data in memory or load chunks of data from many disks in parallel. The issue is that "traditional" relational databases don't provide the same models and, thus, the same performance. While this was fine in the days of databases with a few gigabytes of data, many cloud computing databases are blowing past a terabyte, and we'll see huge databases supporting cloud-based systems going forward. Relational databases are contraindicated for operations on large data sets because SQL queries tend to consume many CPU cycles and thrash the disk as they process data.
If you think we've heard this song before, you are correct. Object and XML databases made some inroads back in the 1990s, but many enterprises kept relational databases such as Oracle, Sybase, and Informix around, despite the fact that many nonrelational databases did indeed provide better performance. The cost and risks of moving away from relational databases, as well as the relatively small sizes of the databases involved, kept it pretty much a relational world.