I'm at the Cloud Connect 2010 conference in Santa Clara, Calif., one of the first major gatherings of the year on cloud computing. One of the larger topics to come up thus far is moving away from relational databases for data persistence. Called the "NoSQL" movement, it is about leveraging more efficient databases that are perhaps able to handle larger data sets more effectively. I've already written about the "big data" efforts that are emerging around cloud, but this is a more fundamental movement to drive data back to more primitive, but perhaps more efficient, models and physical storage approaches.
NoSQL systems typically work with data in memory or load chunks of data from many disks in parallel. The issue is that "traditional" relational databases don't provide the same models and, thus, the same performance. While this was fine in the days of databases with a few gigabytes of data, many cloud computing databases are blowing past a terabyte, and we'll see huge databases supporting cloud-based systems going forward. Relational databases are contraindicated for operations on large data sets because SQL queries tend to consume many CPU cycles and thrash the disk as they process data.
If you think we've heard this song before, you are correct. Object and XML databases made some inroads back in the 1990s, but many enterprises kept relational databases such as Oracle, Sybase, and Informix around, despite the fact that many nonrelational databases did indeed provide better performance. The cost and risks of moving away from relational databases, as well as the relatively small sizes of the databases involved, kept it pretty much a relational world.
However, the cloud changes everything. The requirement to process huge amounts of data in the cloud is leading to new approaches to database processing, based on older models. MapReduce, the fundamental way Hadoop processes data, is based on the "shared-nothing" database processing model from years ago. The difference is that now we have the processing power, the disk space, and the bandwidth to make it work.
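To make the MapReduce idea a bit more concrete, here is a minimal sketch in Python. It is not Hadoop's API. The function names (`map_partition`, `shuffle`, `reduce_counts`, `word_count`) and the sample data are made up for illustration, and a local thread pool stands in for what would be separate machines in a real cluster. The point is only the shape of the model: each partition is processed on its own with no shared state, and the partial results are combined afterward.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_partition(lines):
    """Map step: sees only its own partition and emits (word, 1) pairs."""
    return [(word.lower(), 1) for line in lines for word in line.split()]


def shuffle(mapped_partitions):
    """Group the emitted values by key across all partitions."""
    groups = defaultdict(list)
    for pairs in mapped_partitions:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: collapse each key's list of values into one total."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(partitions, workers=4):
    # Threads are an assumption standing in for independent nodes;
    # no partition reads or writes another partition's data.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_partition, partitions))
    return reduce_counts(shuffle(mapped))


# Hypothetical data: two partitions, as if stored on two different disks.
partitions = [
    ["the cloud changes everything"],
    ["the cloud needs big data", "big data needs the cloud"],
]
counts = word_count(partitions)
print(counts["cloud"], counts["big"])
```

Because the map step never touches data outside its own partition, adding more partitions and more workers is, at least in principle, how this kind of processing scales out instead of leaning on a single database server.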
I suspect that the movement to cloud computing will indeed reduce the use of relational databases. It's nothing we haven't heard of before, but this time we have a true need.
This article, "SQL and relational databases: They're not right for the cloud," was originally published at InfoWorld.com. Read more of David Linthicum's Cloud Computing blog and follow the latest developments in cloud computing at InfoWorld.com.