NoSQL standouts: New databases for new applications
Cassandra, CouchDB, MongoDB, Redis, Riak, Neo4j, and FlockDB reinvent the data store
Follow @peterwayner
Was it just two or three years ago that choosing a database was easy? Those with a Cadillac budget bought Oracle, those in a Microsoft shop installed SQL Server, and those with no budget chose MySQL. Everyone in between tried to figure out where they belonged.
Those days are gone forever. Everyone and his brother is coming out with an open source project for storing information. In most cases, these projects are tossing aside many of the belts-and-suspenders protections that people expect from the classic databases. There are enough of them now that some joker started calling them NoSQL and claiming, perhaps tongue-in-cheek, that the name stood for Not Only SQL.
Everywhere you look, there are new NoSQL databases -- or "data stores," if you're one of those who feel that the word "database" can only be used by proper relational software offering belts-and-suspenders compliance with the ACID rules. Some of these new databases are quite sophisticated, while others are deliberately bare bones. But all are intended to deliver high performance by trading away the power of a relational database. Let the banks and their nervous programmers worry that Aunt Millie's pension check is deposited correctly, they say. You can't get your kicks if you're constantly checking everything in triplicate.
In most cases, the NoSQL rebels have succeeded in building something that's blazingly fast and fairly scalable, but only by abandoning traditional crutches. Old-school DBAs are shaking their heads and chuckling through the presentations because they're sure the whippersnappers are going to stumble over the problems the veterans have already fixed. But the whippersnappers don't care because they have different project needs in mind. They're aiming at new targets.
What is surprising is how different the NoSQL projects are turning out to be. Whereas the old relational space largely converged on a set of features and a standard language, these new databases are all built by people going in their own direction. The packages may take basic pairs of keys and values, but they're tuned for different use cases. The major variations aren't in the format of the data but in how often it's replicated, cached, and sharded.
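To make that last point concrete, here is a minimal Python sketch of a key-value store in which the data format stays trivial while sharding and replication are the tuning knobs. Everything in it is an assumption made for illustration: the `ToyStore` class, its `shards` and `replicas` parameters, and the hash-then-walk-the-ring placement are hypothetical and are not taken from any of the projects named here.

```python
import hashlib


class ToyStore:
    """Hypothetical in-memory key-value store; not any real product's API."""

    def __init__(self, shards=4, replicas=2):
        if not 1 <= replicas <= shards:
            raise ValueError("replicas must be between 1 and shards")
        self.replicas = replicas
        # Each shard is just a dict standing in for a separate server.
        self.shards = [{} for _ in range(shards)]

    def _owners(self, key):
        # Hash the key to pick a first shard, then take the following
        # shards in ring order to hold the extra copies.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        first = int.from_bytes(digest[:8], "big") % len(self.shards)
        return [(first + i) % len(self.shards) for i in range(self.replicas)]

    def put(self, key, value):
        # Write the value to every shard that owns a copy of this key.
        for index in self._owners(key):
            self.shards[index][key] = value

    def get(self, key, failed=()):
        # Read from the first owning shard that is not marked as failed.
        for index in self._owners(key):
            if index not in failed:
                return self.shards[index].get(key)
        raise KeyError(f"no live replica for {key!r}")


store = ToyStore(shards=4, replicas=2)
store.put("alice", "alice@example.com")
print(store.get("alice"))
```

Raising `replicas` buys tolerance for a lost shard at the cost of extra writes, and raising `shards` spreads the keys more thinly. Those are the kinds of trade-offs on which the real projects differ, though each makes them in its own, far more sophisticated way.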
For instance, do you store data that is often retrieved, such as a person's email address? Or is the data squirreled away for a rainy day, as with log files? Do you expect many users with a small amount of data or just a few users with large volumes of data? Can your users survive if you lose one of their rows of data, or will they start to sue?
There's a different project for each answer. In the past, each architect would tweak the configuration of MySQL or Oracle in a different way. Now, the architect chooses a completely novel project.
There are great advantages to this Babelization if the needs of your project fit the abilities of one of the new databases. If they line up well, the performance boosts can be incredible because the project developers aren't striving to build one Dreadnought to solve every problem.
The experimentation is also fun because the designers don't feel compelled to make sure their data store is a drop-in replacement that speaks SQL like a native. They're coming up with new query languages and making different decisions about storing things like binary data. It has all the buzz of innovation.