In an effort to combine the best of two database technologies, startup FoundationDB has launched a new data store that it claims can offer the reliability of transactional databases and the scalability and speed of NoSQL.
The data store, also called FoundationDB, is being marketed to organizations that want to consolidate their NoSQL databases into a single architecture.
"Everyone is trying to figure out what the next generation platform will be," said David Rosenthal, co-founder of the company, based in Vienna, Virginia. Many organizations now have a mishmash of NoSQL systems, such as Cassandra or MongoDB, to execute various data storage jobs not well handled by traditional SQL databases. "It becomes an operational issue, having all these different clusters of computers to manage," Rosenthal said.
After more than three years of development, the company has released a beta version of this database, which it says is ready for production use.
"People just coming into the NoSQL market will probably pick up MongoDB. But people who have been using those tools for several years, and have been burned by transaction and consistency issues, are looking for something with transactional integrity," said Rosenthal.
FoundationDB is not so much a database as a data storage engine, able to support multiple data storage models, Rosenthal said. The software stores data as simple key-value pairs, and offers a wide variety of data models, including models for storing graphs, documents, arrays, tables, and associative arrays.
FoundationDB doesn't offer the traditional SQL interface, but instead offers data access through C, Python, Ruby, Node.js and Java APIs.
"FoundationDB is a storage engine that can support [multiple] NoSQL data models. We can support a document data model to replace MongoDB, or support a key-value model to replace memcached, or support a graph model to replace Neo4j," Rosenthal said.
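To make the idea of layering a data model on key-value pairs concrete, here is a minimal sketch in Python of how a nested document could be flattened into key-value pairs and rebuilt again. It is an illustration only: a plain dictionary stands in for the store, and the names `flatten`, `put_document` and `get_document` are hypothetical, not part of FoundationDB's API.

```python
# Illustrative sketch only: a plain dict stands in for the key-value store.
# None of these function names come from FoundationDB's API.

def flatten(doc_id, doc, prefix=()):
    """Yield one (key, value) pair per leaf field of a nested dict.

    Each key is a tuple: the document id followed by the path to the field.
    """
    for field, value in doc.items():
        path = prefix + (field,)
        if isinstance(value, dict):
            yield from flatten(doc_id, value, path)
        else:
            yield (doc_id,) + path, value


def put_document(store, doc_id, doc):
    """Store a document as a set of simple key-value pairs."""
    for key, value in flatten(doc_id, doc):
        store[key] = value


def get_document(store, doc_id):
    """Rebuild a document from every key that starts with its id."""
    doc = {}
    for key in sorted(k for k in store if k[0] == doc_id):
        node = doc
        for field in key[1:-1]:
            node = node.setdefault(field, {})
        node[key[-1]] = store[key]
    return doc


store = {}
user = {"name": "Ada", "address": {"city": "Vienna", "state": "VA"}}
put_document(store, "user:1", user)
restored = get_document(store, "user:1")
```

In this toy version, empty nested objects are not preserved, and a single document spans several keys. That second point is why transactions matter for this kind of design: without them, a reader could see a half-written document.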
The secret to handling these different types of data models is transactional integrity, long thought to be impossible to achieve with distributed NoSQL-style databases.
Once you partition a database across multiple nodes, according to Eric Brewer's now famous CAP Theorem, that system can offer consistency (in which all nodes have the same data) or availability (in which the system always responds to incoming requests even when some nodes are not working), but not both.
NoSQL data stores have grown in popularity over the past few years for offering the ability to easily scale across multiple nodes, even if many only offer what is called eventual consistency, in which data is not immediately synchronized across multiple nodes. The downside is that this could lead to different responses to the same query within short periods of time.
FoundationDB, however, has found a way to offer both availability and consistency through an agreement algorithm called Paxos, which ensures that multiple copies of the data -- the database keeps three copies of all data it stores -- stay synchronized. Google engineers also used Paxos in their Spanner global database architecture, though Google's setup is different from FoundationDB's, Rosenthal said.
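The role of the three copies can be sketched with a simplified majority-quorum model in Python. This is not Paxos and not FoundationDB's implementation; it only shows the arithmetic that agreement protocols of this kind rely on, namely that any two majorities of three copies share at least one copy. The `Replica` class and the `write` and `read` functions are hypothetical names for illustration.

```python
# Simplified majority-quorum sketch. This is NOT a Paxos implementation and
# does not reflect FoundationDB's internals; all names are hypothetical.

class Replica:
    def __init__(self):
        self.up = True       # whether this copy is currently reachable
        self.version = 0     # increases with every accepted write
        self.value = None


def majority(n):
    """Smallest number of copies that is more than half of n."""
    return n // 2 + 1


def write(replicas, value):
    """Apply a write only if a majority of copies is reachable."""
    live = [r for r in replicas if r.up]
    if len(live) < majority(len(replicas)):
        return False  # refuse the write rather than let copies diverge
    version = max(r.version for r in live) + 1
    for r in live:
        r.version, r.value = version, value
    return True


def read(replicas):
    """Return the newest value seen among a majority of copies."""
    live = [r for r in replicas if r.up]
    if len(live) < majority(len(replicas)):
        raise RuntimeError("no majority reachable")
    return max(live, key=lambda r: r.version).value


replicas = [Replica() for _ in range(3)]
write(replicas, "a")
replicas[0].up = False            # one copy fails
write(replicas, "b")              # still accepted: 2 of 3 copies are up
replicas[0].up = True             # the failed copy returns with stale data
replicas[1].up = False            # a different copy fails
latest = read(replicas)           # the stale copy loses on version number
replicas[2].up = False            # now only one copy is reachable
accepted = write(replicas, "c")   # refused: no majority
```

In this toy model the store keeps answering, and keeps answering consistently, as long as any two of the three copies are reachable. Once only one copy is left, it stops accepting writes instead of risking conflicting data.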
"Transactions are fundamentally important for software engineering and for building solid abstractions," Rosenthal said.
The FoundationDB data store is designed to run across multiple servers. An average-size system might be a 24-node, 96-core system with 48 solid state drives (SSDs), capable of managing around 10 terabytes of data. Rosenthal said that, at least for this initial release, it is not feasible for the software to manage petabytes of data.
The software will not be available as open source, though the company will offer a no-cost community version. The full general release should be available by the end of the year. The software runs on Linux, OS X, and Windows, as well as on Amazon's Elastic Compute Cloud (EC2).
Database analyst Curt Monash, of Monash Research, has warned against data stores that have been designed to support multiple data models. "To date, nobody has ever discovered a data layout that is efficient for all usage patterns," he wrote in a recent blog post on the subject.
Nonetheless, FoundationDB is not the first attempt at combining features of traditional relational database management systems with those of NoSQL data stores. VoltDB, designed in part by database guru Michael Stonebraker, offers transactional capabilities in an in-memory database package, an approach that, according to that company, can rival NoSQL speeds. Oracle's latest version of its MySQL open source database offers a memcached-based NoSQL API for faster data access.