CockroachDB may sound like a joke project, but its creators are on a serious mission: Create a distributed SQL database that scales out well, survives most mishaps (much as its namesake is said to survive nearly anything), and makes it easy to write applications.
Originally announced back in 2014, CockroachDB is now available in its first public beta. The company behind it, Cockroach Labs, is clearly hoping CockroachDB's design will appeal to creators of database-driven apps.
Data that never dies
In a talk given at CoreOS Fest 2015, Cockroach Labs CEO Spencer Kimball described the key principles behind CockroachDB. Because of the growing amount of data collected by modern database-driven applications, he said, databases need to be able to scale horizontally by default without tweaking or tuning.
Likewise, Kimball noted, databases need to be able to recover as automatically as possible from disasters. Applications built with those databases shouldn't have to make accommodations when data is rebalanced across nodes or when a data center melts down.
Consistency is another key element long thought difficult to achieve with distributed databases. Kimball wanted CockroachDB to provide "one truth, everywhere" -- to be a provider of strong consistency (via the Raft consensus algorithm) rather than the eventual consistency commonly associated with distributed systems like Cassandra or Couchbase.
This issue was deemed so important that Cockroach Labs included SQL support in the CockroachDB beta, though it meant delaying its release for six months. SQL is widely understood and leveraged, and those who use SQL expect strong consistency from the systems they connect to.
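To make the contrast with eventual consistency a little more concrete, here is a minimal Python sketch of the majority-acknowledgement rule that Raft-style replication relies on. It is an illustration only, not CockroachDB's code, and the function names are hypothetical.

```python
def majority(n_replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return n_replicas // 2 + 1


def is_committed(acks: int, n_replicas: int) -> bool:
    """A write counts as committed once a majority of replicas has acknowledged it."""
    return acks >= majority(n_replicas)


def tolerated_failures(n_replicas: int) -> int:
    """How many replicas can be unavailable while a majority is still reachable."""
    return n_replicas - majority(n_replicas)
```

Because any two majorities of the same replica group share at least one member, a committed write cannot be silently missed by a later majority. That overlap is the intuition behind "one truth, everywhere." Under this rule, a group of five replicas can keep accepting writes with two of them down.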
A sense of purpose
In a phone interview earlier this week, Kimball noted that the delay not only provided time to figure out how to make SQL work well with CockroachDB, it also helped potential users better understand what the application was intended to be.
"We were going to originally launch with what was called a 'key/value' interface," Kimball said. "[But] that's not that useful for application developers. ... It would have caused this long, lingering confusion about whether CockroachDB was this transactional data store or 'Oh, I've heard those guys are going to do SQL eventually, so which one is it? How should we use it?'"
The initial beta of CockroachDB will not, however, be ANSI SQL compliant. Joins, for instance, are not yet supported, although there are plans to add them in the 1.0 release. Likewise, while distributed transactions (such as writes) are supported, distributed queries -- required by analytic workloads for high parallelism -- aren't part of the package yet.
Kimball believes these items don't have to be part of the initial release for CockroachDB to find an audience. "We are trying to appeal to the very long tail of developers," he said. "We're not specifically looking for the big, obvious use cases; we expect companies to adopt it down the road.
"As an open source product, and as an OLTP database, you have this chicken-and-egg problem," Kimball said. "It's hard for someone to trust an OLTP database until you have a certain level of maturity and a certain amount of adoption that indicates the credibility of the product."