CouchDB 2.0 counters MongoDB with improved scaling

The new version of the CouchDB NoSQL database adds open-sourced contributions from Cloudant, including clustering


CouchDB, the NoSQL database system from the Apache Software Foundation, is getting more than a change to the left of the decimal point with its new 2.0 release, now available as a developer preview. The project also received major code donations from Cloudant, whose work on improving CouchDB will now be donated directly back to the project rather than reserved for paying customers.

To take on the criticism that it doesn't run well at scale, CouchDB has added a new clustering technology, inspired by Amazon's Dynamo key-value store system, that allows for horizontal scaling. Jan Lehnardt, VP of Apache CouchDB, described in a phone call how the choice of Dynamo brought CouchDB "more in line with other industry standards," since other vendors' database systems also implement similar ideas, proven in the field, for their own clustering layers.
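For readers unfamiliar with the Dynamo approach, the core idea can be sketched in a few lines of Python. This is a simplified illustration, not CouchDB's actual implementation: the function names, the use of CRC32, the node names, and the shard and replica counts are all assumptions chosen for the example. Each document ID is hashed to one of a fixed number of shards, each shard is copied to several nodes, and a read or write counts as successful once a majority of those copies respond.

```python
import zlib


def shard_for(doc_id: str, shard_count: int) -> int:
    """Map a document ID to a shard by hashing it (CRC32 is an assumption here)."""
    return zlib.crc32(doc_id.encode("utf-8")) % shard_count


def replicas_for(shard: int, nodes: list[str], copies: int) -> list[str]:
    """Pick `copies` distinct nodes for a shard by walking the node list as a ring."""
    copies = min(copies, len(nodes))
    start = shard % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


def quorum(copies: int) -> int:
    """Smallest majority of the replicas: the number that must acknowledge a request."""
    return copies // 2 + 1


# Hypothetical four-node cluster with 8 shards and 3 copies of each shard.
NODES = ["node-a", "node-b", "node-c", "node-d"]
SHARDS = 8
COPIES = 3

if __name__ == "__main__":
    for doc_id in ("user:alice", "user:bob", "order:1001"):
        shard = shard_for(doc_id, SHARDS)
        owners = replicas_for(shard, NODES, COPIES)
        print(f"{doc_id} -> shard {shard} on {owners}, quorum {quorum(COPIES)}")
```

Because placement depends only on the document ID and the cluster layout, any node can work out where a document lives without consulting a central coordinator, which is what makes adding machines a practical way to scale.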

"CouchDB has been predominantly a single-server database," he said, "but it has always been designed with the idea that it could be run in a clustered environment -- that it could manage hundreds of machines. [The Apache Foundation] never actually went that far, but then Cloudant went ahead and built it all and were kind enough to donate all their work back to the Apache Software Foundation."

Lehnardt noted that the push to donate the work back to the community had been prompted by IBM's acquisition of Cloudant. IBM already had experience dealing with the Apache Software Foundation, he said, and while work on the new clustering mechanisms had started years earlier at Cloudant, IBM was convinced that open-sourcing the project would be best for all involved.

CouchDB's long road to clustering can be partially traced to conscious design decisions and philosophical choices made by CouchDB's creators. As Lehnardt explained, "CouchDB has always said no to features that we know couldn't be scalable in a cluster or even doable in a cluster. This puts us in a position to migrate upward seamlessly." That is, existing users of CouchDB do not need to rework their applications or data to take advantage of the new clustering features.

He contrasted this with other databases, where some features work fine in a single instance but are limited when used in a cluster. "We kind of opted to make the users not too happy [with these original decisions]," he said, "but we knew that when we would grow up into a clustered system, they wouldn't have to change anything there."

The clustering mechanism isn't the only Cloudant project open-sourced for the sake of CouchDB 2.0. Another is Cloudant Query, a simplified way to query CouchDB via JSON and HTTP endpoints that's meant to echo the behavior of rival products like MongoDB. That may be part of an attempt -- on Cloudant's and IBM's part as much as Apache's -- to wrest market share and mind share away from the well-entrenched MongoDB.
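To give a sense of what a JSON-over-HTTP query of this kind looks like, here is a minimal Python sketch. It assumes the `_find` endpoint and the MongoDB-style `selector` syntax as documented for Cloudant Query; the developer preview may differ. The server address, the `movies` database, the field names, and the `build_find_request` helper are hypothetical. The code only builds the request and does not send it.

```python
import json
from urllib.request import Request


def build_find_request(base_url, db, selector, fields=None, limit=25):
    """Build (but do not send) a POST request carrying a JSON query body."""
    body = {"selector": selector, "limit": limit}
    if fields:
        body["fields"] = fields  # restrict which document fields are returned
    return Request(
        url=f"{base_url}/{db}/_find",  # endpoint name assumed from Cloudant Query docs
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical query: movies released after 2010, returning three fields.
req = build_find_request(
    "http://localhost:5984",
    "movies",
    {"year": {"$gt": 2010}},
    fields=["_id", "title", "year"],
)

if __name__ == "__main__":
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it to a running server. The `$gt` operator in the selector is the kind of declarative condition MongoDB users will recognize, which is the resemblance the feature is aiming for.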

Back when Couchbase 2.0 was released, InfoWorld's Andrew Oliver noted how that loosely CouchDB-based product "handles concurrency better [than MongoDB], and while actual high-end usage will tell, [Couchbase] architecturally looks like a more scalable model." Once CouchDB 2.0 -- and its derivative products -- are in the hands of more users, it'll finally be possible to put that thesis to the test.

Copyright © 2014 IDG Communications, Inc.