The Apache Cassandra NoSQL distributed data store continues to accumulate features that mimic those of traditional databases, with the newly released version 2.0 of the open source software offering triggers, lightweight transactions and an updated query language similar to SQL.
"A lot of our development is driven [to dissolve] the pain points that our users are feeling the most distress over," said Cassandra vice president Jonathan Ellis of the Cassandra 2.0 release. The project is managed under the Apache Software Foundation.
First created for Facebook, Cassandra is a NoSQL distributed data store that can easily handle large volumes of writes and reads, a quality that has won favor with both high-volume Internet services and firms running big data-style analysis.
Organizations like Adobe, CERN, Comcast, eBay, GoDaddy, Hewlett-Packard, IBM, Instagram, Netflix, and Sony all use the software.
Many of the changes in the new version offer capabilities long enjoyed by relational databases, capabilities that could make Cassandra suitable as a replacement for a traditional database, at least for some use cases.
Perhaps the most notable feature is support for lightweight transactions, which guarantee that any one data store operation isn't interrupted by any other operation. "We're the first eventually consistent database to implement lightweight transactions," Ellis said.
Long a feature of traditional SQL databases, lightweight transactions ensure that, for instance, two accounts with the same user name can't be created at the same time. The mechanism essentially locks data that is being read or updated by an operation, so that another operation can't change the data mid-transaction or read data that is about to be rendered outdated.
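The user-name case can be illustrated with a minimal Python sketch. It is a toy in-memory model, not Cassandra code: the `UserTable` class, its `insert_if_not_exists` method and the `race_to_register` helper are hypothetical names invented for this example, and a local lock stands in for the coordination that Cassandra 2.0 performs across replicas. In Cassandra itself the same intent is expressed in CQL with a conditional insert along the lines of `INSERT INTO users (username, email) VALUES ('alice', 'a@example.com') IF NOT EXISTS;`, where the table and column names are again assumptions.

```python
import threading


class UserTable:
    """Toy in-memory stand-in for a table keyed by user name (not Cassandra)."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def insert_if_not_exists(self, username, email):
        """Create the row only if the name is free.

        Returns True if the row was created, False if the name was taken.
        """
        # The existence check and the write happen as one indivisible step,
        # so two callers can never both see the name as free.
        with self._lock:
            if username in self._rows:
                return False
            self._rows[username] = {"email": email}
            return True


def race_to_register(table, username, emails):
    """Start one thread per email, all competing for the same user name."""
    results = []

    def attempt(email):
        results.append(table.insert_if_not_exists(username, email))

    threads = [threading.Thread(target=attempt, args=(e,)) for e in emails]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


table = UserTable()
outcomes = race_to_register(table, "alice", ["a@example.com", "b@example.com"])
```

Whichever thread runs first, exactly one of the two attempts succeeds and the other is told the name is taken, which is the guarantee the article describes.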
The Cassandra project team found that the lack of support for lightweight transactions in Cassandra had motivated some users to run two databases instead of one.
Such users had split off the most highly consulted portions of their relational databases to run under Cassandra for speedier performance. But they didn't migrate their entire databases to Cassandra because of the concerns around lock management. Others used an external locking mechanism such as Apache ZooKeeper, which brought a new set of complexities.
The new version of Cassandra also reintroduces an old database concept called triggers, a form of stored procedure. Two decades ago, triggers were used with traditional databases to centralize calculations in the database itself, in order to improve the consistency of results across the different applications that used the database.
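The general idea of a trigger can be sketched in a few lines of Python. This is a conceptual toy rather than Cassandra's trigger mechanism: the `Table` class, `add_trigger` method and `compute_total` function are hypothetical names used only for illustration.

```python
class Table:
    """Toy table that runs registered triggers on every insert."""

    def __init__(self):
        self.rows = {}
        self._triggers = []

    def add_trigger(self, fn):
        """Register a function to run on each row before it is stored."""
        self._triggers.append(fn)

    def insert(self, key, row):
        row = dict(row)  # copy so the caller's dict is not modified
        for fn in self._triggers:
            fn(row)  # a trigger may add or adjust columns before the write
        self.rows[key] = row


def compute_total(row):
    """The calculation lives with the data store, not in each application."""
    row["total"] = row["quantity"] * row["unit_price"]


orders = Table()
orders.add_trigger(compute_total)

# Two different "applications" insert rows without computing totals themselves.
orders.insert("order-1", {"quantity": 3, "unit_price": 5})
orders.insert("order-2", {"quantity": 2, "unit_price": 10})
```

Because the calculation is attached to the table, every application that writes an order gets the same `total` logic without reimplementing it.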