DataStax primps Cassandra for developers

Eyeing Oracle, CEO Billy Bosworth says DataStax Enterprise 3.1 eliminates some of the complexity that's deterred holdouts

Since its inception in 2010, DataStax has managed to convert an impressive lineup of enterprise customers -- including Adobe, eBay, Netflix, and Sony -- to its DataStax Enterprise database. While fans of the nonrelational database platform praise its high level of availability, flexibility, and scalability, holdouts have been put off by Cassandra's inherent complexity.

That's why the company has concentrated its effort on making the newly released version 3.1 of DataStax Enterprise more user-friendly and, thus, more appealing to developers, according to company CEO Billy Bosworth. Bosworth -- who spent a couple of decades neck-deep in relational databases before joining DataStax -- said Cassandra's been fairly criticized as "a little challenging to use."

"It's not going to be the most intuitive piece of software you ever started working with," he told IDG Enterprise Chief Content Office John Gallant in an interview. "When I first looked at Cassandra in 2010, I had that same feeling. 'Whoa, this thing is bit challenging to get your arms around.'"

DataStax Enterprise, for the uninitiated, is a database that integrates Apache Cassandra for handling real-time data; Apache Hadoop for batch analytics; Apache Solr for search; and OpsCenter for visual monitoring and management. Atop all of that is a security overlay.

DataStax first unveiled DataStax Enterprise in 2011 and has seen significant adoption, according to Bosworth. "Our customers are the application kings who are building these online, mission-critical systems ... but they have a very high demand around this data to be hot. It has to be transactional, ready, fast, very highly performance -- but they are going to want to ask different things of that data," said Bosworth. "They are going to want to ask some analytical questions. They are going to want to do some very deep searching, which requires indexing [and] is very taxing on a system."

DataStax Enterprise was designed to fit that bill, Bosworth said. "We combine those [open source technologies] into a single system ... and now, you don't have to worry about any ETL. You don't have to move your data anywhere. Our mission for the application team is to keep your hot data hot and to let you do different things with it -- but to never impact performance and increase operational complexity."

Focused on developers

Over the past year, the company has turned its focus to meeting the needs of developers, Bosworth said. Among the new version's improvements, the company has refined CQL3 (Cassandra Query Language) to make it more familiar for developers who work with relational databases. "If you know SQL, you're going to be very, very comfortable as soon as you look at CQL, because it's a subset of [SQL]," Bosworth said. "It offers developers one standard way to be able to interact with the database."
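To give a rough sense of that familiarity, here is a minimal sketch in Python. The `users` table, its columns, and the sample row are hypothetical, chosen so that each statement should be valid in both CQL3 and ordinary SQL. The sketch runs them against an in-memory SQLite database purely as a relational stand-in; it does not connect to Cassandra, where the same statements would also need a keyspace to be selected first.

```python
import sqlite3

# Hypothetical statements, written to be valid in both CQL3 and standard SQL.
# In Cassandra they would be run inside a keyspace (an assumption not shown here).
STATEMENTS = [
    "CREATE TABLE users (user_id int PRIMARY KEY, name text, email text)",
    "INSERT INTO users (user_id, name, email) "
    "VALUES (1, 'Ada', 'ada@example.com')",
]
QUERY = "SELECT name, email FROM users WHERE user_id = 1"


def run_in_sqlite():
    """Run the statements in an in-memory SQLite database (not Cassandra)."""
    conn = sqlite3.connect(":memory:")
    try:
        for stmt in STATEMENTS:
            conn.execute(stmt)
        # Look the row up by primary key, the access pattern CQL favors.
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_in_sqlite())
```

The overlap is limited to simple statements like these: a lookup by primary key carries over directly, while relational staples such as joins do not, which is why the data model needs rethinking.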

DataStax has introduced its own set of drivers (for .NET, Java, and so on), which it also develops and maintains in-house. "We assure you [these drivers] will work and work properly," Bosworth said. "We've had these drivers scattered around the open source community, and no one is sure which is the right version."

DataStax Enterprise 3.1 yields superior scalability, according to the company: It lets users manage up to 10 times as much Cassandra data per node in some use cases. What's more, version 3.1 integrates with Apache Solr 4.3, which offers new features for speedier search performance, new memory caches, and monitoring functionality, as well as greater reliability, according to the company.

DataStax also says it has simplified manageability, introducing virtual nodes and parallel operations such that users can increase capacity and perform maintenance operations more quickly. In addition, the company has introduced tracing features for deeper visibility into response times and other database operations.

The Oracle obstacle

DataStax has its work cut out as it takes on what may be its chief enterprise rival, Oracle. "We do see right now a lot of discussion and debate at multiple levels in the organization around the Oracle footprint that already exists," said Bosworth.

By Bosworth's reckoning, Oracle's database platform isn't sufficiently scalable for the cloud world: "When you start walking in and implementing these systems where you're seeing [data] in real time ... they're taking those Oracle apps and starting to say, 'These are hitting a wall. We are not interested necessarily in the Exagrid strategy that Oracle is proposing. We want to look at something new and different.'"

At a higher level, Bosworth had one piece of advice for those moving from the relational database world into the cloud-centric, mobile-intensive nonrelational world: Change the way you think about your data model.

"You have to change how you think about blueprinting your data inside a database. If you don't get that right, everything gets exponentially harder," Bosworth said. "Once you get the data model right, you are going to find that you are not going to be intimidated when you start to look at how you can access and interface the database. We're making that easier and easier for relational-minded people -- but you do have to put in the legwork around the data model itself."

This article, "DataStax primps Cassandra for developers," was originally published at InfoWorld.com.

Copyright © 2013 IDG Communications, Inc.