First look: Couchbase’s new SQL for NoSQL

Couchbase Server 4.0 addresses NoSQL’s biggest pain point with SQL-like query language for its document datastore


Couchbase might seem like a bit of an outsider in the world of NoSQL datastores. After all, MongoDB grabs most of the limelight, while Cassandra and HBase have sewn up most of the big data world, and Redis has pretty much supplanted Memcached as the key/value cache that people reach for by default. But Couchbase has not been sitting on the sidelines looking in. You might not know it from Hacker News, but the use of Couchbase Server has been growing steadily for the past couple of years.

More to the point, the new, recently released Couchbase Server 4.0 has some features that will continue to improve its standing in the enterprise world. Its most important introduction, a SQL-like query language called N1QL, might even get Couchbase noticed by the technology hipsters.

The company describes N1QL as “the complete flexibility of JSON and the full power of SQL.” I wouldn’t go that far, since only a subset of SQL is supported, but it brings Couchbase development within reach of DBAs and business analysts, who have often been left behind in the NoSQL world.

One of the downsides that has plagued Couchbase Server for a few years now has been the need to write dedicated map/reduce views in JavaScript for any sort of query operation. These have been operationally problematic (a production view cannot be edited once deployed, only copied back to development and resubmitted) and present a relatively large hurdle for newcomers to the product.

The DBA’s revenge

You’d be forgiven at this point for thinking that N1QL is high-quality trolling from Couchbase. A provider of a NoSQL datastore has implemented a new query language that is … SQL?

But SQL didn’t become the standard for data reporting and processing by accident; the relational algebra approach to modeling, reasoning, and interacting with data has a firm theoretical and practical basis, and SQL has evolved over the decades to be a useful tool known by countless thousands of developers across the globe. I’m all for stopping the practice of turning a JavaScript API into a pidgin-SQL DSL and embracing SQL properly, even if it sounds somewhat linguistically twisted when a NoSQL CEO says his company has introduced “SQL for NoSQL.”

Of course, N1QL isn’t quite SQL. It targets SQL ’92 as a basis (meaning that there’s no FILTER or other SQL 2003 goodies), but expands upon it to allow exceptional querying of Couchbase’s JSON-based document store. Here’s how you would write a simple query in N1QL to get the list of names of customers from the United Kingdom from a document store named customer:

SELECT name FROM customer WHERE country = 'UK'

You can access children of the document using the . operator. N1QL also provides a standard set of array operations. For example:

SELECT children[0].name FROM parents

This query would return the first child’s name from every document in parents. And there’s more! Children of a document can be unnested and joined with the parent object, with the UNNEST keyword:

SELECT p FROM product p UNNEST p.categories AS category WHERE category = "White Goods"
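To make the UNNEST behavior concrete, here is a rough Python sketch of what the query above computes, run over in-memory dictionaries rather than a real Couchbase bucket. The sample documents and the unnest_filter helper are hypothetical, invented for illustration, and the sketch assumes the default inner-join behavior, in which a document with no categories array contributes no rows.

```python
# Hypothetical sample documents standing in for the "product" store.
products = [
    {"name": "Fridge", "categories": ["White Goods", "Kitchen"]},
    {"name": "Laptop", "categories": ["Computing"]},
    {"name": "Poster"},  # no categories field at all
]


def unnest_filter(docs, field, wanted):
    """Pair each document with every element of its array, then keep matches."""
    results = []
    for doc in docs:
        # Documents without the array yield no pairs (assumed inner-join behavior).
        for element in doc.get(field, []):
            if element == wanted:
                # Mirrors "SELECT p": each row carries the whole parent document.
                results.append({"p": doc})
    return results


rows = unnest_filter(products, "categories", "White Goods")
print([row["p"]["name"] for row in rows])
```

The point of the sketch is that the array is flattened into one row per element before the WHERE clause runs, which is why the filter can compare a single category value rather than the whole array.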

Plus, N1QL contains the usual aggregate and numeric functions (COUNT, AVG, ROUND, and so on) that you’d find in your favorite SQL implementation. And behind the scenes, N1QL acts as you’d expect, sporting a query analyzer that takes your query and provides the most efficient query plan that will operate across your Couchbase Server deployment to return the result. Yes, you can use EXPLAIN to get the analyzer to show its work.
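As an illustration of the aggregation side, a grouped query along the lines of SELECT country, COUNT(*), AVG(spend) FROM customer GROUP BY country boils down to the following Python sketch. The spend field, the sample documents, and the count_and_average helper are all assumptions made up for this example; nothing here touches an actual Couchbase deployment.

```python
from collections import defaultdict

# Hypothetical sample documents standing in for the "customer" store.
customers = [
    {"name": "Alice", "country": "UK", "spend": 120.0},
    {"name": "Bob", "country": "UK", "spend": 80.0},
    {"name": "Chloe", "country": "FR", "spend": 50.0},
]


def count_and_average(docs, group_field, value_field):
    """Group documents by group_field, then compute a count and mean per group."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc[value_field])
    return {
        key: {"count": len(values), "avg": sum(values) / len(values)}
        for key, values in groups.items()
    }


summary = count_and_average(customers, "country", "spend")
print(summary)
```

In the database, the query planner decides how to spread this work across the cluster; the sketch only shows the result a caller should expect back.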

The ability to leverage SQL with JSON and a schema-less datastore is going to be terrific. It should also help offset one of the big problems with schema-less stores, which is that you eventually end up baking a schema into your application code. This is in some respects even more limiting than having a defined schema, as it’s spread around your code in bits and pieces. (Indeed, it was announced this week that schemas are coming to the poster child of NoSQL, MongoDB; it’s not only Couchbase starting to look toward the world of RDBMS as enterprise support becomes more and more important.)

But what happens to views? According to Couchbase CEO Bob Wiederhold, they’re not going away. The company will continue to improve views in upcoming versions. However, although Wiederhold says views will still have a place for writing incremental map and reduce jobs in JavaScript, I imagine we’ll see them slowly fading away as N1QL becomes the standard for interacting with Couchbase documents.

There are some caveats and concerns with N1QL. Performance has been an issue in the developer previews, and some users, including John De Goes (CTO of SlamData), think that N1QL has been designed to solve a limited number of use cases. As De Goes puts it, “If your use case is not one of those top 10 use cases that they thought to build into the syntax of their SQL dialect, then you can’t solve the problem.” Indeed, in the initial implementation, standard SQL features such as transactions are not supported. But they are on Couchbase’s road map, so we’ll see how this pans out in the months ahead.

Smarter scaling

Meanwhile, from an operational point of view, another innovation that Couchbase Server 4.0 brings is multidimensional scaling, which appeals to the system administrator in me. It allows individual scaling of services such as querying, indexing, and data storage to improve performance while making efficient use of available resources. While the traditional method for scaling NoSQL datastores is adding an identical new node for performance improvements, Couchbase allows you to scale horizontally in a more efficient manner.

Need faster query responses? Then add nodes with more powerful CPUs than the rest of the cluster and let them take the strain. Need faster indexing? Then add nodes outfitted with SSDs for indexing at lightning speed while your data nodes remain on cheaper, traditional hard disks. The latter option gets a boost in Couchbase Server 4.0 with ForestDB, a new homegrown storage engine specifically designed to get the most out of SSD storage while also extending the drives’ lifespan. This scaling can happen in an elastic manner, and perhaps most important, the deployments can be isolated, so your services will not suffer from resource contention.

This ability to customize deployments to squeeze out the best performance and maximize the use of available hardware is going to be a godsend for the larger-scale deployments where certain aspects of the service need to be more performant than others. But for all deployments, you no longer need to carry around the baggage of extra services when you don’t need them. That said, standard horizontal scaling is still there if you want it. 

This looks like the year in which NoSQL gets enterprise religion and grows up, and Couchbase Server 4.0 is one to keep an eye on. Embracing SQL with N1QL allows it to draw upon all of the DBAs and business analysts still out there in enterprise silos across the world, making them productive almost as soon as they’re first introduced to it, as opposed to making them learn specific DSLs. You can see this in the big data world as well, with Spark and Flink beginning to expose higher-level concepts to the user.

Integration has also been a theme with NoSQL databases this year, and Couchbase is no different. Everyone is pointing to operations as the next great frontier. It will be interesting to see how the NoSQL space progresses with everyone playing in SQL land and everyone trying to introduce a better management console.

Copyright © 2015 IDG Communications, Inc.
