Structured data stored in relational databases has ruled the world for the last 40 years. Over that time, Structured Query Language (SQL) emerged as the standard for accessing and manipulating data stored in relational database management systems. The main reason for the popularity of SQL was the ease of programming it provided by encapsulating and abstracting how data is stored, thereby removing a step in the process and allowing developers to focus on what they wanted done. Thus, SQL drove the adoption of relational databases to near ubiquity.
However, we’ve started to hit the limitations of what relational systems can do. Data no longer follows a uniform structure. With the digitization of communication and commerce, we now have social data, scientific data, IoT data, blogs, tweets, and other data that do not fit the relational structure. Additionally, today’s businesses demand agility and rapid changes to applications, which means frequent changes to the schema of the data.
The rigid schema requirements of relational databases are a roadblock to releasing fast, scalable, and responsive applications. Developers and enterprises are increasingly expected to bring products to market faster and cheaper. The need for dynamic schema evolution demands not only a rethink of data models and databases, but also a new method to access this data -- a query language.
A new data model
Before we get to the query language, let’s first examine the data model. Since the early '90s, most business applications were developed in object-oriented programming models. The popularity of graphical user interfaces and subsequently the Web made this type of programming the norm for developing business and customer-facing applications. In Web applications specifically, JSON is the open standard format that uses human-readable text to represent data in objects. JSON is what gets transmitted between server and Web applications. NoSQL databases designed to store and manage JSON documents started gaining popularity in the early 2000s thanks to the increase of unstructured data.
More and more businesses are adopting NoSQL databases to support the broad set of use cases for the next-generation of personalized, context-sensitive, and location-aware applications. Modern developers love the flexibility of a JSON database because it represents data in the same object-based way as their preferred languages (Java, C++, .Net, Python, Ruby, and so on) without the rigid schema requirements of a relational database. However, developers building on a NoSQL database with a JSON data model have been limited by the lack of a query language.
As a result, the emergence of JSON-based NoSQL databases without a standard rich query language forced programmers into a dilemma: Either leverage the power of standard SQL, but be constrained by a rigid relational model, or develop on a flexible JSON data model, but accept data model and query limitations.
While the benefits of using JSON are clear, a standard query language making it easier for developers to build applications is also important. To give JSON the query language it deserves, a clear starting point was to simply extend the most popular query language, SQL.
A new query language
N1QL (pronounced "nickel") is a new query language that extends SQL to work on JSON documents. Put in more technical terms, JSON represents a non first normal form data model (N1NF), and N1QL operates on that data model. N1QL extends SQL from traditionally operating on tables and rows (tuples) to operate on JSON (nested documents). It’s built on a nested recursive algebra for a nested recursive data model.
Extending SQL to JSON is like reinventing the gas-powered car by giving it an electric engine, but not changing the steering wheel or any of the mechanisms that affect how you operate the car. Developers can now dynamically extend an application’s schema (handled at runtime by the new query engine), while still using the same familiar SQL language to operate it.
The image below shows how you would write a JOIN in SQL and how you would write a JOIN in N1QL. This is a very simple example. To learn more about what can be done with N1QL visit the Couchbase N1QL tutorial.
Building a new language as an extension to SQL provides the advantage of using the same vocabulary and syntax as SQL. Thus, for the first time, developers can perform complex references using JOINs in NoSQL document databases. In addition, all the standard SQL language elements such as statements, clauses, expressions, predicates, operators, aggregation, and ordering remain the same.
However, because the underlying data model is different and schema evolution is dynamic at runtime, some additions needed to be made to extend SQL’s power to JSON. N1QL adds the verbs
UNNEST for composing and flattening nested objects, a family of operators (such as
IS MISSING) for handling dynamic schema, and
ARRAY functions and operators for traversal, filtering, and recursive processing of array elements.
For the next phase of application development, developers need new query languages that are architected for schema flexibility and schema evolution. By pairing the flexibility of an innovative data model with the power of a query language known by millions of developers and business analysts, we lay the foundation for building much more flexible and powerful applications.
JSON and distributed document-oriented databases are here to stay. Let’s make the most of them.
Ravi Mayuram is senior vice president of products and engineering at Couchbase.
New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to firstname.lastname@example.org.