CrateDB 2.0 Enterprise stresses security and monitoring—and open source

The open source database for processing high-speed freeform data with SQL queries now has enterprise features, available as open source for faster developer uptake

When the open source SQL database CrateDB debuted, its professed mission was to deliver easy, fast analytics on reams of machine-generated data while running in containerized, cloud-native environments.

That mission hasn't changed with the release of version 2.0, but it has been expanded by way of an enterprise edition with pro-level features. Rather than distribute the enterprise edition as a closed-source, binary blob, the maker of CrateDB is offering it as open source to help speed uptake and participation.

SQL, not slow-QL

CrateDB is designed to ingest high-volume, machine-generated data, whether logs from a fleet of servers or sensor data from IoT devices, and make that data accessible through traditional SQL queries. The data may be structured or unstructured; it can be a conventional table, or a freeform JSON document.
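To make the idea concrete, here is a minimal sketch of querying freeform sensor documents with plain SQL. It does not use CrateDB: Python's built-in SQLite module stands in so the example runs anywhere, and the table name `readings`, the helper `json_field`, and the sample payloads are all hypothetical. CrateDB's own syntax for storing and addressing JSON-like data differs.

```python
import json
import sqlite3

# Assumption: SQLite is only a stand-in here so the sketch is self-contained.
# It is not CrateDB, and CrateDB's handling of JSON documents differs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (device TEXT, ts INTEGER, payload TEXT)")


def json_field(doc, key):
    """Return one top-level field of a JSON document, or None if it is absent."""
    return json.loads(doc).get(key)


# Expose the hypothetical helper to SQL so queries can reach into the payload.
conn.create_function("json_field", 2, json_field)

# Each device reports a slightly different set of fields (freeform data).
samples = [
    ("sensor-a", 1, {"temp_c": 21.5, "humidity": 40}),
    ("sensor-a", 2, {"temp_c": 22.5}),
    ("sensor-b", 1, {"temp_c": 18.0, "battery": "low"}),
]
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [(device, ts, json.dumps(payload)) for device, ts, payload in samples],
)

# An ordinary SQL aggregate over a field that lives inside the documents.
avg_temps = conn.execute(
    """
    SELECT device, AVG(json_field(payload, 'temp_c'))
    FROM readings
    GROUP BY device
    ORDER BY device
    """
).fetchall()

print(avg_temps)
```

The point of the sketch is only the shape of the workflow: documents with varying fields go in, and a familiar SELECT with GROUP BY comes out the other side.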

Many of the pieces in CrateDB are familiar open source success stories -- the Elastic stack, the Lucene indexing engine, and Facebook's Presto distributed SQL query engine. A shared-nothing architecture allows the database to scale horizontally, and the Docker-based setup process makes adding nodes as easy as spinning up additional containers.

Version 2.0 extends the existing feature set incrementally, not dramatically. Queries across clusters now work faster for aggregations and GROUP BY operations, and SQL operations gain a number of improvements, like the ability to perform aggregations on JOINs. A new index structure speeds up queries on geospatial data and IP address data, both of which frequently figure into CrateDB's intended use case of machine-generated data.
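For readers unfamiliar with the term, an aggregation on a JOIN looks like the sketch below: log events are joined to a server inventory, then counted per region. Again, SQLite stands in for CrateDB purely so the example is runnable, and the tables `servers` and `log_events` and their contents are invented for illustration. It says nothing about how CrateDB executes or optimizes such a query.

```python
import sqlite3

# Assumption: SQLite is a stand-in; the schema and rows are hypothetical.
db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE servers (host TEXT PRIMARY KEY, region TEXT);
    CREATE TABLE log_events (host TEXT, status INTEGER);

    INSERT INTO servers VALUES ('web-1', 'eu'), ('web-2', 'eu'), ('web-3', 'us');
    INSERT INTO log_events VALUES
        ('web-1', 200), ('web-1', 500), ('web-2', 500),
        ('web-3', 200), ('web-3', 200);
    """
)

# Join each event to its server, then aggregate per region:
# total events, and how many of them were server errors (status >= 500).
errors_by_region = db.execute(
    """
    SELECT s.region,
           COUNT(*) AS events,
           SUM(CASE WHEN e.status >= 500 THEN 1 ELSE 0 END) AS errors
    FROM log_events AS e
    JOIN servers AS s ON s.host = e.host
    GROUP BY s.region
    ORDER BY s.region
    """
).fetchall()

print(errors_by_region)
```

The query itself is ordinary SQL, which is the article's point: the same statement a developer would write against a conventional relational database.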

Enterprise, beam us up

The biggest changes for CrateDB 2.0 involve a new enterprise-level edition of the product. The feature set there mostly involves administration -- stronger authentication and authorization functions, and database performance monitoring tools. For developers, the enterprise version provides user-defined functions, by way of either SQL or JavaScript.

Crate.io, the commercial outfit behind CrateDB, will be offering hosted, managed-service instances of the database. On top of that, the source code of the enterprise version is freely available; enterprise licensing essentially consists of the right to run that edition in commercial production. Individual, non-commercial, and non-profit users don't have to pony up for a license.

This approach has some precedent. CockroachDB -- another open source, shared-nothing, super-scalable database -- is trying the same strategy, likewise banking on big enterprises to do the right thing and buy licenses.

For both CockroachDB and CrateDB, the strategy is to put the enterprise version into as many hands as possible by offering it as open source -- especially developers' hands. CrateDB's emphasis on conventional SQL figures into this as well, considering the broad base of existing database developers comfortable with SQL.

If the future of databases is containerized, cloud-native, horizontally scalable, and free of constraints on data types, a good way to get developers into that world is to let them work, as much as possible, in the ways they already know well.