The best distributed NoSQL databases

Highly flexible and hugely scalable, NoSQL databases offer a range of data models and consistency options to suit your application


By default, MongoDB uses dynamic schemas, an approach sometimes called schemaless. The documents in a single collection do not need to have the same set of fields, and the data type for a field can differ across documents within a collection. With dynamic schemas, you can change document structures at any time.

Schema governance is available, however. Starting with version 3.6, MongoDB supports JSON Schema validation, which you enable by specifying the $jsonSchema operator in a collection's validator expression.
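
As a minimal sketch of what such a validator might look like, the Python snippet below builds a collMod command document that attaches a $jsonSchema validator to a collection. The collection name "users" and its fields are hypothetical, chosen only for illustration; the snippet constructs the command but does not contact a server.

```python
# Sketch: build a collMod command that attaches a $jsonSchema validator.
# The collection name "users" and its fields are hypothetical.

def build_validator_command(collection: str) -> dict:
    """Return a collMod command document enabling JSON Schema validation."""
    schema = {
        "bsonType": "object",
        "required": ["name", "email"],
        "properties": {
            "name": {"bsonType": "string"},
            "email": {"bsonType": "string"},
            "age": {"bsonType": "int", "minimum": 0},
        },
    }
    return {
        "collMod": collection,                # command name must come first
        "validator": {"$jsonSchema": schema},
        "validationLevel": "strict",          # apply rules to all inserts and updates
        "validationAction": "error",          # reject invalid writes instead of warning
    }

command = build_validator_command("users")
# With PyMongo and a running server, this could be sent as: db.command(command)
```

Documents that lack a required field, or that store a field with the wrong type, would then be rejected on write, while fields not named in the schema remain free-form.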

Read my review of MongoDB.

Redis


Redis is an open source, in-memory data structure store, used as a database, cache, and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes with radius queries, and streams. Redis has built-in replication, Lua scripting, LRU eviction, transactions, and different levels of on-disk persistence. Redis provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.

Redis Enterprise is a fully durable multi-model database. It supports key-value, document, graph, and time series data, along with probabilistic data structures, comprehensive search, stream processing, and serving of deep learning and AI models.

Yandex ClickHouse

Yandex ClickHouse is an open-source, column-oriented OLAP database management system that manages extremely large volumes of data, including non-aggregated data, in a stable and sustainable manner, and allows generating custom data reports online in real time. The system is linearly scalable and can be scaled up to store and process trillions of rows and petabytes of data.

ClickHouse is designed to work on regular hard drives, which keeps the cost per GB of data storage low, but it also takes full advantage of SSDs and additional RAM if they are available. (By contrast, SAP HANA can only work in RAM.) ClickHouse processes queries in parallel on multiple cores.

In ClickHouse, data can reside on different shards. Each shard can be a group of replicas that are used for fault tolerance. The query is processed on all the shards in parallel.
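
The scatter-gather pattern behind this kind of sharded query can be sketched in a few lines of Python. This is a toy illustration, not ClickHouse's implementation: the shard contents, the function names, and the use of threads are all assumptions made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each shard holds (page, hits) rows.
SHARDS = [
    [("home", 3), ("about", 1)],
    [("home", 2), ("pricing", 4)],
    [("about", 5), ("pricing", 1)],
]

def partial_sum(rows):
    """Aggregate one shard locally (the per-shard part of a GROUP BY)."""
    totals = Counter()
    for page, hits in rows:
        totals[page] += hits
    return totals

def scatter_gather(shards):
    """Run the partial aggregation on every shard in parallel, then merge."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(partial_sum, shards))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts together
    return dict(merged)

result = scatter_gather(SHARDS)
```

Each shard computes a partial aggregate over its own rows, and only those small partial results have to be merged, which is why adding shards lets this style of query scale out.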

ClickHouse supports a declarative query language based on SQL that is identical to the SQL standard in many cases. Supported queries include GROUP BY, ORDER BY, subqueries in FROM, IN, and JOIN clauses, and scalar subqueries. Dependent subqueries and window functions are not supported.

Although ClickHouse does support data inserts and mutations, it was not designed for OLTP. Yandex recommends inserting data in batches of at least 1,000 rows, or sending no more than one insert request per second. No locks are taken when new data is ingested.
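
On the client side, following that recommendation usually comes down to buffering rows and sending them in groups. The helper below is a minimal sketch using only the Python standard library; the function name and the sample rows are hypothetical, and the actual insert call is left out because it depends on the client library in use.

```python
from itertools import islice

def batches(rows, batch_size=1000):
    """Yield lists of up to batch_size rows from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

# Hypothetical event stream of 2,500 rows.
rows = ((i, f"event-{i}") for i in range(2500))
sizes = [len(batch) for batch in batches(rows)]
```

Note that the final batch can be smaller than the batch size. A real loader might hold a short remainder until more rows arrive, or flush it on a timer.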

ClickHouse uses asynchronous multi-master replication. After being written to any available replica, data is distributed to all the remaining replicas in the background.
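
To make the idea concrete, here is a toy model of asynchronous multi-master replication in Python. It is a sketch under simplifying assumptions, not a description of ClickHouse internals: the class names are hypothetical, and the background delivery is simulated by an explicit method call rather than a real background process.

```python
from collections import deque

class Replica:
    def __init__(self, name):
        self.name = name
        self.rows = []

class Cluster:
    """Toy model: a write lands on one replica, then is queued for the rest."""

    def __init__(self, names):
        self.replicas = {name: Replica(name) for name in names}
        self.pending = deque()  # (target replica name, row) awaiting delivery

    def write(self, replica_name, row):
        """Accept the write on the chosen replica and queue it for the others."""
        self.replicas[replica_name].rows.append(row)
        for name in self.replicas:
            if name != replica_name:
                self.pending.append((name, row))

    def replicate_in_background(self):
        """Drain the queue; a real system would do this asynchronously."""
        while self.pending:
            name, row = self.pending.popleft()
            self.replicas[name].rows.append(row)

cluster = Cluster(["r1", "r2", "r3"])
cluster.write("r1", "a")   # any replica can accept a write
cluster.write("r2", "b")
lagging = list(cluster.replicas["r3"].rows)  # r3 has not received anything yet
cluster.replicate_in_background()
converged = {name: sorted(r.rows) for name, r in cluster.replicas.items()}
```

The gap between the write and the background delivery is the trade-off of asynchronous replication: writes return quickly, but a replica may briefly serve data that is out of date.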

ClickHouse was developed to support Yandex.Metrica, the second largest web analytics platform in the world. This application currently uses 394 servers located in six geographically distributed data centers, handling more than 13 trillion records in the database and more than 20 billion events daily.

YugaByte DB


YugaByte DB is an open-source, transactional, high-performance database for planet-scale applications that supports three API sets: YCQL, compatible with Apache Cassandra Query Language (CQL); YEDIS, compatible with Redis; and YSQL, compatible with PostgreSQL.

YugaWare is the orchestration layer for YugaByte DB Enterprise Edition. YugaWare makes quick work of spinning up and tearing down distributed clusters on Amazon Web Services, Google Cloud Platform, and Microsoft Azure.

YugaByte DB implements multi-version concurrency control (MVCC), which it uses for non-locking reads.
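
The way MVCC enables non-locking reads can be shown with a small Python sketch. This is a generic, single-process toy, not YugaByte DB's storage engine: the class, its logical clock, and the "balance" key are all assumptions made for illustration.

```python
class MVCCStore:
    """Toy multi-version store: writers append versions, readers pick a snapshot."""

    def __init__(self):
        self.clock = 0       # logical commit timestamp
        self.versions = {}   # key -> list of (commit_ts, value), oldest first

    def write(self, key, value):
        """Commit a new version instead of overwriting the old one."""
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def snapshot(self):
        """Return a timestamp that fixes what a reader is allowed to see."""
        return self.clock

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None

store = MVCCStore()
store.write("balance", 100)
snap = store.snapshot()          # a reader starts here
store.write("balance", 80)       # a later write does not disturb that reader
old = store.read("balance", snap)
new = store.read("balance", store.snapshot())
```

Because a reader only consults versions at or before its snapshot, it never has to wait for a lock held by a writer, and a writer never has to wait for readers to finish.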

YugaByte Enterprise supports read replicas, multi-cloud clusters, and comprehensive monitoring and alerting without any configuration. It also features in-flight and at-rest encryption, one-click distributed backups and restores for clusters of any size, and auto-tiering of cold data to cheaper storage.

Read my review of YugaByte DB.

Copyright © 2019 IDG Communications, Inc.
