The best distributed relational databases

These SQL relational databases offer both horizontal scalability and support for ACID transactions—some on a global scale

Relational SQL databases, which have been around since the 1980s, historically ran on mainframes or single servers, because that was all we had. If you wanted the database to handle more data and run faster, you had to put it on a bigger server with more and faster CPUs, memory, and disk. In other words, you turned to vertical scalability or “scale up.” Later on, if you needed the ability to fail over to improve availability, you could colocate a hot backup server with the active server in an “active-passive” cluster, typically with shared storage.

The four ACID properties—atomicity, consistency, isolation, and durability—are required to ensure that database transactions will always be valid, even in the event of network partitions, power failures, and other errors. It is relatively easy for a database on a single server to conform to all four ACID properties; it’s a bit harder to implement for a distributed database.
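
As a minimal sketch of atomicity on a single server, the following Python example uses the standard library's sqlite3 module. The accounts table, the account names, and the make_bank, transfer, and balances helpers are hypothetical and chosen only for illustration. A transfer either applies both the credit and the debit or applies neither.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "id TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    with conn:
        conn.executemany(
            "INSERT INTO accounts VALUES (?, ?)",
            [("alice", 100), ("bob", 50)],
        )
    return conn


def transfer(conn, src, dst, amount):
    """Credit dst, then debit src; both changes commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # Fails the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))
```

In this sketch, an attempt to move 500 from alice to bob returns False and leaves both balances unchanged, because the credit to bob is rolled back together with the failed debit.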

NoSQL databases, introduced around 2009, offered horizontal scalability (meaning that they could run on multiple servers) but often lacked full ACID compliance, and usually did not support the SQL language as such. NoSQL databases popularized the idea of “eventual consistency,” meaning that if you wrote to the database from one server and read it from another right away, you might not see the same results as you would reading it from the server to which you had just written. If you waited long enough, however, the new data would be replicated to all the servers in the cluster and would then be consistent. Eventual consistency is good enough for many applications, such as online catalogs, but not good enough for finance.
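
The read-after-write behavior described above can be illustrated with a toy in-memory model. This is only a sketch under simplifying assumptions: the EventuallyConsistentStore class and its methods are hypothetical, replication is triggered by hand, and conflicting writes are ignored. It does not represent the protocol of any particular NoSQL database.

```python
class EventuallyConsistentStore:
    """Toy cluster in which writes reach other replicas only after a delay."""

    def __init__(self, replica_count):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # queued (replica index, key, value) updates

    def write(self, replica, key, value):
        # The write is visible immediately on the replica that accepted it.
        self.replicas[replica][key] = value
        # Every other replica receives it later, when replicate() runs.
        for index in range(len(self.replicas)):
            if index != replica:
                self.pending.append((index, key, value))

    def read(self, replica, key):
        # Returns None if this replica has not seen the key yet.
        return self.replicas[replica].get(key)

    def replicate(self):
        # Stand-in for "waiting long enough": apply all queued updates.
        for index, key, value in self.pending:
            self.replicas[index][key] = value
        self.pending.clear()
```

A write to replica 0 followed immediately by a read from replica 1 returns None in this model. After replicate() runs, every replica returns the new value.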

More recently, several horizontally scalable (“scale-out”) SQL databases have been introduced. Even better, some of them can handle having geographically distributed servers without sacrificing consistency. Extremely distant server nodes take longer to update than local nodes because of the limit imposed by the speed of light, but several techniques can mitigate that problem, including the use of consensus group quorums and very high-speed networking and storage.
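
To make the speed-of-light limit and the effect of a quorum concrete, here is a rough sketch. It assumes a generic majority-quorum scheme in which a leader commits a write once a majority of the cluster, counting itself, has acknowledged it. The function names and the round-trip figures in the usage note are hypothetical, and the databases discussed in this article may work differently.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # in a vacuum; light in fiber is slower


def min_round_trip_ms(distance_km):
    """Physical lower bound on round-trip time between two sites."""
    return 2 * distance_km / SPEED_OF_LIGHT_KM_PER_S * 1000


def majority(cluster_size):
    """Smallest number of nodes that forms a majority quorum."""
    return cluster_size // 2 + 1


def quorum_commit_latency_ms(follower_rtts_ms):
    """Time until a majority has acknowledged a write sent by the leader.

    follower_rtts_ms holds the leader-to-follower round-trip times.
    """
    cluster_size = len(follower_rtts_ms) + 1  # followers plus the leader
    acks_needed = majority(cluster_size) - 1  # the leader acknowledges itself
    if acks_needed == 0:
        return 0.0
    # The commit completes when the slowest of the fastest acks arrives.
    return float(sorted(follower_rtts_ms)[acks_needed - 1])
```

For a five-node cluster whose followers are 2, 3, 80, and 150 ms away from the leader, quorum_commit_latency_ms([2, 3, 80, 150]) evaluates to 3.0: the two nearby followers complete the majority, so the distant nodes do not hold up the commit in this model.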
