FaunaDB review: Fast NoSQL database for global scale

Low latency, strong consistency, and high scalability make FaunaDB an excellent choice for greenfield web or mobile apps

Distributed databases have become increasingly attractive over the last decade, as companies with worldwide operations require transactional databases with horizontal scalability and global reach. There’s an essential tension between geographic distribution and low transaction latency, however: The speed of light sets a lower bound on the transmission time between distant nodes.
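
To put a rough number on that speed-of-light limit, here is a minimal JavaScript sketch. It assumes that signals in optical fiber travel at about two-thirds of the speed of light in a vacuum (roughly 200,000 km/s), and the 6,000 km distance is a made-up figure for illustration, not a measurement of any FaunaDB deployment.

```javascript
// Assumption: light in optical fiber covers roughly 200,000 km per second.
const SPEED_IN_FIBER_KM_PER_S = 200000;

// Best-case round-trip time in milliseconds over a fiber path of the given
// length, ignoring routing, queuing, and processing delays.
function minRoundTripMs(distanceKm) {
  return (2 * distanceKm * 1000) / SPEED_IN_FIBER_KM_PER_S;
}

// Hypothetical distance between two data centers, for illustration only.
const distanceKm = 6000;
console.log(minRoundTripMs(distanceKm)); // 60
```

Any protocol that needs several sequential round trips between distant nodes multiplies this floor, which is why the number of message rounds per commit matters so much.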

To allow for high throughput on write transactions, many NoSQL databases have weakened their transaction support, either by prohibiting cross-partition transactions or by downgrading their consistency guarantees from strong (synchronous transactions) to eventual (asynchronous transactions). Most databases use a two-phase commit scheme for distributed transactions, which drives up transaction latency when nodes are geographically distributed. However, many recent distributed databases use a Paxos or Raft scheme for quorum-based transaction consensus, which lowers the transaction latency.
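
The latency benefit of a quorum comes from not having to wait for the slowest node. The sketch below is a simplified model rather than an implementation of Paxos, Raft, or two-phase commit: it assumes a single leader, one round of acknowledgments, and made-up round-trip times to four followers.

```javascript
// Smallest number of nodes that forms a majority in a cluster.
function majorityQuorum(clusterSize) {
  return Math.floor(clusterSize / 2) + 1;
}

// Time until the leader holds a majority of acknowledgments.
// followerRttsMs: round-trip times from the leader to each follower.
function quorumAckLatencyMs(followerRttsMs) {
  const clusterSize = followerRttsMs.length + 1; // followers plus the leader
  const acksNeeded = majorityQuorum(clusterSize) - 1; // the leader counts itself
  const sorted = [...followerRttsMs].sort((a, b) => a - b);
  return acksNeeded === 0 ? 0 : sorted[acksNeeded - 1];
}

// Time until every follower has acknowledged, as when all participants must vote.
function allAckLatencyMs(followerRttsMs) {
  return Math.max(0, ...followerRttsMs);
}

// Hypothetical round-trip times in milliseconds, for illustration only.
const rtts = [10, 80, 150, 200];
console.log(majorityQuorum(5)); // 3
console.log(quorumAckLatencyMs(rtts)); // 80
console.log(allAckLatencyMs(rtts)); // 200
```

In this model a five-node cluster commits as soon as the two nearest followers answer, so a distant or slow replica does not set the pace.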

FaunaDB is a distributed, strongly consistent OLTP NoSQL database that is ACID compliant and offers a multi-model interface. It has an active-active architecture and can span clouds as well as continents. FaunaDB supports document, relational, graph, and temporal data sets from a single query. In addition to its own FQL query language, the product supports GraphQL, with SQL planned for the future.

FaunaDB is the first database to use the Calvin cross-shard transactional protocol, which allows for single-phase commits without reliance on clocks and without loss of consistency. FaunaDB also uses the Raft consensus system for individual shards. We’ll explain these in more detail when we discuss the FaunaDB architecture.

Competition for FaunaDB in the area of globally distributed NoSQL databases includes Azure Cosmos DB, Amazon DocumentDB, Amazon DynamoDB, and YugaByte DB. Google Cloud Spanner and CockroachDB are its globally distributed relational database competitors.

FaunaDB architecture

FaunaDB claims architectural innovations at every layer. The biggest innovation is probably the use of Calvin as a distributed transaction protocol instead of the Google Spanner or older Google Percolator protocols.

Calvin was originally described in a 2012 paper by Alexander Thomson, Daniel Abadi, and colleagues at Yale:

Calvin is a practical transaction scheduling and data replication layer that uses a deterministic ordering guarantee to significantly reduce the normally prohibitive contention costs associated with distributed transactions. Unlike previous deterministic database system prototypes, Calvin supports disk-based storage, scales near-linearly on a cluster of commodity machines, and has no single point of failure. By replicating transaction inputs rather than effects, Calvin is also able to support multiple consistency levels—including Paxos-based strong consistency across geographically distant replicas—at no cost to transactional throughput.

In other words, using Calvin for distributed transactions gives FaunaDB single-phase commit and the option to guarantee strict serializability, even in globally distributed clusters without clock synchronization. Along with that, FaunaDB can boast low write latency (under 200 ms on average) and 99.99% uptime. According to Fauna, “The use of Calvin also allows FaunaDB to implement a master-less architecture. With replicas in a cluster, geographically distributed across many locations, FaunaDB provides active-active transactions that allow applications to scale horizontally across the globe without a single line of code.”
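
The phrase “replicating transaction inputs rather than effects” can be illustrated with a toy example. The JavaScript below is not how Calvin or FaunaDB is implemented; it only shows the underlying idea that replicas replaying the same deterministic transactions in the same agreed order must arrive at the same state. The account names, amounts, and transaction types are all hypothetical.

```javascript
// Toy illustration only: each transaction is a deterministic function of state.
const transactions = {
  deposit: (state, { account, amount }) => ({
    ...state,
    [account]: (state[account] || 0) + amount,
  }),
  transfer: (state, { from, to, amount }) => {
    // Deterministic abort rule: do nothing if funds are insufficient.
    if ((state[from] || 0) < amount) return state;
    return {
      ...state,
      [from]: state[from] - amount,
      [to]: (state[to] || 0) + amount,
    };
  },
};

// The agreed global order of transaction inputs (a hypothetical log).
const orderedLog = [
  { type: "deposit", args: { account: "alice", amount: 100 } },
  { type: "transfer", args: { from: "alice", to: "bob", amount: 30 } },
  { type: "transfer", args: { from: "bob", to: "carol", amount: 50 } }, // aborts
];

// Every replica replays the same log in the same order, starting from empty state.
function applyLog(log) {
  return log.reduce(
    (state, entry) => transactions[entry.type](state, entry.args),
    {}
  );
}

const replicaA = applyLog(orderedLog);
const replicaB = applyLog(orderedLog);
console.log(replicaA); // { alice: 70, bob: 30 }
console.log(JSON.stringify(replicaA) === JSON.stringify(replicaB)); // true
```

Because the order is fixed before execution, the replicas never need to negotiate the outcome of an individual transaction afterward, which is the intuition behind a single-phase commit.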

FaunaDB implements a semi-structured, schema-free, object-relational data model that is a superset of the relational, document, object-oriented, and graph paradigms. The data model allows enforcing constraints, creating indexes, and joining across multiple document entities. It also offers polyglot APIs mediated by drivers for a number of different programming languages. In short, the FaunaDB data model allows you to do whatever you need with your database in a unified way.

By contrast, Azure Cosmos DB implements separate relational, document, and graph layers, each with its own query language and API. Similarly, YugaByte DB implements separate relational, wide-column, and key-value plug-ins.

FaunaDB provides both administrative and application-level identity and security using tokens. You can access the database securely through API servers, or directly from mobile, browser, and embedded applications.

FaunaDB has undergone extensive Jepsen testing and passed with flying colors after fixing about 19 issues that came up during testing. Jepsen “is an effort to improve the safety of distributed databases, queues, consensus systems, etc.” One of the statements to come out of the Jepsen report on FaunaDB summarizes the database’s transactional architecture:

FaunaDB is based on peer-reviewed research into transactional systems, combining Calvin’s cross-shard transactional protocol with Raft’s consensus system for individual shards. We believe FaunaDB’s approach is fundamentally sound...Calvin-based systems like FaunaDB could play an important future role in the distributed database landscape.

FaunaDB public cloud status for one day. There are three Amazon regions (east coast, west coast, and Europe) and one Google region (midwest). This reflects all activity on the cloud cluster, not just my activity.

FaunaDB query languages and drivers

FaunaDB currently supports two query languages: its own FQL and the open-source GraphQL. FQL is more capable, but GraphQL has more traction thanks to its use at Facebook, GitHub, and other prominent tech companies.

The easiest ways to test queries against FaunaDB are to use the FaunaDB Shell or the FaunaDB web console. You’ll see both of them in action in the Quick Start sections below.

FQL (Fauna Query Language) is an expression-oriented language with some characteristics of a functional programming language. FQL operates primarily on the schema types provided by FaunaDB, which include documents, collections, indexes, sets, and databases. If you compare FQL concepts to SQL concepts, FaunaDB documents correspond to relational rows, collections to tables, databases to schemas, and FaunaDB indexes to both SQL indexes and materialized views. FaunaDB sets are sorted groups of tuples.

The following is an example of an FQL query that creates multiple blog posts in the collection “posts” using the Map function, which applies a Lambda function serially to each member of the array.

Map(
  [
    "My cat and other marvels",
    "Pondering during a commute",
    "Deep meanings in a latte"
  ],
  Lambda("post_title",
    Create(
      Collection("posts"), { data: { title: Var("post_title") } }
    )
  )
)

GraphQL is an open source data query and manipulation language that provides declarative schema definitions and a composable query syntax. The following is an example of a GraphQL query against a database about Star Wars movies.

At left is a sample GraphQL query, and at right is the beginning of the data returned. Note that the data has the same shape as the query.

FQL is available through drivers for nine programming languages. Each driver is distributed through its language’s standard package mechanism. For example, the JavaScript driver is available as an NPM package and is imported with var fdb = require('faunadb').

All of the language drivers are open source. The Android, Scala, and Java bindings share a common JVM driver.
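
As a rough sketch of how the earlier Map and Lambda example might look through the JavaScript driver, the function below builds the same expression from a query builder object. With the real driver that object would be require('faunadb').query; here it is passed in as a parameter, and a small stand-in (stubQ, a hypothetical helper that is not part of the driver) records the calls so the sketch runs without the package installed.

```javascript
// Builds the "create several posts" expression from a query builder `q`.
// With the real driver, `q` would be require('faunadb').query.
function buildCreatePostsQuery(q, titles) {
  return q.Map(
    titles,
    q.Lambda(
      "post_title",
      q.Create(q.Collection("posts"), {
        data: { title: q.Var("post_title") },
      })
    )
  );
}

// Hypothetical stand-in for the query builder: every function call is
// recorded as a plain { fn, args } object instead of a driver expression.
const stubQ = new Proxy(
  {},
  { get: (_target, fn) => (...args) => ({ fn, args }) }
);

const expr = buildCreatePostsQuery(stubQ, ["My cat and other marvels"]);
console.log(expr.fn); // Map
console.log(JSON.stringify(expr, null, 2));

// With the real driver (assumes the faunadb package is installed and that
// a database secret is stored in the FAUNADB_SECRET environment variable):
//   const faunadb = require('faunadb');
//   const client = new faunadb.Client({ secret: process.env.FAUNADB_SECRET });
//   client.query(buildCreatePostsQuery(faunadb.query, titles)).then(console.log);
```

Passing the builder in as an argument is purely a convenience for this sketch; application code would normally import the driver once and use its query functions directly.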

FaunaDB currently has nine supported language-specific drivers. They are for Android, C#, Go, Java, JavaScript, Python, Ruby, Scala, and Swift.

FaunaDB use cases

Fauna has created white papers for real-time consumer apps, financial services, game development, and retail and e-commerce. In a 2018 technical white paper, Fauna describes successful FaunaDB application patterns based on customer usage: as a distributed ledger; as a distributed app back-end; for SaaS with multi-tenancy and QoS; to integrate legacy silos; to consolidate applications; to globally distribute data; to unify on-premises and cloud data; and to manage cross-workload access to shared data.

Another common use case for FaunaDB is as the storage layer for JAMstack apps. JAMstack is a modern architecture that avoids web servers in favor of JavaScript, APIs, and markup. JAMstack apps often use Netlify (an all-in-one platform for automating modern web projects), React (a JavaScript library for building user interfaces), Gatsby (a site generator that emits React.js), Jekyll (a Ruby-based site generator that starts with Markdown documents), Hugo (a fast Go-based site generator), or Nuxt (a site generator that emits Vue.js).

FQL quick start

The FQL Quick Start can be run on a local Fauna command line shell or in the web shell found in the FaunaDB console.

The FaunaDB web shell has essentially the same functionality as the downloadable Fauna command-line shell. You can find the shell within the console once you have selected a database.

I did my FQL quick start in the Terminal of a Mac. For clarity, I added a few exploratory commands not shown in the tutorial. I also worked around one or two obvious small omissions in the documentation, for example by using the actual post IDs from my session rather than the IDs in the documentation.

martinheller@Martins-Retina-MacBook ~ % fauna help
faunadb shell

VERSION
  fauna-shell/0.9.8 darwin-x64 node-v12.6.0

USAGE
  $ fauna [COMMAND]

COMMANDS
  add-endpoint
  autocomplete      display autocomplete installation instructions
  cloud-login
  create-database
  create-key
  default-endpoint
  delete-database
  delete-endpoint
  delete-key
  eval
  help              display help for fauna
  list-databases
  list-endpoints
  list-keys
  run-queries
  shell

martinheller@Martins-Retina-MacBook ~ % fauna list-databases
listing databases
my_app
main_ledger
martinheller@Martins-Retina-MacBook ~ % fauna create-database my_db
creating database my_db

  created database 'my_db'

  To start a shell with your new database, run:

  fauna shell 'my_db'

  Or, to create an application key for your database, run:

  fauna create-key 'my_db'

martinheller@Martins-Retina-MacBook ~ % fauna shell 'my_db'
Starting shell for database my_db
Connected to https://db.fauna.com
Type Ctrl+D or .exit to exit the shell
my_db> CreateCollection({ name: "posts" })
{
  ref: Collection("posts"),
  ts: 1573056452245000,
  history_days: 30,
  name: 'posts'
}
my_db> CreateIndex({
...   name: "posts_by_title",
...   source: Collection("posts"),
...   terms: [{ field: ["data", "title"] }]
... })
{
  ref: Index("posts_by_title"),
  ts: 1573056468580000,
  active: true,
  serialized: true,
  name: 'posts_by_title',
  source: Collection("posts"),
  terms: [ { field: [ 'data', 'title' ] } ],
  partitions: 1
}
my_db> Create(
...   Collection("posts"),
...   { data: { title: "What I had for breakfast .." } }
... )
{
  ref: Ref(Collection("posts"), "248300322187903506"),
  ts: 1573056490060000,
  data: { title: 'What I had for breakfast ..' }
}
my_db> Map(
...   [
...     "My cat and other marvels",
...     "Pondering during a commute",
...     "Deep meanings in a latte"
...   ],
...   Lambda("post_title",
.....     Create(
.......       Collection("posts"), { data: { title: Var("post_title") } }
.......     )
.....   )
... )
[
  {
    ref: Ref(Collection("posts"), "248300337888232978"),
    ts: 1573056505030000,
    data: { title: 'My cat and other marvels' }
  },
  {
    ref: Ref(Collection("posts"), "248300337888234002"),
    ts: 1573056505030000,
    data: { title: 'Pondering during a commute' }
  },
  {
    ref: Ref(Collection("posts"), "248300337888231954"),
    ts: 1573056505030000,
    data: { title: 'Deep meanings in a latte' }
  }
]
my_db> Get( Ref(Collection("posts"), "248300322187903506"))
{
  ref: Ref(Collection("posts"), "248300322187903506"),
  ts: 1573056490060000,
  data: { title: 'What I had for breakfast ..' }
}
my_db> Get( Ref(Collection("posts"), "248300337888231954"))
{
  ref: Ref(Collection("posts"), "248300337888231954"),
  ts: 1573056505030000,
  data: { title: 'Deep meanings in a latte' }
}
my_db> Get(
...   Match(
.....     Index("posts_by_title"),
.....     "My cat and other marvels"
.....   )
... )
{
  ref: Ref(Collection("posts"), "248300337888232978"),
  ts: 1573056505030000,
  data: { title: 'My cat and other marvels' }
}
my_db>
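
The ts fields in the session above are timestamps. Assuming they are expressed in microseconds since the Unix epoch, which is my understanding of FaunaDB's convention, a small JavaScript helper (faunaTsToDate is a name made up for this sketch) converts them to ordinary dates:

```javascript
// Assumption: `ts` is microseconds since the Unix epoch.
// JavaScript Date works in milliseconds, so drop the last three digits.
function faunaTsToDate(tsMicros) {
  return new Date(Math.floor(tsMicros / 1000));
}

// The ts value returned by CreateCollection in the session above.
const created = faunaTsToDate(1573056452245000);
console.log(created.toISOString()); // 2019-11-06T16:07:32.245Z
```

Note that the three posts created by the single Map query share one ts value, while the post created separately has its own.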

GraphQL quick start

I did the GraphQL quick start online in the console. I don’t think that Fauna has added GraphQL capabilities to its command-line client, and I couldn’t see any reason to use a third-party GraphQL client app.

The GraphQL application for this quick start is a simple to-do list. The schema is as follows. You need to create or download this file to your computer for the next steps.

type Todo {
   title: String!
   completed: Boolean
}

type Query {
   allTodos: [Todo!]
   todosByCompletedFlag(completed: Boolean!): [Todo!]
}
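
For readers who would rather call the GraphQL API from code than from the console, here is a hedged sketch of how a request for this schema might be assembled in JavaScript. Several things here are assumptions rather than facts taken from the tutorial: the endpoint URL, the bearer-token header, and the data wrapper around list results (my understanding is that FaunaDB returns lists as pages). The buildGraphQLRequest helper is a name invented for this sketch.

```javascript
// Assumption: FaunaDB's hosted GraphQL endpoint. Adjust for your deployment.
const ASSUMED_ENDPOINT = "https://graphql.fauna.com/graphql";

// Assumption: list results come back wrapped in a page object with a
// `data` field, so the selection goes one level deeper than the schema shows.
const TODOS_BY_FLAG = `
  query TodosByFlag($completed: Boolean!) {
    todosByCompletedFlag(completed: $completed) {
      data { title completed }
    }
  }`;

// Builds the URL and fetch-style options for a GraphQL POST request.
function buildGraphQLRequest(secret, query, variables, url = ASSUMED_ENDPOINT) {
  return {
    url,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${secret}`, // assumed authentication scheme
      },
      body: JSON.stringify({ query, variables }),
    },
  };
}

// "YOUR_SECRET" is a placeholder, not a working key.
const request = buildGraphQLRequest("YOUR_SECRET", TODOS_BY_FLAG, {
  completed: true,
});
console.log(request.options.method); // POST
console.log(JSON.parse(request.options.body).variables); // { completed: true }

// To send it: fetch(request.url, request.options).then((res) => res.json())
```

Keeping the request construction separate from the network call makes it easy to inspect what would be sent before pointing it at a real database.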

You start the tutorial in the FaunaDB console, at the top-level dashboard.

The FaunaDB console shows you your databases and your usage, as well as offering a link to create a new database.

From there, you create a database.

Creating a database is just a matter of setting the database name. The priority did not matter for my use cases.

At a Glance
  • FaunaDB is a good choice of database for greenfield web or mobile apps that need to be available globally with low latency and serializable consistency.

    Pros

    • Low global latency in a strongly consistent database
    • High scalability
    • Offered as a public cloud service or a virtual private server
    • Multi-model database with all models available from a single query

    Cons

    • SQL queries are not yet supported
    • FaunaDB is not offered as an on-premises database