How to make the most of Azure Cosmos DB’s free tier

Microsoft has added a free option to Azure’s distributed database. Let’s jump in

Azure’s Cosmos DB is one of its best features. A multimodel distributed database, it gives you a foundation for building truly cloud-native applications with a series of consistency models that can be mapped to how your application works. But it’s not easy to get started, and a badly configured or designed application can quickly get expensive.

It’s good to see that Cosmos DB now has a free tier that can help you start deploying applications outside of a limited development environment. The new tier isn’t large: it’s based on the minimum configuration for Cosmos DB, and offers 400 RU/s (request units per second) and 5GB of storage, with as many as 25 containers in a shared throughput database. That’s more than enough for a small application that performs more reads than writes, for example, and doesn’t rely on strong consistency models.

You do need to be aware that although Cosmos DB is multiregion, you can only run a single 400 RU/s database in the free tier. In practice that limits you to a single region, as additional regions will each need their own 400 RU/s instance, and those will be charged at standard rates for those regions, per hour.

Getting started with the free Cosmos DB

You will need to create a new account to take advantage of the free tier; it’s not available as a billing option on existing applications. The free tier’s 400 RU/s is the smallest amount that can be provisioned in a Cosmos DB database. That gives you around 1 billion small reads a month, which should be enough to get your application off the ground or allow you to deploy and run an internal distributed database as part of a pilot project. Once you get to the edge of your free RU/s allowance, you can add more capacity in blocks of 100 RU/s, billed at an hourly rate.
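The 1 billion figure can be checked with simple arithmetic. The following is a minimal sketch, assuming a 30-day month and that every read costs 1 RU; the function and constant names are illustrative only.

```python
FREE_TIER_RU_PER_SECOND = 400      # free-tier provisioned throughput
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30                # assumption: a 30-day month


def monthly_request_units(ru_per_second: int, days: int = DAYS_PER_MONTH) -> int:
    """Total request units available if the provisioned rate is used all month."""
    return ru_per_second * SECONDS_PER_DAY * days


total = monthly_request_units(FREE_TIER_RU_PER_SECOND)
print(f"{total:,} RU per month")   # 1,036,800,000 RU per month
```

At 1 RU per read, that is a little over 1 billion reads, and only if the application uses its full allowance every second of the month.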

It’s worth understanding what a Cosmos database RU is. The RU is a request unit, and the billed RU/s is a measure of the provisioned throughput of your database, covering all its operations. That includes reads, writes, updates, deletes, and more. Microsoft suggests that 1 RU is equivalent to one eventually consistent read of a 1KB item (eventual consistency being the loosest and least processing-intensive level of consistency available on Cosmos DB), so 1 RU/s covers one such read per second. Writing the same 1KB item once per second takes 5 RU/s. The more complex the operation, the more RU/s it consumes.
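To get a feel for these numbers, here is a rough sketch that estimates the throughput needed for a mix of 1KB reads and writes, then rounds it up to what you would have to provision. It assumes the 1 RU read and 5 RU write figures above, the 400 RU/s minimum, and 100 RU/s increments; real operations will vary, and all names are hypothetical.

```python
import math

READ_RU = 1      # 1KB eventually consistent read
WRITE_RU = 5     # 1KB write
MIN_RU = 400     # smallest provisionable throughput
BLOCK_RU = 100   # capacity is added in blocks of 100 RU/s


def required_ru_per_second(reads_per_second: float, writes_per_second: float) -> float:
    """Estimated RU/s consumed by a simple read/write mix of 1KB items."""
    return reads_per_second * READ_RU + writes_per_second * WRITE_RU


def provisioned_ru_per_second(required: float) -> int:
    """Round the requirement up to the next 100 RU/s block, never below the minimum."""
    return math.ceil(max(required, MIN_RU) / BLOCK_RU) * BLOCK_RU


needed = required_ru_per_second(reads_per_second=200, writes_per_second=30)
print(needed, provisioned_ru_per_second(needed))
```

With 200 reads and 30 writes per second the estimate is 350 RU/s, which fits inside the free 400 RU/s. Adding ten more writes per second pushes it to 400 RU/s, leaving no headroom.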

Understanding the consumption of request units

It’s hard to say exactly how many RU/s an application will consume. However, you can think about the Cosmos DB constraints that can affect the RU/s used by your database. First, you need to consider the size of your items. The larger the item, the more RU/s it uses for a read or a write. Similarly, indexing consumes RU/s, and if you use the default indexing model, the resources required to write items will go up as you add more to your database. Then there is your choice of consistency models, with both strong and bounded staleness needing roughly twice as many RU/s for a read as Cosmos DB’s other, less strict models.
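The effect of the consistency choice on read capacity can be sketched as follows. This is an approximation only: it takes the "roughly twice" figure above at face value as a multiplier of 2, and the dictionary keys are illustrative labels for Cosmos DB’s five consistency levels rather than SDK constants.

```python
# Approximate read-cost multipliers, assuming strong and bounded staleness
# reads cost about twice as much as reads under the looser models.
READ_COST_MULTIPLIER = {
    "strong": 2,
    "bounded_staleness": 2,
    "session": 1,
    "consistent_prefix": 1,
    "eventual": 1,
}


def max_reads_per_second(budget_ru: float, consistency: str, base_read_ru: float = 1.0) -> float:
    """Upper bound on reads/s for a given RU/s budget and consistency level."""
    return budget_ru / (base_read_ru * READ_COST_MULTIPLIER[consistency])


for level in ("strong", "session"):
    print(level, max_reads_per_second(400, level))
```

Under these assumptions the free tier’s 400 RU/s supports about 200 strongly consistent 1KB reads per second, against about 400 with session consistency.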

With a limited number of RU/s available in the free tier, you may want to work around those constraints to keep consumption to a minimum. One option is to turn off all indexing for your database, though in practice you may prefer to limit indexing to specific properties on each stored JSON document. At the same time, you need to consider how your application is operating and whether it’s better to use something like session consistency to improve user perceptions of performance while reducing the RU/s used.
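As an illustration of the two indexing options, the sketch below builds indexing policy documents as plain Python dictionaries, in the JSON shape Cosmos DB uses for a container’s indexing policy. The property paths such as `/userId` are hypothetical; substitute the properties your queries actually filter on.

```python
def selective_indexing_policy(indexed_paths):
    """Index only the listed properties and exclude everything else."""
    return {
        "indexingMode": "consistent",
        "includedPaths": [{"path": f"{path}/?"} for path in indexed_paths],
        "excludedPaths": [{"path": "/*"}],
    }


def no_indexing_policy():
    """Turn indexing off entirely for the container."""
    return {"indexingMode": "none"}


# Hypothetical properties that the application's queries filter on.
policy = selective_indexing_policy(["/userId", "/createdAt"])
print(policy)
```

A policy like this is supplied when a container is created or updated. With indexing turned off completely, only lookups by ID remain cheap, which is why limiting the index to a few properties is usually the more practical choice.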

As RU/s are activity based, you can use query design to keep consumption to a minimum. That might entail limiting the number of results per query, controlling the amount of data you store, or using as few user-defined functions, stored procedures, and triggers as possible.
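One way to apply this in practice is to cap the result count and project only the fields you need. The helper below is a sketch that builds a parameterized query in Cosmos DB’s SQL dialect as a query string plus a parameter list; the container alias `c` and the `userId` and `title` properties are hypothetical.

```python
def build_limited_query(user_id: str, max_results: int = 10) -> dict:
    """Build a query that returns at most max_results small projections."""
    if not isinstance(max_results, int) or max_results < 1:
        raise ValueError("max_results must be a positive integer")
    return {
        # TOP caps the number of results; selecting two fields instead of
        # whole documents reduces the amount of data each query returns.
        "query": f"SELECT TOP {max_results} c.id, c.title FROM c WHERE c.userId = @userId",
        "parameters": [{"name": "@userId", "value": user_id}],
    }


spec = build_limited_query("user-42", max_results=5)
print(spec["query"])
```

Passing the user ID as a parameter rather than formatting it into the string keeps the query text stable and avoids injection problems.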

Setting up your database is easy enough. In the Azure Portal create a new Cosmos DB account, and from its Data Explorer create a new database. Start by giving it an ID and then provision its throughput. Set this to 400 RU/s. Higher amounts will show cost estimates, but as you’re setting up a free instance there’s no need to try this out. You’re not limited to the Portal; you can use the Azure CLI or PowerShell, or even create the database programmatically through the Cosmos DB SDK.
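For the programmatic route, a minimal sketch using the `azure-cosmos` Python package might look like this. It assumes the free-tier account already exists and that you have its endpoint and key; the database ID `appdb`, the container ID `items`, and the `/userId` partition key are placeholders.

```python
FREE_TIER_RU_PER_SECOND = 400  # the free tier's provisioned throughput


def provision_free_tier_database(endpoint: str, key: str, database_id: str = "appdb"):
    """Create a shared-throughput database at 400 RU/s with one container."""
    # Imported here so this module still loads where azure-cosmos is not installed.
    from azure.cosmos import CosmosClient, PartitionKey

    client = CosmosClient(endpoint, credential=key)
    # Throughput is set on the database, so its containers share it.
    database = client.create_database_if_not_exists(
        id=database_id, offer_throughput=FREE_TIER_RU_PER_SECOND
    )
    # No throughput is set on the container; it draws on the database's 400 RU/s.
    container = database.create_container_if_not_exists(
        id="items", partition_key=PartitionKey(path="/userId")
    )
    return database, container
```

You would call `provision_free_tier_database` with the endpoint URL and key shown for the account in the Portal. Both calls are idempotent, so running the function again against an existing database does no harm.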

Building apps on Cosmos DB’s free tier

In Cosmos DB a database is a set of containers, which handle partitioning within an Azure region and distribution across the regions your database runs in. Each database works with a specific model, set by the API you choose when you create the account: NoSQL (both MongoDB and Cassandra), SQL, Gremlin, and tables. Most apps will work with it as a NoSQL document database storing JSON data.

Once you’ve set up a database and chosen a model, you can think of a Cosmos DB container as how the database scales. Outside of the free tier, you can set throughput in RU/s on a container basis; in the free tier you’re sharing that throughput across all the containers in your database, so you can’t predict throughput for any specific container. Paid instances have an associated SLA, which is why they allow you to set throughput on a per-container basis.

Working across containers this way is equivalent to using a cluster in a NoSQL database and works well for this type of workload. If you use the same partition key across all your containers, Cosmos DB will automatically share throughput across them. You can use this approach with the free tier’s 25 containers to reduce bottlenecks for your application’s users. If you treat it as a sharded, clustered NoSQL database, you should find it relatively easy to include it in your applications, using it to host pointers to other content rather than the content itself.

Working with a free service offering can be tricky, but if you take sensible precautions it should be possible to use Cosmos DB’s new tier as part of an application back end. You may have to sacrifice some of the service’s scalability features, but that shouldn’t affect applications significantly if you make careful design-time decisions.

It’s important to think about how to take advantage of a distributed database like Cosmos DB rather than simply porting your existing workloads to it—they are unlikely to make a good match. Instead, think of this as your opportunity to build a truly cloud-native, distributed application. In this case 400 RU/s is more than enough to bootstrap a new application and get it working with a reasonable number of users.

Copyright © 2020 IDG Communications, Inc.