How to maintain availability when using multiple AWS accounts

If you are using multiple AWS accounts, you can’t assume that two different availability zones reside in different data centers


When building a modern, high-performance application at scale, it’s important to distribute the individual application instances across multiple data centers so that if any one data center goes offline, the application can continue to function relatively normally. This is an industry-wide best practice, and an important characteristic to architect into your applications to make them resilient to data center problems.

The same philosophy applies when you build your application in the cloud. However, when you build a cloud-based application, you typically do not have visibility into which data center a particular server or cloud resource is located in. This is part of the abstraction that gives the cloud its value. But not knowing which data centers your application is operating in makes it difficult to build multi-data-center resiliency into your applications. After all, if you don’t know what data center your application is running in, how can you ensure that it is running in multiple data centers?

Fortunately, cloud providers such as AWS have a solution to this problem. AWS created a cloud abstraction of the data center that allows you to build in this level of resiliency without being exposed to the details of data center location. The abstraction is the availability zone.

AWS availability zones

An AWS availability zone is an isolated set of cloud resources that lets you build a certain level of isolation into your applications. Resources within a single availability zone may be physically or virtually near each other, to the extent that they can depend on each other and share subcomponents with each other. For example, two EC2 servers that are in the same availability zone may be in the same data center, in the same rack, or even on the same physical server.

However, cloud resources that are in different availability zones are guaranteed to be separated into distinct data centers. They cannot be in the same data center, they cannot be in the same rack, and they cannot be using the same physical servers. They are distinct and independent from each other.

Within a single region, however, the availability zones are connected to each other by high-speed, low-latency network connections so that resources in multiple availability zones can work together in a coordinated fashion as needed.

This is the solution to the resiliency problem. To give an AWS cloud-based application the same level of resiliency that you can have with an application running in multiple redundant physical data centers, you can build your application to live in multiple availability zones. If instances of your application are distributed across multiple availability zones, you are isolated from hardware failures such as server failures, rack failures, and even entire data center failures.
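As a rough illustration of distributing instances across availability zones, the following minimal Python sketch assigns instances to zones round-robin and then checks which instances survive the loss of one zone. The instance names, the zone list, and both function names are hypothetical and chosen only for illustration; real placement would normally be handled by your deployment tooling.

```python
from itertools import cycle


def spread_across_zones(instance_names, zones):
    """Assign instances to zones round-robin so no single zone holds them all."""
    if not zones:
        raise ValueError("at least one availability zone is required")
    # zip stops when the (finite) list of instance names is exhausted.
    return dict(zip(instance_names, cycle(zones)))


def surviving_instances(placement, failed_zone):
    """Return the instances that are not in the failed zone."""
    return sorted(name for name, zone in placement.items() if zone != failed_zone)


# Hypothetical instance names and zones, for illustration only.
placement = spread_across_zones(
    ["web-1", "web-2", "web-3", "web-4"],
    ["us-west-2a", "us-west-2b", "us-west-2c"],
)
survivors = surviving_instances(placement, "us-west-2a")
```

With three zones and four instances, losing any one zone still leaves at least two instances running.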

Availability zones as data centers

Loosely, availability zones can be thought of as data centers. As a first approximation, this is a reasonable assumption to make. But there can be danger in that assumption, because there is not a fixed one-to-one mapping of availability zone names to data centers. When you create your AWS account, your availability zone names are mapped to individual data centers in a dynamic fashion. This means one AWS account may have an availability zone named us-east-1a mapped to data center #4, and another account may have the same availability zone name mapped to data center #2.

Worse yet, a given data center may map to different availability zones in different accounts. For example, data center #4 in account #1 may be used for availability zone us-east-1a, but the same data center in account #2 may be used for availability zone us-east-1b.

You can find out how your availability zones are mapped to specific data centers in a given account by looking in the AWS console in the Resource Access Manager (RAM). In the console, select “Resource Access Manager” under the Services menu. On the lower right-hand side, you’ll see a display that looks like this:

[Image: AWS Resource Access Manager console panel listing each availability zone name with its AZ ID. Credit: IDG]

This shows a mapping of availability zone names to AZ IDs. An AZ ID is a unique identifier that can be used, in effect, as a data center identifier. The display shows, for your current account, the AZ ID of the data center associated with each availability zone name. The mapping is shown for your currently selected region, but you can simply switch regions to show the mapping for any region in your account.

In the above example, for this account, the availability zone us-west-2b maps to the data center with AZ ID of usw2-az1. In another account, there will be a different mapping.
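The same mapping can also be read programmatically. The sketch below is an assumption about tooling, not something the console walkthrough above relies on: it uses the EC2 DescribeAvailabilityZones call through boto3, whose response lists a `ZoneName` and a `ZoneId` for each zone. The helper names are hypothetical, and in the sample data only the us-west-2b to usw2-az1 pair comes from the example above; the other rows are made up.

```python
def zone_name_to_id(response):
    """Build {zone name: AZ ID} from a DescribeAvailabilityZones-style response."""
    return {zone["ZoneName"]: zone["ZoneId"] for zone in response["AvailabilityZones"]}


def fetch_zone_map(region):
    """Query the current account's mapping (assumes boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the rest runs without it

    ec2 = boto3.client("ec2", region_name=region)
    return zone_name_to_id(ec2.describe_availability_zones())


# Sample data shaped like the API response. Only the us-west-2b row reflects
# the example in the text; the other rows are illustrative.
sample_response = {
    "AvailabilityZones": [
        {"ZoneName": "us-west-2a", "ZoneId": "usw2-az2"},
        {"ZoneName": "us-west-2b", "ZoneId": "usw2-az1"},
        {"ZoneName": "us-west-2c", "ZoneId": "usw2-az3"},
    ]
}
zone_map = zone_name_to_id(sample_response)
```

Running `fetch_zone_map("us-west-2")` under the credentials of each account gives the per-account mappings to compare.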

Cryptic AWS status messages

Ever wonder why, when AWS announces a problem on its status page, it will often say the problem “impacts one or more availability zones” in a given region? It never says which availability zones! The reason is this mapping. When a problem exists, it exists in one or more data centers. The availability zone names associated with those data centers may differ from account to account. Hence, in a status message that is shared broadly, AWS cannot say which availability zone name will be impacted for any given user. This is the reason for the more cryptic message.

Why AZ ID mapping is important

This mapping is normally hidden from your view and handled transparently by AWS. For the most part, this is reasonable and acceptable. However, you can run into a problem when your application makes use of multiple AWS accounts. Because availability zone names are randomly assigned to data centers on a per-account basis, a given availability zone name in different accounts may map to different data centers.

Now, this doesn’t seem too bad. But it also means that two different availability zone names in two different accounts could both map to the same data center! This can be a problem for availability purposes.

This means that if you are using multiple accounts, you can no longer assume that two different availability zone names are guaranteed to be in different data centers. This makes it hard to implement the multiple data center best practice discussed earlier.
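To make the collision concrete, here is a minimal Python sketch that checks a set of placements across accounts for shared AZ IDs. The account names, the AZ ID values, and the `shared_az_ids` function are hypothetical; the mappings mirror the earlier example in which one data center is us-east-1a in one account and us-east-1b in another.

```python
from collections import defaultdict


def shared_az_ids(placements, zone_maps):
    """Return the AZ IDs used by more than one placement.

    placements: list of (account, zone_name) pairs
    zone_maps:  {account: {zone_name: az_id}}
    """
    by_az_id = defaultdict(list)
    for account, zone_name in placements:
        by_az_id[zone_maps[account][zone_name]].append((account, zone_name))
    return {az_id: users for az_id, users in by_az_id.items() if len(users) > 1}


# Hypothetical per-account mappings, for illustration only.
zone_maps = {
    "account-1": {"us-east-1a": "use1-az4", "us-east-1b": "use1-az2"},
    "account-2": {"us-east-1a": "use1-az2", "us-east-1b": "use1-az4"},
}

# Different zone names, same data center.
collisions = shared_az_ids(
    [("account-1", "us-east-1a"), ("account-2", "us-east-1b")], zone_maps
)

# Same zone name, different data centers.
no_collisions = shared_az_ids(
    [("account-1", "us-east-1a"), ("account-2", "us-east-1a")], zone_maps
)
```

Here `collisions` reports that both placements sit on use1-az4 even though their zone names differ, while `no_collisions` is empty even though the zone names match.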

A solution for multiple AWS accounts

If you are using multiple accounts, and you want to guarantee data center uniqueness across accounts, you cannot use the availability zone name. How, then, do you guarantee your application resides in independent data centers for resiliency purposes?

The answer is to not use the availability zone name as your method of enforcing data center independence. Instead, you should use the AZ ID. If two availability zones in different accounts have different AZ IDs, you can be sure that those two availability zones are in distinct data centers. Using the AZ ID, rather than the availability zone name, is a safe way to ensure your applications live in distinct data centers across multiple accounts.
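One way to apply this is to plan placements by AZ ID and only then translate each AZ ID back to the zone name that the account uses for it. The sketch below is a simple greedy illustration under assumed data, not a prescribed method: the account names, the AZ ID values, and `plan_distinct_placements` are hypothetical, and a greedy pass can fail in cases where a more careful assignment would succeed.

```python
def plan_distinct_placements(zone_maps):
    """Pick one zone per account so that no two accounts share an AZ ID.

    zone_maps: {account: {zone_name: az_id}}
    Returns {account: (zone_name, az_id)}.
    """
    used_az_ids = set()
    plan = {}
    for account in sorted(zone_maps):
        for zone_name in sorted(zone_maps[account]):
            az_id = zone_maps[account][zone_name]
            if az_id not in used_az_ids:
                used_az_ids.add(az_id)
                plan[account] = (zone_name, az_id)
                break
        else:
            # Every AZ ID visible to this account is already taken.
            raise ValueError(f"no unused AZ ID left for {account}")
    return plan


# Hypothetical per-account mappings, for illustration only.
zone_maps = {
    "account-1": {"us-east-1a": "use1-az4", "us-east-1b": "use1-az2"},
    "account-2": {"us-east-1a": "use1-az2", "us-east-1b": "use1-az4"},
}
plan = plan_distinct_placements(zone_maps)
```

In this example both accounts end up using the name us-east-1a, yet they land on different AZ IDs, which is exactly the property the zone name alone cannot tell you.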

It’s important for availability purposes to ensure that your application makes effective use of multiple data centers for redundancy. To ensure data center independence for a large application that spans multiple AWS accounts, you cannot use the availability zone name as your check for independence. Instead, use the AZ ID. Failure to do this can result in an application with unexpected and undesired internal infrastructure dependencies that could negatively impact its availability.

Lee Atchison is the senior director of cloud architecture at New Relic. For the last seven years he has helped design and build a solid service-based product architecture that scaled from startup to high traffic public enterprise. Lee has 32 years of industry experience including seven years as a Senior Manager at Amazon.com. At Amazon, he led the creation of the company’s first software download store, created AWS Elastic Beanstalk, and managed the migration of Amazon’s retail platform to a new service-based architecture. He is author of the book “Architecting for Scale,” published in 2016 by O’Reilly Media.

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.

Copyright © 2019 IDG Communications, Inc.