If you don't make a reservation, you're going to need to tip the maître d'

Demystifying reserved instances and understanding how you can save money on your cloud bill

The pricing models for compute resources in the cloud can be complicated. Some (but not all) variations include the following:

  • On-demand instances
  • Reserved/prepaid capacity
  • Spot instances
  • Dedicated instances

On-demand pricing is pretty straightforward: for every hour that a compute resource runs you pay a certain hourly cost. Reserved pricing allows you to significantly reduce your hourly cost by committing, and prepaying, to run a compute resource for an agreed period of time. Spot pricing allows you to establish a maximum bid price for a compute resource and, if there is a resource available at or below that cost, you pay the current spot price. And dedicated instances cost the most, but give you dedicated hardware on which to run your application.

With all of these options, how do you structure your compute strategy to guarantee that you have the resources you need while minimizing your cloud bill? In this post I review reserved pricing for prepaid capacity and its implications for your cloud bill.

Prepaid capacity and reserved instance pricing

Probably the most compelling use case for the cloud is its support for elasticity: If you need 10 virtual machines to support your load during off-peak hours but 100 to support your load during peak hours, the cloud affords you the capability to scale up and down on the fly to match your user demand. This elasticity allows you to not only support extreme peak periods, such as a Black Friday sale, but also to minimize your cloud expense by scaling down to as few virtual machines as you need. But, if you realize that you are always going to need a certain core number of virtual machines, you can reserve those virtual machines and save a lot of money.

Amazon supports prepaid capacity through reserved instances; Microsoft has discontinued its prepaid plan (you now have to work through an enterprise agreement to optimize costs); and Google Compute Engine offers committed use contracts. Both Amazon and Google offer one- and three-year terms, with discounts increasing the longer you are willing to commit. And these discounts can be substantial.

So how does it work? You reserve an instance by purchasing that instance for a one- or three-year term, and then you pay for that instance for every hour in that term, whether or not it is running. For example, if you purchase a reserved instance for a one-year term, you pay for every hour in a year, or 8,760 hours.

Let’s go through a concrete example of reserving a t2.micro instance for a three-year term on AWS. At the time of this writing, a t2.micro instance running Linux costs $0.012/hour if you use it on demand. You can reserve it for three years by paying $124. If you were to run this machine 24x7 for three years, or 26,280 hours, that would yield an effective cost of about $0.005/hour ($124 ÷ 26,280 hours), roughly a 58 percent discount. You get the maximum savings if you plan on running the machine continually for the complete term. If you were to run this machine for one year and then shut it down, you would already have paid the $124 but run it for only 8,760 hours, for an effective cost of $0.014/hour ($124 ÷ 8,760 hours), which is more than the on-demand price. The following chart shows this graphically.

[Chart: reserved instance pricing. Credit: Steven Haines]
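
The arithmetic behind the t2.micro example can be reproduced in a few lines. This is a minimal sketch that uses only the prices quoted above ($0.012/hour on demand, $124 up front for three years); the function and variable names are illustrative and not part of any AWS tooling.

```python
HOURS_PER_YEAR = 8_760

ON_DEMAND_RATE = 0.012   # $/hour, t2.micro Linux price quoted in the article
UPFRONT_3_YEAR = 124.0   # $, three-year reservation price quoted in the article


def effective_hourly_cost(upfront, hours_used):
    """Spread a prepaid amount over the hours the instance actually ran."""
    return upfront / hours_used


def break_even_hours(upfront, on_demand_rate):
    """Hours of use after which prepaying beats paying on demand."""
    return upfront / on_demand_rate


three_year_rate = effective_hourly_cost(UPFRONT_3_YEAR, 3 * HOURS_PER_YEAR)
one_year_rate = effective_hourly_cost(UPFRONT_3_YEAR, HOURS_PER_YEAR)
discount = 1 - three_year_rate / ON_DEMAND_RATE
breakeven = break_even_hours(UPFRONT_3_YEAR, ON_DEMAND_RATE)

print(f"Run for 3 years: ${three_year_rate:.4f}/hour ({discount:.0%} below on demand)")
print(f"Run for 1 year:  ${one_year_rate:.4f}/hour")
print(f"Break-even after {breakeven:,.0f} hours "
      f"({breakeven / HOURS_PER_YEAR * 12:.1f} months of continuous use)")
```

With unrounded figures the three-year rate is closer to $0.0047/hour, or about 61 percent below on demand; the 58 percent figure corresponds to the rate rounded up to $0.005. The break-even point falls at a little over 14 months of continuous use, which is why shutting the machine down after one year costs more than on-demand pricing.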

Reserving the right amount of resources

All the math in the last section can be summarized as follows: You need to reserve the right amount of resources or you risk paying too much! If you reserve too few instances, you will pay the on-demand price when you do not need to; if you reserve too many instances, or instances that are not going to be fully used, you raise your effective cost and erode the discount. Choosing the right number of reserved instances is easier said than done, however.
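
To make the trade-off concrete, here is a rough sketch that prices a demand profile under different reservation counts. Everything in it is an assumption for illustration: the daily profile (10 machines for 16 off-peak hours, 100 for 8 peak hours) borrows the numbers from the elasticity example earlier, the rates are the rounded t2.micro figures, and the model ignores partial terms, instance types, and spot pricing.

```python
def total_cost(demand, reserved, reserved_rate, on_demand_rate):
    """Cost of covering hourly `demand` (instances needed each hour) with
    `reserved` prepaid instances plus on-demand instances for any overflow."""
    # Reserved instances are paid for every hour, used or not.
    reserved_cost = reserved * reserved_rate * len(demand)
    # Demand above the reserved count is bought at the on-demand rate.
    overflow_hours = sum(max(needed - reserved, 0) for needed in demand)
    return reserved_cost + overflow_hours * on_demand_rate


def best_reservation(demand, reserved_rate, on_demand_rate):
    """Brute-force the reservation count with the lowest total cost."""
    return min(
        range(max(demand) + 1),
        key=lambda n: total_cost(demand, n, reserved_rate, on_demand_rate),
    )


# Hypothetical day: 16 off-peak hours at 10 machines, 8 peak hours at 100.
demand = [10] * 16 + [100] * 8
RESERVED_RATE = 0.005    # $/hour, effective three-year rate (rounded)
ON_DEMAND_RATE = 0.012   # $/hour

for count in (0, 10, 100):
    cost = total_cost(demand, count, RESERVED_RATE, ON_DEMAND_RATE)
    print(f"{count:>3} reserved: ${cost:.2f} per day")

print("Cheapest:", best_reservation(demand, RESERVED_RATE, ON_DEMAND_RATE))
```

Under these assumptions, reserving nothing costs $11.52 per day, reserving the steady-state 10 machines costs $9.84, and reserving for the peak costs $12.00, which is more than running everything on demand. A reservation only pays off for capacity that is needed most of the time.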

The overall goal is to reserve instances that will run at a reasonable utilization, such as 75 to 90 percent CPU. If your machines constantly run at, for example, 40 percent CPU, then you have probably not optimized your workloads and, as a result, are running too many machines. Using the machines you are running effectively can help you reduce the overall number of machines, and hence your cost.
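
As a back-of-the-envelope check, the utilization argument can be turned into a machine count. This sketch assumes CPU is the only constraint and that load can be redistributed evenly across identical machines, which real workloads rarely satisfy exactly; the fleet size of 20 is a made-up example.

```python
import math


def machines_needed(current_machines, current_cpu_pct, target_cpu_pct):
    """Estimate how many identical machines carry the same total CPU load
    when each one runs at `target_cpu_pct` instead of `current_cpu_pct`."""
    total_load = current_machines * current_cpu_pct  # in machine-percent
    return math.ceil(total_load / target_cpu_pct)


# Hypothetical fleet: 20 machines idling along at 40 percent CPU.
consolidated = machines_needed(20, 40, 80)
print(f"20 machines at 40% CPU fit on {consolidated} machines at 80% CPU")
```

In this example the fleet halves from 20 machines to 10, so consolidating before reserving would halve the number of reservations to buy. Memory, network, and disk I/O would need the same check before acting on the result.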

To manage your cloud bill effectively, you need to assure the performance of your application while still optimizing your efficiency.

In summary:

  • Analyze your environment.
  • Analyze your workloads.
  • Optimize your virtual machine usage so that you are using the right amount of CPU, memory, network, and disk I/O.
  • Choose the right templates (AMI selection) for those workloads.
  • Reserve enough instances so that you have reservations for all virtual machines that constitute your steady state.

And if thinking through all this makes your head hurt, software can help so you can focus on more interesting pursuits.

This article is published as part of the IDG Contributor Network.