As I was turning up a test instance with a well-known cloud provider, I ran some tests of the underlying server. Essentially, I used Apache's ab benchmarking tool to measure the performance of Nginx on the host, so I was hitting the server from itself, requesting the same PNG file 3,000 times, with 20 concurrent connections and a one-second rest between test runs. The Nginx configuration was set to cache these files, so it was simply pulling the image out of RAM and shipping it, with no disk I/O involved.
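A minimal sketch of that test loop, assuming Nginx is serving a cached test.png on localhost; the URL, file name, and pass count are assumptions chosen to match the setup described above:

```shell
#!/bin/sh
# Six passes of ab: 3,000 requests to the same cached PNG, 20 concurrent
# connections, with a one-second rest between passes.
URL="http://localhost/test.png"   # assumed path to the cached test file

for pass in 1 2 3 4 5 6; do
  # Print only the throughput line from each pass, e.g.
  # "Requests per second:    6499.19 [#/sec] (mean)"
  ab -n 3000 -c 20 "$URL" 2>/dev/null | grep "Requests per second"
  sleep 1
done
```

Because both ab and Nginx run on the same host, the numbers exclude external network effects, which makes the run-to-run variance that much more telling.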
Over the course of six test passes, I witnessed a high of 7,749.1 requests per second and a low of 4,754.43 requests per second, with an average across all tests of 6,499.19 requests per second. That's a substantial spread, and it makes any attempt to forecast how the instance will scale little more than guesswork.
On the other hand, running the same tests against an in-house VM with the same number of vCPUs and RAM, I saw a high of 15,176.66 requests per second and a low of 14,507.47 requests per second, with an average of 14,829.52 requests per second. These are obviously much more consistent results. They're also well over twice as high as the results from the cloud instance.
The CPUs in use were different, which accounts for some of the disparity. (The in-house VM was using Intel Xeon E5-2670 CPUs at 2.6GHz with 20MB cache, while the cloud instance was running on AMD Opteron 4332 HE CPUs at 3.0GHz with 2MB cache.) But that's not the whole story: I should have seen equally consistent results on the cloud instance, just slower ones. Instead, I have a range of almost 50 percent of the average result, compared to a range of less than 5 percent of the average for the in-house VM.
Cloud servers are generally sold by the vCPU count, RAM, and bandwidth utilization, but clearly not all instances are created equal. Even if they were, the capacity of the underlying hardware can vary wildly depending on where the instance happens to land. The solution from a purely operational standpoint is to overbuild your cloud infrastructure to absorb these large performance disparities, but that has its own pitfalls, including extra cost.
Also, even an overbuilt deployment can suffer if the load balancer is trying to spread incoming load across several instances that perform at different levels despite being ostensibly identical in spec. If the load balancer directs traffic to the least loaded instance as measured by connection count, that instance may well be underperforming compared to other "identical" instances that are handling more connections but actually operating faster.
There is no good solution to this, other than maintaining vigilance and pressing your cloud provider to deliver what was promised. I'd recommend running scheduled performance tests on your instances to check their performance levels over time and using these results as ammunition in discussions with the provider.
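One way to run those scheduled tests is a small script driven by cron that logs the throughput figure from each run; the URL, script path, and log location here are all assumptions, and the parsing matches ab's "Requests per second" output line:

```shell
#!/bin/sh
# perf-check.sh: run the same ab benchmark on a schedule and append the
# requests-per-second figure to a log, building a record of performance
# over time to bring to discussions with the provider.
URL="http://localhost/test.png"   # assumed test target
LOG="./perf-check.log"            # assumed log path; use a persistent location in practice

# Pull the mean throughput out of ab's summary, e.g.
# "Requests per second:    6499.19 [#/sec] (mean)" -> field 4 is the number.
RPS=$(ab -n 3000 -c 20 "$URL" 2>/dev/null | awk '/Requests per second/ {print $4}')

# Log a timestamped entry; record "unavailable" if the test could not run.
echo "$(date -u +%FT%TZ) ${RPS:-unavailable}" >> "$LOG"

# Example crontab entry to run the check hourly:
#   0 * * * * /usr/local/bin/perf-check.sh
```

Even a simple log like this is enough to show whether an instance's throughput is stable or swinging by large margins from hour to hour.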
The reasons for using the cloud are many, but cloud servers are certainly not the hands-free panacea they might seem to be. You may reduce some responsibilities, but you will gain others.
This story, "Don't count on consistent server performance in the cloud," was originally published at InfoWorld.com.