VMware vSphere raises the bar -- again

Storage admins can enjoy unprecedented control over hard-to-manage consolidated I/O with VMware vSphere 4.1

Server virtualization is the best thing to happen to the data center in the past 10 years. Lower hardware costs, less power and cooling, greater reliability, increased administration efficiency -- what's not to like? Now, with the latest release of its popular vSphere hypervisor, VMware has raised the bar again by providing greater visibility and control over storage performance.

Despite all its benefits, virtualization has also introduced new challenges that have proven especially difficult to deal with in heavily consolidated environments. Take managing storage performance in virtualized infrastructures -- most major hypervisors do a great job of allowing administrators to manage the amount of processor bandwidth, memory, and storage capacity that can be consumed by individual virtual machines. However, storage I/O -- especially transactional I/O -- has been nearly impossible to manage effectively.

Unlike memory and CPU, storage performance isn't a clearly defined resource. The virtualization hypervisor knows exactly how much CPU bandwidth and memory is available and can easily divvy that amount among VMs clamoring for those resources. Disk resources, however, might be shared not only by multiple virtualization hosts, but also by many other resource hogs -- both virtual and physical. Transient events such as backups, month-end report cycles, and RAID degradation due to disk failure muddy the waters even further. All of these factors combine to make managing storage performance a tougher nut to crack.

Because there hasn't been any good way to apportion available storage performance among virtual machines, all virtual machines on a cluster of virtualization hosts have historically been left to fight among themselves over available storage resources. That fact alone has discouraged admins from mixing low-priority workloads that make heavy use of storage, such as data mining applications, with high-priority I/O users such as database and mail platforms. There's too much opportunity for the low-priority workloads to drown out the mission-critical workloads at times when the underlying storage resources become congested.

No longer -- among many other major improvements, VMware vSphere 4.1 can detect when storage resources are becoming congested and apportion available supply among virtual machines based on set priorities.

All of this is implemented through VMware's new SIOC (Storage I/O Control) feature set built into vSphere 4.1. The most important aspect of SIOC is that the hypervisor can now gauge the available storage performance resources through the use of an administrator-set congestion threshold. Essentially, if the hypervisor detects high storage latency for a few seconds, it starts to throttle back the I/O the virtual machines are allowed to produce. Though high-load performance characteristics vary from one storage platform to another, once a certain latency threshold (usually 25ms to 35ms) has been crossed, overall array performance starts to decrease rapidly in most SAN environments. Restricting I/O before latency climbs past that point keeps storage resources from becoming saturated -- which in itself is a great capability.
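The threshold-and-throttle idea can be illustrated with a few lines of Python. This is only a rough sketch of the general technique, not VMware's actual algorithm: the function name, the queue-depth limits, and the 30ms default (picked from the middle of the range cited above) are all assumptions made for illustration.

```python
def adjust_queue_depth(current_depth, observed_latency_ms,
                       threshold_ms=30.0, min_depth=4, max_depth=64):
    """Return a new I/O queue depth for a host (hypothetical controller).

    All names and default values here are illustrative assumptions,
    not settings taken from vSphere.
    """
    if observed_latency_ms > threshold_ms:
        # Congested: shrink the queue in proportion to how far
        # latency has overshot the threshold.
        scaled = int(current_depth * threshold_ms / observed_latency_ms)
        return max(min_depth, scaled)
    # Not congested: let the host slowly reclaim queue slots.
    return min(max_depth, current_depth + 1)


if __name__ == "__main__":
    print(adjust_queue_depth(32, 60.0))  # latency at twice the threshold
    print(adjust_queue_depth(32, 10.0))  # latency well under the threshold
```

Cutting back sharply under congestion and recovering one slot at a time is a deliberately conservative choice in this sketch; a real controller would also smooth latency over several seconds before reacting.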

Across-the-board slowdowns, however, have obvious drawbacks, especially if you can't control which of the potentially thousands of virtual machines are allowed to use what resources. SIOC offers a solution by providing the ability to set I/O performance priorities down to the level of individual virtual disks. This means that when storage congestion occurs, lower-priority virtual machines can be given a smaller piece of available performance than higher-priority machines -- allowing mission-critical services to continue relatively unaffected while background processes feel the pinch. Coupled with SIOC is a new set of detailed storage performance graphs that make it easy to pinpoint high storage users and observe the effects of set storage rules.
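To make the priority idea concrete, here is a minimal Python sketch of proportional sharing: each virtual disk carries a priority weight, and during congestion the available I/O is split in proportion to those weights. The function name, the disk names, the weight values, and the IOPS figure are hypothetical, and this is not presented as the way SIOC itself computes allocations.

```python
def apportion_iops(available_iops, shares):
    """Split available IOPS among virtual disks in proportion to their shares.

    `shares` maps a (hypothetical) virtual disk name to its priority weight.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one disk must have a positive share")
    return {disk: available_iops * weight / total
            for disk, weight in shares.items()}


if __name__ == "__main__":
    # Illustrative weights only: a database disk, a mail disk,
    # and a low-priority data mining disk.
    allocation = apportion_iops(7000, {"db": 2000, "mail": 1000, "mining": 500})
    for disk, iops in allocation.items():
        print(disk, iops)
```

With these example weights the database disk receives four times the I/O of the data mining disk while the datastore is congested; when there is no congestion, no such limit would need to apply.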

From a storage administrator's perspective, the value of this kind of control is difficult to overstate. Being able to keep your virtualization environment from saturating your SAN while also controlling I/O on a per-machine or even per-disk basis is a capability that's largely unparalleled even in the physical world. In short, VMware just transformed virtualization from being one of the most difficult storage performance consumers to manage to one of the easiest -- no small trick.

