Credit Suisse plans virtualization on a massive scale

A process of elimination cut half of the financial giant’s servers from consideration – but that still left 10,000 to virtualize

With 20,000 servers to manage, financial services powerhouse Credit Suisse had a long list of reasons to consider server virtualization: reducing the number of physical servers to manage, cutting power needs, improving software provisioning time, and deferring expensive datacenter buildouts. But it also needed a clear set of guidelines to determine when to virtualize, plus a clear set of procedures for managing a virtualization initiative.

Credit Suisse began by eliminating servers as candidates for virtualization. For example, the company had already created server efficiencies by sharing instances of Web servers on one box and sharing databases on another box -- both time-honored, proven techniques, notes Stephen Hilton, managing director for enterprise server and storage. "Putting a hypervisor there doesn't necessarily help you," he says, because Credit Suisse had already raised utilization rates and reduced hardware needs for those applications.

Other sets of servers just didn't make sense for virtualization, Hilton says, including I/O-intensive servers, servers with specialized add-on hardware, and servers whose transactional applications had very tight processing windows, where the overhead of virtualization added milliseconds that would cause timing problems.

That still left a large pool of virtualization candidates -- about 10,000 servers, in fact, most running either Windows or Solaris. In general, their utilization was low, particularly for those used in development and test environments, where the boxes tended to have more horsepower than needed. That low utilization is apparent in Hilton's expectations of how many VMs he will get per physical server: at least a 20:1 ratio for servers in the development environment; 15:1 to 10:1 for the test and disaster recovery environments; and 5:1 in the production application environment. Hilton's team is now in the process of virtualizing these servers, with plans to be done with 5,000 by early 2009. The group has already virtualized 1,000.
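
To make those ratios concrete, here is a rough sketch of the host-count arithmetic. The ratios come from Hilton's expectations above (using the conservative 10:1 end of the test and disaster recovery range); the split of servers across environments is purely hypothetical, since the article does not break the numbers down, and the function name is invented for illustration.

```python
import math

# VMs per physical host, as quoted by Hilton. The test/DR figure uses the
# conservative end (10:1) of the 15:1 to 10:1 range.
RATIOS = {"development": 20, "test_dr": 10, "production": 5}


def hosts_needed(vm_count: int, vms_per_host: int) -> int:
    """Physical hosts required to hold vm_count VMs at a given density."""
    return math.ceil(vm_count / vms_per_host)


# Hypothetical mix of 1,000 virtualized servers (NOT from the article).
example_mix = {"development": 400, "test_dr": 350, "production": 250}

hosts = {env: hosts_needed(n, RATIOS[env]) for env, n in example_mix.items()}
total_hosts = sum(hosts.values())

for env, count in hosts.items():
    print(f"{env}: {example_mix[env]} VMs on {count} hosts")
print(f"total physical hosts: {total_hosts}")
```

Under that assumed mix, production dominates the host count despite being the smallest group, which is why the lower production ratio matters most for capacity planning.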

In crafting Credit Suisse's server virtualization strategy, Hilton decided to take a page from the storage virtualization playbook and conceive of the environment as a shared service, not just as a collection of VMs. "We created a complete hosting platform from which we 'sell' slices of capacity," he says. That meant treating physical servers as parts of a bigger resource pool from which capacity could be pulled as needed.

Accomplishing that meant the storage associated with the servers also had to be malleable -- a perfect fit for the datacenter's SAN. But a SAN alone didn't go the whole distance, Hilton notes. He brought in thin provisioning, which let him logically account for physical storage capacity he didn't have yet. When additional storage was needed, it could be added without changing the logical storage pool, which would otherwise be a nontrivial effort. "When you have hundreds of VMs, it would be very painful to the cluster management without thin provisioning," he says.
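
The idea behind thin provisioning can be sketched as a toy model: the pool promises more logical capacity than it physically holds, and physical disk is added later without touching the logical allocations. The class and method names below are hypothetical and do not correspond to any SAN vendor's API or to Credit Suisse's actual tooling.

```python
class ThinPool:
    """Toy model of a thin-provisioned storage pool (illustrative only)."""

    def __init__(self, physical_gb: int) -> None:
        self.physical_gb = physical_gb  # disk actually installed
        self.logical_gb = 0             # capacity promised to volumes
        self.used_gb = 0                # data actually written

    def allocate(self, size_gb: int) -> None:
        # Promising capacity consumes no physical space up front.
        self.logical_gb += size_gb

    def write(self, size_gb: int) -> None:
        # Physical space is consumed only when data is written.
        if self.used_gb + size_gb > self.physical_gb:
            raise RuntimeError("physical capacity exhausted; add disk")
        self.used_gb += size_gb

    def add_physical(self, size_gb: int) -> None:
        # Growing the backing store leaves logical allocations unchanged.
        self.physical_gb += size_gb

    def overcommit_ratio(self) -> float:
        return self.logical_gb / self.physical_gb


pool = ThinPool(physical_gb=1000)
for _ in range(100):          # 100 VM volumes of 50 GB each
    pool.allocate(50)
pool.write(900)
pool.add_physical(500)        # no change to the 5,000 GB logical pool
print(pool.logical_gb, pool.physical_gb, pool.used_gb)
```

The point of the model is the last step: capacity grows underneath the VMs while their view of the storage pool stays the same.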

Deploying virtual servers on a SAN brought its own complications. "We learned that the I/O characteristics of a SAN that has a lot of VMs running are very difficult at startup," Hilton notes. When a physical server starts, a dozen or more virtual servers all start as well, consuming I/O not only within the server blade but also on the SAN. "When you have 50 to 60 VMs on eight [physical] servers, they're chatty," he says. That situation required a change in how the physical servers connected to the SAN and in the startup timing of VMs on a host to better distribute those I/O demands.
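
One simple way to spread out startup I/O is to boot VMs in small batches with a delay between batches. The sketch below illustrates that general technique only; the article does not say how Credit Suisse adjusted its startup timing, and the function name, batch size, and delay are assumptions.

```python
def staggered_start_offsets(vm_names, batch_size, delay_seconds):
    """Map each VM to a start offset (seconds after host boot).

    VMs are started in batches of batch_size, with delay_seconds
    between consecutive batches, so they do not all hit the SAN at once.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return {
        name: (index // batch_size) * delay_seconds
        for index, name in enumerate(vm_names)
    }


# Hypothetical host carrying a dozen VMs, started four at a time.
vms = [f"vm{n:02d}" for n in range(1, 13)]
schedule = staggered_start_offsets(vms, batch_size=4, delay_seconds=30)

for name, offset in schedule.items():
    print(f"{name}: start at +{offset}s")
```

The trade-off is a longer total recovery time for the host in exchange for a flatter I/O peak on the shared storage.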

The SAN also needed to be resized and rebalanced for the steeper failover I/O in a virtual-server environment. The issue, Hilton says, is that a dozen server images take a lot of storage and I/O all at once if their physical server fails and they need to be moved. Essentially, there's a price to be paid in the SAN infrastructure to support the higher density of multiple VMs on a physical server.

The other deployment strategy Hilton needed to work out involved the ongoing provisioning and maintenance of the virtual servers: "It's so easy to provision a VM, so how do you ensure that an enterprise doesn't create them willy-nilly?" Credit Suisse's solution was to automate the provisioning. All requests go to a central tool that tracks the requests, available resources, and so forth, so IT can track usage easily and be alerted when potentially abusive requests occur, such as multiple VMs requested by the same person or department in a short period. Essentially, the VMs and the resources they reside on are treated as inventory to be managed.
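
A central tracker of the kind described might flag a requester who asks for too many VMs inside a short window. The following is a minimal sketch under assumed thresholds (more than three requests in a day); the class name, threshold, and window are hypothetical and are not details of Credit Suisse's tool.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class ProvisioningTracker:
    """Records VM requests and flags bursts from one requester."""

    def __init__(self, max_requests=3, window=timedelta(days=1)):
        self.max_requests = max_requests  # assumed threshold
        self.window = window              # assumed look-back period
        self._recent = defaultdict(deque)

    def request(self, requester: str, when: datetime) -> bool:
        """Record a request; return True if it should be reviewed."""
        recent = self._recent[requester]
        # Drop requests that have aged out of the look-back window.
        while recent and when - recent[0] > self.window:
            recent.popleft()
        recent.append(when)
        return len(recent) > self.max_requests


tracker = ProvisioningTracker()
start = datetime(2008, 6, 1, 9, 0)
for i in range(4):
    flagged = tracker.request("dev-team-a", start + timedelta(minutes=10 * i))
    print(f"request {i + 1}: flagged={flagged}")
```

Flagging rather than rejecting keeps provisioning fast for legitimate bursts while still giving IT the alert the article describes.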

The automated system also let Hilton introduce the concept of leasing VMs to developers. "They get it for 30 days, then it disappears. Otherwise, it would live forever, as it did with physical servers. You can't be a box-hugger any more," he says.
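
The leasing idea reduces to recording a start date per VM and reclaiming anything past its term. Here is a minimal sketch assuming a fixed 30-day lease with no renewal; the article does not say whether leases can be extended, and the type and function names are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

LEASE_DAYS = 30  # lease term quoted by Hilton


@dataclass
class VmLease:
    name: str
    owner: str
    start: date

    def expires_on(self) -> date:
        return self.start + timedelta(days=LEASE_DAYS)

    def is_expired(self, today: date) -> bool:
        # Treated as expired on the expiry date itself (an assumption).
        return today >= self.expires_on()


def expired_leases(leases, today):
    """Return the leases whose VMs are due to be reclaimed."""
    return [lease for lease in leases if lease.is_expired(today)]


# Hypothetical inventory.
leases = [
    VmLease("dev-vm-01", "alice", date(2008, 1, 1)),
    VmLease("dev-vm-02", "bob", date(2008, 1, 20)),
]
for lease in expired_leases(leases, today=date(2008, 2, 1)):
    print(f"reclaim {lease.name} (owner {lease.owner})")
```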
