The virtual virtualization case study: Planning

In stage 3, Fergenschmeir's IT discovers that server virtualization planning is no walk in the park

Stage 3: Planning around capacity

After testing the server virtualization software to understand whether and where it met their performance requirements, Fergenschmeir’s IT leaders then had to do the detailed deployment planning. Infrastructure manager Eric Brown and CTO Brad Richter had two basic questions to answer in the planning: first, what server roles did they want to have; second, what could they virtualize?

Brad started the process by asking his teams to provide him with a list of every server-based application and the servers that they were installed on. From this, Eric developed a dependency tree that showed which servers and applications depended upon each other.

Assessing server roles
As the dependency tree was fleshed out, it became clear to Eric that they wouldn’t want to retain the same application-to-server assignments they had been using. Out of the 60 or so servers in the datacenter, four of them were directly responsible for the continued operation of about 20 applications. This was mostly due to a few SQL database servers that had been used as dumping grounds for the databases of many different applications, sometimes forcing an application to use a newer or older version of SQL than it supported.

Furthermore, there were risky dependencies in place. For example, five important applications were installed on the same server. Conversely, Eric and Brad discovered significant inefficiencies, such as five servers all being used redundantly for departmental file sharing.
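The kind of dependency review described above can be sketched in a few lines of Python. This is only an illustration: the application and server names are hypothetical, not Fergenschmeir's real inventory, and the threshold for a "hot spot" is an arbitrary assumption.

```python
from collections import defaultdict

# Hypothetical inventory: application -> servers it depends on.
# Names are illustrative only, not taken from the case study.
app_to_servers = {
    "payroll": ["sql-01"],
    "crm": ["sql-01", "web-01"],
    "helpdesk": ["sql-01"],
    "intranet": ["web-01"],
    "sales-share": ["file-01"],
    "hr-share": ["file-02"],
}


def apps_per_server(inventory):
    """Invert the app -> servers map into server -> sorted list of apps."""
    result = defaultdict(list)
    for app, servers in inventory.items():
        for server in servers:
            result[server].append(app)
    return {server: sorted(apps) for server, apps in result.items()}


def hot_spots(inventory, threshold):
    """Return servers that at least `threshold` applications depend on."""
    return {
        server: apps
        for server, apps in apps_per_server(inventory).items()
        if len(apps) >= threshold
    }


print(hot_spots(app_to_servers, threshold=3))
```

Inverting the map this way surfaces servers whose failure would take down many applications at once. The same inventory, grouped by role instead of by server, would expose redundant file servers.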

Eric decided that the virtualized deployment needed to avoid these flaws, so the new architecture had to eliminate unnecessary redundancy while also distributing mission-critical apps across physical servers to minimize the risks of any server failures. That meant a jump from 60 servers to 72 and a commensurate increase in server licenses.

Determining virtualization candidates
With the architecture now determined, Eric had to figure out what could be deployed through virtualization and what should stay physical. Answering that question proved more difficult than he had expected.

One key question was the load on each server, a major determinant of how many physical virtualization hosts would be needed. It made no sense to virtualize an application load that was already making full use of its hardware platform. The initial testing showed that the VMware hypervisor ate up about 10 percent of a host server’s raw performance, so the real capacity of any virtualized host was 90 percent of its dedicated, unvirtualized counterpart. Any application whose utilization was above 90 percent would likely see performance degradation if virtualized and would offer no potential for server consolidation.
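The arithmetic behind that 90 percent threshold can be written out as a minimal sketch. The 10 percent overhead figure comes from the testing described above. The function names are hypothetical, and the assumption that the load moves to an otherwise identical host is a simplification.

```python
# Hypervisor overhead measured in the initial testing (about 10 percent).
HYPERVISOR_OVERHEAD = 0.10


def usable_capacity(overhead=HYPERVISOR_OVERHEAD):
    """Fraction of a host's raw performance left for guest workloads."""
    return 1.0 - overhead


def is_consolidation_candidate(avg_utilization, overhead=HYPERVISOR_OVERHEAD):
    """True if the load fits under the usable capacity of an identical host.

    Assumption: the workload moves to hardware equivalent to what it runs
    on today, so utilization figures are directly comparable.
    """
    return avg_utilization < usable_capacity(overhead)


def virtualized_utilization(avg_utilization, overhead=HYPERVISOR_OVERHEAD):
    """Share of the usable capacity the load would consume once virtualized."""
    return avg_utilization / usable_capacity(overhead)


for load in (0.04, 0.45, 0.95):
    print(load, is_consolidation_candidate(load),
          round(virtualized_utilization(load), 3))
```

A load that consumes 45 percent of a dedicated server would occupy about half of the same server's usable capacity once the hypervisor takes its share. A load at 95 percent no longer fits at all.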

But getting those utilization figures was not easy. Using Perfmon on a Windows box, or a tool like SAR on a Linux box, could easily show how busy a given server was within its own microcosm, but it wasn’t as easy to express how that microcosm related to another.

For example, Thanatos -- the server that ran the company’s medical reimbursement and benefit management software -- was a dual-socket, single-core Intel Pentium 4 running at 2.8GHz whose load averaged 4 percent. Meanwhile, Hermes, the voicemail system, ran on a dual-socket, dual-core AMD Opteron 275 system running at 2.2GHz with an average load of 12 percent. Not only were these two completely different processor architectures, but Hermes had twice as many processor cores as Thanatos. Making things even more complicated, the processor wasn’t the only resource that had to be considered; memory, disk, and network utilization were clearly just as important when planning a virtualized infrastructure.
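To make the comparison problem concrete, here is a deliberately crude normalization of the two servers above into a common unit of "core-GHz in use". This is a sketch, not how a capacity planning tool works: clock speed is not comparable across processor architectures, so the `arch_factor` weight is a hypothetical placeholder (left at 1.0), and memory, disk, and network are ignored entirely.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    sockets: int
    cores_per_socket: int
    clock_ghz: float
    avg_load: float            # average CPU load as a fraction, 0.0 to 1.0
    arch_factor: float = 1.0   # hypothetical per-architecture weight (assumption)

    def raw_capacity(self):
        """Total weighted core-GHz the machine offers."""
        return (self.sockets * self.cores_per_socket
                * self.clock_ghz * self.arch_factor)

    def demand(self):
        """Weighted core-GHz actually consumed at the average load."""
        return self.raw_capacity() * self.avg_load


# Hardware and load figures as given in the text.
thanatos = Server("Thanatos", sockets=2, cores_per_socket=1,
                  clock_ghz=2.8, avg_load=0.04)
hermes = Server("Hermes", sockets=2, cores_per_socket=2,
                clock_ghz=2.2, avg_load=0.12)

for server in (thanatos, hermes):
    print(f"{server.name}: {server.demand():.3f} of "
          f"{server.raw_capacity():.1f} core-GHz in use")
```

Under these assumptions Hermes consumes roughly 4.7 times the processor resources of Thanatos, not the three times that the raw load percentages suggest. A realistic weighting would replace `arch_factor` with benchmark-derived values for each processor.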

Eric quickly learned that this was why there were so many applications available for performing capacity evaluations. If he had only 10 or 20 servers to consider, it might be easier and less expensive to crack open Excel and analyze it himself. He could have virtualized the loads incrementally and seen what the real-world utilization was, but he knew the inherent budgetary uncertainty wouldn’t appeal to CEO Bob Tersitan and CFO Craig Windham.

So, after doing some research, Eric suggested to Brad that they bring in an outside consulting company to do the capacity planning. Eric asked a local VMware partner to perform the evaluation, only to be told that the process would take a month or two to complete. The consultants said it was impossible to provide a complete, accurate server utilization analysis without watching the servers for at least a month. Otherwise, the analysis would fail to reflect the load of processes that were not always active, such as end-of-week and end-of-month report runs.

That delay made good technical sense, but it did mean Eric and Brad couldn’t meet Bob’s deadline for the implementation proposal. Fortunately, Craig was pleased that they were trying to make the proposal as accurate as possible, and his support eventually made Bob comfortable with the delay.

The delay turned out to be good for Eric and Brad, as many other planning tasks were nowhere near complete, such as choosing the hardware and software on which they’d run the system. The analysis period would give them breathing room to work through what they didn’t yet know.

When the initial capacity planning analysis did arrive some time later, it showed that most of Fergenschmeir’s application servers were running at or below 10 percent equalized capacity, allowing for significant consolidation of the expected 72 server deployments. A sensible configuration would require eight or nine dual-socket, quad-core ESX hosts to comfortably host the existing applications, leave some room for growth, and support the failure of a single host with limited downtime.
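The general shape of such a sizing calculation can be sketched as follows. This is not the consultants' method, which the case study does not describe, and it does not attempt to reproduce the eight-or-nine-host figure. The function name, the 60 percent target utilization, and the example inputs are all assumptions chosen for illustration; only the 10 percent hypervisor overhead and the single spare host come from the text.

```python
import math


def hosts_needed(total_demand, host_capacity, overhead=0.10,
                 target_utilization=0.60, spare_hosts=1):
    """Estimate how many virtualization hosts a set of workloads needs.

    total_demand       -- summed workload demand, in any equalized unit
    host_capacity      -- raw capacity of one host, in the same unit
    overhead           -- fraction lost to the hypervisor (10% per the testing)
    target_utilization -- assumed ceiling that leaves room for growth
    spare_hosts        -- extra hosts so a single failure can be absorbed
    """
    usable_per_host = host_capacity * (1.0 - overhead) * target_utilization
    return math.ceil(total_demand / usable_per_host) + spare_hosts


# Hypothetical inputs: 40 units of demand, hosts rated at 20 units each.
print(hosts_needed(total_demand=40.0, host_capacity=20.0))
```

With these made-up inputs, each host offers 10.8 usable units, so four hosts carry the load and a fifth covers a failure. Lowering `target_utilization` buys more growth room at the cost of more hardware.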

The rest of the virtual virtualization case study
Introduction: The Fergenschmeir case study
Stage 1: Determining a rationale
Stage 2: Doing a reality check
Stage 4: Selecting the platforms
Stage 5: Deploying the virtualized servers
Stage 6: Learning from the experience