Deep dive into VMware's virtual infrastructure
VI3 swims through our server consolidation test, demonstrating some amazing capabilities and a few quirks
The selling points of x86 server virtualization are by now common knowledge. By moving systems off dedicated, underutilized servers, and using virtual machines to consolidate them on fewer boxes, you can reduce power, cooling, and space requirements, and you can save a bundle in hardware costs. After the bean counting, VMs can help ease provisioning, load balancing, and disaster recovery.
Less understood is the path to achieving these gains. Once you’ve caught the consolidation bug, what’s really involved in making the move, both in terms of technical requirements and physical labor? And what kind of control do you have over the new environment? To find out, we brought the heavyweight champ of virtualization platforms, VMware Infrastructure 3 (VI3), in for a deep look, subjecting the software and a supporting team of VMware engineers to one of our real-world, Fergenschmeir test scenarios.
[There’s much more to VI3 than we could fit into the space of this particular article. See also our take on important and even invaluable VI3 capabilities such as VM snapshotting and cloning, VMware’s consolidation planning tool, and how the virtualization wave is driving innovation above and below the hypervisor.]
In the end, VI3 and the VMware team passed our test with flying colors, successfully migrating a number of Windows and Linux systems and impressing us with a wealth of useful tools and automated management capabilities. We also discovered some curious limitations in VI3, however, that made our path to a virtual infrastructure a little less straightforward than it otherwise might have been.
Taking the Plunge
Our test began on a bright October morning. The first order of business was to pick a free blade in our Dell PowerEdge 1955 blade server chassis, install Windows Server 2003, join that server to the domain, and install VMware VirtualCenter Server. This installation was straightforward, with all required packages present on the install CD. Although there weren’t any Infrastructure 3 servers to manage yet, the groundwork was laid. Next, the first VI3 server was built on a second blade in the Dell cabinet.
Like its predecessor, VI3 is built on a Linux base, leveraging the stability and light footprint of a highly customized Red Hat operating system to provide foundation elements, but relying on a VMware kernel and VMware I/O drivers and schedulers to squeeze the most out of the hardware. The Linux folks will immediately notice that the installer is unabashedly built on Red Hat’s Anaconda, and installation is generally as easy as booting the CD and clicking Next a few times, making sure along the way that the required I/O devices are discovered and configured. In the case of our Dell server, I/O was limited to one gigabit front-end NIC and one gigabit back-end NIC for iSCSI SAN interaction. Within a few minutes, the first VI3 server was booting, and the gathered geeks toasted the achievement with a brief swig of Red Bull.
VM Control Center
When the first VI3 server was up and running, we installed the VirtualCenter client on a Windows XP workstation. Unlike the management tools of previous VMware platforms, such as GSX Server, the VI3 management tool base is Windows-only, built on .Net and requiring the most recent .Net release from Microsoft. Luckily, the installer detects the currently installed version and prompts the user to download and install the latest release from Microsoft. After that task was completed, it was the work of a few seconds to add the VI3 server to the management console.
VI3 server management in VirtualCenter is based on the familiar hierarchical view of many Microsoft-based tools, and it provides a reasonable range of sorting and organizing options, including multiple views of the available host servers, virtual servers, and clusters and groups. In the case of our Fergenschmeir Ltd. test scenario, implementing a VMware cluster was the way to go, since the advanced features of VI3, such as HA (High Availability) and DRS (Distributed Resource Scheduler), require a clustered environment. Luckily, this is as easy as right-clicking on the datacenter name defined during installation and adding a new, empty cluster. After that's done, new VI3 hosts are simply added to the cluster; no other configuration is necessary.
In order to use services such as HA, DRS, and VMotion, every VI3 server needs shared storage in one form or another. Fibre Channel and iSCSI SANs are supported, as is standard NFS, although NFS comes with a performance hit. We brought the cluster together by carving a 600GB LUN from the available storage pool on the EqualLogic SAN and masking it to permit access from the dedicated iSCSI NIC on the VI3 server.
Fergenschmeir had implemented a dedicated network segment for iSCSI traffic, both to reduce bandwidth consumption on front-end segments and to enable jumbo frames. This iSCSI segment wasn’t routed, however, which presented some problems here. When running with iSCSI, VI3 annoyingly requires that the primary interface of the physical server, as well as any dedicated iSCSI NICs, have access to the iSCSI target in order to handle auto discovery. The network admin added the necessary route in the core switch, and after we configured the VI3 host with the proper iSCSI target address, that 600GB LUN was suddenly visible. Notwithstanding the extra step during setup, the iSCSI support in VI3 is handled nicely, and it definitely performs well.
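In essence, the requirement boils down to a simple reachability test: both the host's primary service console interface and its dedicated iSCSI NIC must be able to reach the iSCSI target before discovery will succeed. Here is a minimal sketch of that sanity check in Python, with hypothetical addresses; it is not a VMware tool, just an illustration of the rule.

    # Sanity check for the iSCSI reachability requirement described above:
    # both the primary (service console) interface and the dedicated iSCSI
    # NIC need a route to the iSCSI target. All addresses are hypothetical.
    import socket

    ISCSI_TARGET = ("10.10.20.5", 3260)   # EqualLogic group address, standard iSCSI port

    def can_reach(target, source_ip, timeout=3):
        """Try to open a TCP connection to the target from a given local address."""
        try:
            with socket.create_connection(target, timeout=timeout,
                                          source_address=(source_ip, 0)):
                return True
        except OSError:
            return False

    # The two interfaces on the VI3 host (again, hypothetical addresses).
    for name, ip in (("service console", "192.168.1.21"), ("iSCSI NIC", "10.10.20.21")):
        status = "ok" if can_reach(ISCSI_TARGET, ip) else "no route; discovery will fail"
        print(f"{name} ({ip}): {status}")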
Migratory Patterns
With the first VI3 server built and ready for action, we could begin the first of our five P2V (physical-to-virtual) migrations. To handle these migrations, we used VMware’s P2V Assistant and a beta release of VMware’s new Converter product. In terms of architecture, these two tools couldn’t be more different.
P2V Assistant is based on an old Knoppix Live CD, which is booted on the source server. When (and if) all storage and network devices are discovered and configured, you can use a text-based menu system from the source server console to migrate the server to a VM on a specific destination server. Interestingly, given its Linux roots, P2V Assistant has an easier time with Windows servers than Linux servers, and with newer server hardware than older gear. P2V Assistant had trouble detecting the RAID and NIC hardware in our Dell PowerEdge 2950 and 850 servers, but flawlessly inventoried an HP ProLiant DL360 G3. The migration from physical to virtual took only about 10 minutes and correctly resized the destination disk, which was necessary because the source server had more than 60GB of unused space in the primary partition. As soon as the migration was finished, we booted the new VM and powered off the old server. Other than the downtime caused by booting the domain controller from the CD, we encountered no problems.
For the next migration, we put the new VMware Converter tool to the test. This tool offers both live and offline migration options. The live version runs as a stand-alone application on a Windows server. For this test, the Microsoft Exchange Server 2003 system was selected for migration. Converter has a simple interface that allows admins to supply a Windows server name or IP address and login credentials, modify settings for the destination VM, and optionally resize partitions. After that, it’s simply a matter of clicking a button to convert the server. During the course of the conversion, we added several users and Exchange mailboxes to the domain to see how complete a live conversion could be under normal operations. In production, you wouldn’t want to migrate any servers running databases, e-mail, or other data-centric tasks in this way, but it was a good opportunity to test the thoroughness of the tool. Of the three users added to the Exchange server, the first two, which were added during the first half of the migration, were present in the resulting VM. The last user, added when the migration was 85 percent complete, was present in Active Directory but missing a mailbox. In addition, the Exchange services failed to start when the Exchange Server VM was initially booted, but they did start manually, and the server appeared to suffer no ill effects from the migration. This form of P2V is certainly attractive because it requires no real downtime, but it should be used only on largely quiescent servers or on servers with static tasks, such as Web servers and file servers with external storage.
We also tried Converter’s offline mode. As with P2V Assistant, this requires booting the server from a CD into a limited OS. Unlike P2V Assistant, however, Converter’s CD is built on Windows PE, a curious decision. Converter required far more time to boot than the Linux-based P2V Assistant, and it didn’t offer any better hardware detection.
The next migration was the Fergenschmeir file server, a white-box server built from spare parts. Because the CD-based methods weren’t likely to recognize the mishmash of hardware, this server was migrated live with VMware Converter. On top of the live migration, we put the server under heavy load during the conversion, pushing nearly 100MB per second to simulated users. There was even a third wrinkle to this plan: the bulk of the files being served resided on an iSCSI LUN.
This scenario presented two obvious options for conversion: migrate the data on the iSCSI LUN into a VMware disk image, or simply map the LUN straight through to the resulting VM. Both options would retain the storage on the iSCSI SAN, but the second would permit the LUN to be bound by other servers. It turned out that VI3 also offers a middle route, which is to wrap the existing iSCSI LUN in a virtual disk layer. To configure this, the LUN must be visible from all VI3 hosts, and the new disk built for the VM must be added with Virtual Disk Mode selected, which adds the virtualization layer.
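To keep the trade-offs straight, here is a quick summary of the three approaches as we understood them, expressed as a small Python data structure rather than any VMware terminology; the labels are ours.

    # The three ways we could have presented the existing iSCSI LUN to the
    # new file server VM, summarized from the discussion above. Purely
    # descriptive; not an API.
    lun_options = {
        "copy the data into a VMware disk image": {
            "storage stays on the iSCSI SAN": True,
            "data must be copied during migration": True,
        },
        "map the raw LUN straight through to the VM": {
            "storage stays on the iSCSI SAN": True,
            "data must be copied during migration": False,
            "LUN remains available to other servers": True,
        },
        "wrap the LUN in a virtual disk layer (our choice)": {
            "storage stays on the iSCSI SAN": True,
            "data must be copied during migration": False,
            "LUN must be visible from all VI3 hosts": True,
            "disk is added to the VM in Virtual Disk Mode": True,
        },
    }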
We chose this third option, and it worked well. In fact, the server’s performance held up well during the conversion, and the reboot/power-down sequence required to complete the migration caused only a few seconds of downtime. The new VM came up normally, and because the iSCSI LUN set aside for file storage on the original physical server was already mapped to the VM, it immediately began serving files as if nothing had happened. The performance of the file server did dip in the new environment, as you would expect, averaging 85MB per second through a gigabit uplink shared with other VMs. But, all things considered, not too shabby.
Having successfully migrated all the Windows servers within an hour, we turned to the two Linux servers, a Dell PowerEdge 2950 running MySQL and a Dell PowerEdge 850 running a Web application linking to that database. Try as we might, none of the VMware tools would properly detect the hardware in these two servers, and the live transfer option was out, because Converter doesn’t support Linux.
Officially, VMware doesn’t support P2V migrations of Linux servers at all, which is a significant black eye, not only considering that VI3 is built on a Linux base, but also in light of VMware’s history of extensive Linux support. Fortunately, it’s far easier to migrate Linux servers manually than Windows servers, so we built two new VMs running identical configurations of 64-bit Red Hat Enterprise Linux 4, then copied over the databases and Web applications, all of which was the work of about 90 minutes. VMware does plan to officially support Linux P2V migrations in the near future.
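For the record, the manual route amounted to little more than standing up the new VMs and copying data across. A rough sketch of the copy step follows, with hypothetical host names and paths; in our test the equivalent commands were simply run by hand.

    # Rough sketch of the manual Linux migration: pull the MySQL data and the
    # Web application over to the freshly built RHEL 4 VMs. Host names and
    # paths are hypothetical.
    import subprocess

    # On the new database VM: dump the databases from the old PowerEdge 2950
    # and load them locally.
    subprocess.run(
        "ssh old-mysql.fergenschmeir.test mysqldump --all-databases | mysql",
        shell=True, check=True,
    )

    # On the new Web VM: copy the Web application tree from the old PowerEdge 850.
    subprocess.run(
        ["rsync", "-a", "old-web.fergenschmeir.test:/var/www/", "/var/www/"],
        check=True,
    )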
VMs in Motion
After these base-layer tasks were out of the way, it was time to show off. We built two more VI3 servers on two more blades and added them to the cluster. These new servers were identical to the first server, right down to the licensing configuration. Because VI3 builds are based on Anaconda, most builds can be completely automated, and the automation functions are nearly identical to Red Hat’s Kickstart server provisioning tools.
As soon as the two new servers were brought online, we configured the cluster for DRS, VMware’s resource management framework. DRS manages server load by dynamically distributing VMs across multiple servers to take advantage of all the resources available in the cluster. Although enabling DRS is as simple as checking a box, there’s really more to it than that. Every server in the cluster must be configured identically, and all network interfaces and virtual switches must share the same names. Further, every VM must be built on shared storage — in this case, the iSCSI LUN — and VMotion must be enabled on every server.
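Those prerequisites lend themselves to a quick audit before you flip the DRS switch. The following is a minimal sketch of such a check against a simple inventory of the cluster; the host names, network labels, and datastore names are hypothetical, and this is our own illustration rather than a VMware utility.

    # Pre-flight check for the DRS prerequisites described above: identical
    # network and virtual switch names on every host, every VM on shared
    # storage, and VMotion enabled everywhere. Inventory is hypothetical.
    hosts = {
        "vi3-blade-01": {"networks": {"VM Network", "iSCSI"}, "vmotion": True},
        "vi3-blade-02": {"networks": {"VM Network", "iSCSI"}, "vmotion": True},
        "vi3-blade-03": {"networks": {"VM Network", "iSCSI"}, "vmotion": False},
    }
    vms = {
        "exchange01": "iscsi-lun-600gb",      # datastore each VM lives on
        "fileserver01": "iscsi-lun-600gb",
        "webapp01": "local-disk-blade02",     # this one would break DRS/VMotion
    }
    SHARED_DATASTORES = {"iscsi-lun-600gb"}

    problems = []
    reference_networks = next(iter(hosts.values()))["networks"]
    for name, host in hosts.items():
        if host["networks"] != reference_networks:
            problems.append(f"{name}: network/vSwitch names differ from the rest of the cluster")
        if not host["vmotion"]:
            problems.append(f"{name}: VMotion is not enabled")
    for vm, datastore in vms.items():
        if datastore not in SHARED_DATASTORES:
            problems.append(f"{vm}: not on shared storage ({datastore})")

    print("\n".join(problems) or "Cluster meets the DRS prerequisites")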
VMotion is the magic behind VMware’s live server migrations, enabling a VM to be moved from one VI3 host to another without missing a beat. It works by migrating control of the VM to another VI3 server in the cluster. This migration is achieved by remapping the storage pointer to the new host, moving the memory footprint of the running VM to the new host, sending out RARP (Reverse ARP) packets to inform switches that the MAC addresses assigned to the VM have moved, and then actually switching to the new host. The transfer generally happens within a minute, and it happens seamlessly; the VM doesn’t know the difference.
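Reduced to its essentials, the sequence looks something like the following conceptual sketch. It is illustrative only, with stand-in objects and made-up names; it is emphatically not VMware's implementation.

    # Conceptual outline of the VMotion sequence described above. Stand-in
    # objects only, not VMware's code.
    class VM:
        def __init__(self, name, mac, memory_pages):
            self.name, self.mac, self.memory_pages = name, mac, memory_pages

    def vmotion(vm, source, destination, storage_map, switch_table):
        # 1. Remap the storage pointer: the destination host now owns the
        #    VM's disks on the shared iSCSI LUN.
        storage_map[vm.name] = destination

        # 2. Move the memory footprint of the running VM to the destination
        #    host (the real thing copies iteratively while the VM keeps running).
        migrated_memory = dict(vm.memory_pages)

        # 3. Send RARP packets so the switches learn that the VM's MAC address
        #    now sits behind the destination host's uplinks.
        switch_table[vm.mac] = destination

        # 4. Switch execution to the new host; the VM never notices.
        print(f"{vm.name} now running on {destination} "
              f"({len(migrated_memory)} memory pages transferred)")

    # Toy usage: migrate a VM between two hypothetical blades.
    storage_map, switch_table = {}, {}
    vm = VM("fileserver01", "00:50:56:aa:bb:cc", {0: b"...", 1: b"..."})
    vmotion(vm, "vi3-blade-01", "vi3-blade-02", storage_map, switch_table)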