EMC VMware's ESX 3.0 was released a bit more than three years ago. While ESX 2.5 was a solid virtualization platform, ESX 3.0 seemed to push server virtualization into the realm where a lot of small and large businesses alike could really sink their teeth into it. The new high-availability features in ESX 3.0 were a huge draw to many businesses seeking better uptime, and the refined centralized management offered by VirtualCenter 2.0 was compelling. Support for a wider set of hardware such as iSCSI SANs also allowed high-end functionality at a lower price.
Now that we're three years down the road, many of these initial adopters of ESX 3.0 are starting to replace their hosts with new ones and preparing to upgrade to vSphere 4.0. That leaves a lot of server admins staring at a stack of three-year-old virtualization hosts that aren't yet finished doing their jobs. Sure, they may no longer be fast enough to keep pace with growing production loads or leave you the performance headroom you'd like, but it's always a painful decision to turn off a bunch of expensive servers and not do anything with them.
Instead of tossing their old hosts in a Dumpster, many enterprises are opting to reuse them. Some turn them into development clusters to separate dev loads from production loads. Some make them available for testing and training. My favorite use is as the seed hardware for a warm site. Even if the old hardware can't run all of your production workloads at full capacity, having some production capability immediately available when the primary site fails is better than none -- and it bridges the gap between the time of the disaster and the time that you can get replacement hardware on site.
Assuming that business continuity is important to your organization and you have multiple offices or a sufficiently large campus, building a warm site is a great use of your hardware. It certainly isn't free and there are a number of common pitfalls that you'll want to steer clear of, but it's definitely a worthy endeavor if downtime costs you money.
Step 1: Define the service level
First, you need to define the level of service you want to provide with your warm site. Do you want to protect all of your machines or just a subset? How quickly do you want to be able to recover -- your recovery time objective (RTO)? How old can your data be when you do recover -- your recovery point objective (RPO)? Your answers to these questions may change as you work through the design process and start attaching price tags to varying levels of service, but you should never let what you can afford directly drive what you provide.
It may be that, to be useful, a warm site would cost more than you can currently afford to spend on it. In that case it's better to save your pennies and do it correctly than to implement something that won't accomplish your organization's goals.
Step 2: Assess your SAN situation for replication options
The SAN is the first piece of hardware to evaluate, as it tends to be the most expensive. If possible, asynchronous SAN-to-SAN replication is the best way to implement a warm site -- but depending on the SAN platform in use, such replication may be prohibitively expensive or simply unavailable.
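The trade-off behind asynchronous replication can be sketched in a few lines. This is a hypothetical in-memory model, not any real SAN's API: writes are acknowledged as soon as the primary commits them, and the replica catches up in periodic batches -- which is exactly why async replication gives you a non-zero RPO.

```python
# Minimal sketch of why asynchronous replication implies a non-zero RPO.
# Hypothetical in-memory model; no real SAN vendor API is being shown.

class AsyncReplicatedVolume:
    def __init__(self):
        self.primary = []   # writes acknowledged at the production site
        self.replica = []   # what the warm site has received so far

    def write(self, block):
        # Async: acknowledge as soon as the primary commits. The write
        # is never delayed by WAN latency to the warm site.
        self.primary.append(block)

    def replicate(self):
        # Periodic batch ship of everything the replica is missing.
        self.replica = list(self.primary)

    def data_at_risk(self):
        # Writes that would be lost if the primary site died right now.
        return len(self.primary) - len(self.replica)

vol = AsyncReplicatedVolume()
vol.write("a")
vol.write("b")
vol.replicate()        # warm site is now current
vol.write("c")         # committed locally, not yet shipped
print(vol.data_at_risk())   # one write exposed until the next cycle
```

Synchronous replication would eliminate that exposed write, but only by making every production write wait on the WAN link -- which is why async is usually the practical choice for a warm site over distance.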