Granted, licensing considerations come into play here. Because many virtualization frameworks are licensed by CPU count and RAM, deploying eight smaller-spec servers can cost considerably more than deploying four high-spec servers. The fact remains that by cutting too close to the bone with the physical platform, we deeply undermine our ability to handle outages and physical server problems. We wouldn't deploy mainline storage as a RAID1, but I've seen too many dual-server solutions that, by reducing the server count, amount to essentially the same thing on the server side.
We often hear about how reliable and resilient modern server hardware has become, how redundancy is built in from the power supplies to the hypervisor, and how we can reduce licensing, power, and cooling costs by running fewer, larger boxes. These are accurate points, but they're useless when a hardware or software event takes down a box. It's not "if," it's "when," no matter how resilient you believe your hardware to be.
A case in point might be a file system hiccup on a particular LUN that locks up the I/O subsystem on a server. Virtual servers on other physical servers might be unaffected, but at the very least the affected server will have to be restarted, and restoring lost or corrupted VMs from backups will likely be required. If there are only a few other boxes to take up the slack, this process becomes even more stressful, because suddenly the entire deployment is in jeopardy. If there are four or five other servers in the mix, then the pressure is reduced.
Don't think this example is far-fetched -- I had to deal with just such a problem a few weeks ago. Luckily there were eight servers in that cluster, and fixing the problem involved actually fixing the problem and bringing the three affected servers back up, not trying to triage the loss of dozens of VMs with a greatly reduced resource footprint before getting to the root cause.
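To put rough numbers on the "take up the slack" argument, here is a minimal back-of-the-envelope sketch. It rests on assumptions of my own for illustration, not measurements from that cluster: identical hosts, load spread evenly, every VM able to restart on a surviving host, and a hypothetical 60 percent steady-state utilization. The function name `surviving_utilization` is likewise just illustrative.

```python
def surviving_utilization(hosts: int, failed: int, baseline: float) -> float:
    """Utilization of each surviving host after `failed` hosts go down.

    Assumes identical hosts, evenly spread load, and that all VMs from the
    failed hosts are restarted on the survivors (a simplification).
    `baseline` is the per-host utilization before the failure (0.0 to 1.0).
    """
    if not 0 <= failed < hosts:
        raise ValueError("failed must be between 0 and hosts - 1")
    # Total load stays the same; it is divided among fewer hosts.
    return baseline * hosts / (hosts - failed)


if __name__ == "__main__":
    baseline = 0.60  # hypothetical steady-state utilization per host
    scenarios = [(8, 1), (8, 3), (4, 1), (3, 1), (2, 1)]
    for hosts, failed in scenarios:
        util = surviving_utilization(hosts, failed, baseline)
        status = "fits" if util <= 1.0 else "over capacity"
        print(f"{hosts} hosts, {failed} down: {util:.0%} ({status})")
```

Under those assumptions, losing one host out of eight pushes the survivors to roughly 69 percent, losing one of three pushes them to 90 percent, and losing one of two leaves the remaining box at 120 percent, meaning some VMs simply stay down. Real clusters are messier (memory is usually the tighter constraint, and reservations and affinity rules complicate placement), so treat this as a way to frame the question rather than a sizing tool.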
If you find yourself considering a few huge boxes rather than several smaller boxes, remember that sometimes more is more. Although those few servers can easily support the virtualization load, they are likely to greatly impede future fixes and make upgrades problematic, because so few servers remain to take up the slack whenever one is out of service. I'll take eight small boxes over three large boxes any day, and I will definitely sleep better at night because of it.
This story, "When server consolidation goes too far," was originally published at InfoWorld.com. Read more of Paul Venezia's The Deep End blog at InfoWorld.com.