In the case of Hurricane Irene, I opted to remotely shut down two data centers in two different states that were in the path of the storm, leaving them with only the switching and VPN gear running. Naturally, almost every element of these data centers can be remotely controlled, from turning servers on and off to gaining console access to every relevant device on the network, including storage controllers, core switching, and so forth. Shutting down the data centers was the work of only half an hour, with scripted tools to turn off every Linux server in a specific order -- and the widespread use of virtualization made it absurdly simple to deactivate all the VMs gracefully.
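That kind of scripted, ordered shutdown can be sketched in a few lines. Everything here is illustrative, not the author's actual tooling: the tier names and hostnames are invented, and the script assumes SSH key authentication and passwordless sudo on each box.

```python
import subprocess

# Hypothetical shutdown order: VM hosts and stateless app servers first,
# then databases, then hosts that depend on shared storage. All names are
# made up for illustration.
SHUTDOWN_TIERS = [
    ["vmhost1", "app1", "app2"],   # guests and stateless services
    ["db1", "db2"],                # databases
    ["nfs-client1"],               # storage-dependent hosts last
]

def shutdown_command(host):
    """One plausible remote halt invocation (assumes SSH keys and sudo)."""
    return ["ssh", host, "sudo", "shutdown", "-h", "now"]

def plan(tiers):
    """Dry run: return the commands in the exact order they would execute."""
    return [shutdown_command(h) for tier in tiers for h in tier]

def shutdown_all(tiers, runner=subprocess.run):
    """Walk the tiers in order, halting every host in a tier before moving on."""
    for tier in tiers:
        for host in tier:
            runner(shutdown_command(host), check=False)
```

The `runner` parameter is injectable so the ordering logic can be tested without touching real servers; in anger you would just call `shutdown_all(SHUTDOWN_TIERS)`.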
Unfortunately, the other site didn't fare quite as well. The shutdowns were planned for 3 p.m., but that site magically lost power at 11:45 a.m., well before the storm hit, and lacked generator backup due to regulations and site issues. I ended up feverishly shutting down servers from my iPhone in the middle of a parking lot. I got to about half the servers with the shutdown scripts, but the Windows boxes were left to fend for themselves, as was the storage. The last I saw of that data center was a truncated SMS warning that the monster UPS was exhausting its batteries. Then it was gone. Poof. This particular site was 250 miles away, so reviving it would have to wait until after the storm blew through.
When the lights came back on, the second data center started itself back up. With the exception of the boxes I'd managed to shut down normally, the other servers automatically powered themselves on when power was restored, as they were configured to do. The networking gear came up normally, as did all the storage. In fact, other than a few situations caused by the out-of-order power-up, the site performed admirably. I had to turn on a few servers manually, remount the NFS shares that had failed because the storage wasn't yet available when other servers booted, and kick over some VMs, but that was it.
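Those failed NFS mounts point at a general pattern worth building in: when boot order can't be guaranteed, retry the mount until the storage answers rather than failing once at boot. A minimal sketch of that retry loop, assuming the mount point already has an /etc/fstab entry (the attempt count and delay are arbitrary assumptions):

```python
import subprocess
import time

def mount_with_retry(mount_point, attempts=10, delay=30, runner=subprocess.run):
    """Keep retrying `mount <mount_point>` until the NFS server responds.

    Assumes an /etc/fstab entry exists for mount_point so a bare `mount`
    suffices. `runner` is injectable so the logic can be exercised without
    performing real mounts.
    """
    for attempt in range(attempts):
        if runner(["mount", mount_point]).returncode == 0:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

On modern systems the same effect comes from fstab options like `_netdev` or `bg`, but an explicit retry script is easy to drop into a startup sequence you already control.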
The data center that was shut down in an orderly fashion came up just as nicely, with only a smattering of minor issues. Prior to the hurricane, I obviously hadn't planned on performing a true shutdown test scenario that weekend, but I had just completed one, and both sites passed with flying colors. This little exercise also highlighted a few small gaps in the monitoring framework that were easily found and fixed.
If you run a data center that can be forced down completely without causing significant negative impact on normal business operations, you should probably plan a complete power-off exercise sooner rather than later. I always do this when building out a new facility, but after that it's a rare event, usually caused by outside elements. All said, this particular forced power-down increased my confidence in the resiliency of both sites. For me, that was the slim silver lining to Hurricane Irene's clouds.
This story, "Lights out: When to power down the data center," was originally published at InfoWorld.com.