Downtime is ... good?

A healthy dose of planned downtime can save your bacon. Don't buy into the 24/7, always-on culture unless you absolutely must.

Ask yourself: How do your users react when you announce (or plead for) a downtime window to accomplish an upgrade or to perform maintenance? Not well, I'd imagine.

Years ago, scheduled downtime was a common occurrence in all but the very largest IT shops, but today, few businesses let you get away with a solid downtime window without an act of Congress. Even some shops without the obvious 24/7 requirements of, say, a three-shift manufacturing plant or a hospital with an emergency room have a hard time denying their user base access to data in the wee hours of the night.

The reasons for this are many, but they boil down to a voracious dependence on IT systems for day-to-day business -- and massively improved disaster avoidance brought about in large part by the advent of server virtualization. Businesses are addicted to data; technology has improved to the point that we in IT can readily feed that addiction.

This closed loop has an unfortunate, twofold effect: It creates an atmosphere where even the smallest request for planned downtime is often denied or delayed -- and users become entirely unprepared for what to do when disaster strikes.

The three joys of downtime

First, downtime can do a lot to help keep your environment solid. If you have to wait weeks or months to apply critical infrastructure patches, you're simply asking for trouble. While most systems in a modern IT infrastructure can be patched with very little downtime, keeping others up to date means powering them down and inconveniencing at least a few users.

Take your garden-variety switches and routers. They often sit untouched for years and work perfectly without interruption. In fact, one desktop aggregation switch I touched this past week had an uptime of more than 2,000 days. That's a huge testament to the manufacturer, but I'll bet you could drive several small vehicles through the easily exploitable security holes in that device's firmware.

Second, by taking advantage of planned downtime windows, you can exercise your high-availability capabilities and disaster recovery plans. If you rarely test your HA or DR capabilities, there's a much greater chance they won't work when you actually need them. As an astute reader commented on a blog post I wrote last year: "Nothing that is used less often than once a day works every time you use it. The less often you use it, the more likely it will fail when you do use it." In my experience, that couldn't be truer.
