Don't turn that machine off!

Tropical locales can be exciting. But not when they're in the server room.

In the late 90s I worked as a tester and developer in a regional office of a larger corporation. After a couple rounds of corporate "right-sizing," I was tasked with rotating the daily backup tapes and getting them sent to off-site storage.

All of the source code for several business-critical customer applications was stored in SCCS on an ancient IBM AIX box. There was a very large, handwritten sign on the monitor: "DO NOT POWER OFF." I asked someone about the message. I was told that the machine and drives were so old that there was a legitimate concern that they would not power up again. The hardware and OS version were also no longer supported by the vendor.

It was unbelievably hot in that top floor room. The building was originally a canning factory converted to office space. The top floor had horrible ventilation and was truly unbearable in the summer. The CFO told us it was impossible to put AC in the area. Of course, what he meant was, "it costs more money than we want to spend." This is the same room that held the servers storing our code, not exactly good for the machines. You could hear the whine of a dying machine every time you walked into the room.

Less than a year earlier we all had to vacate the room for several days as a major roof leak caused serious problems. When we came in they gave us some plastic tarps to cover the computers and said they would call someone about the roof the next day. There appeared to be no sense of urgency from anyone. And for budgetary reasons, replacing the machines was not an option.

Management never really understood the risks and what was at stake. IT people were considered second-class "worker bees." (One time, when working to elicit performance requirements for a new system, I was told that all response times should be "instant"!)

Anyway, each day I would take the tape out of the machine's tape drive, swap it in the rotation, and get the tapes sent off-site weekly. One day I asked management -- most of whom were non-IT people -- how we might ever restore the data if the AIX machine that was circling the drain eventually died.

The first response was a blank stare. Slowly, though, the manager I had asked came to the realization that backups with no ability to restore are useless. And this was only a few months after the head IT admin guy was let go and refused to hand over any of the passwords to the servers.

I don't know what ever happened in that sweltering room with the leakage problem because I left before anything could blow up. But I sure got a cold, hard look at what a potentially lethal combination indifference and incompetence can be.
