Last week, I gave a three-hour workshop at Techmentor Orlando on the fundamentals of storage virtualization. The primary focus was a review of the concepts and technology that make our storage (and disaster recovery) world what it is: an ever-changing, acronym-rich headache. RAID, SAN, NAS, VTL, iSCSI, and more all make up the current infrastructure. At times during the discussion I invited the audience to share their storage and disaster recovery experiences, and I have to say that some of them were the kind of horror stories you might only hear around a campfire (well, a campfire of IT admins).
Jarred Fehr, a system administrator at Peachtree Business Products, said, "About three years ago, our company had purchased a new DAS array to replace our aging one. We decided to buy a unit from a different vendor than we usually use due to better cost/storage ratio. Unfortunately, our unit had a batch of bad drives from a well-known drive maker. After one month in production, there were multiple drive failures in one night and we lost all of our data. Even though we had backups of everything, it still took a full week to restore it all and return business to normal. Now we have redundant servers and arrays to prevent such loss in the future." Sounds like the backup saved the day.
[ Read J. Peter Bruzzese's related columns "Keeping pace with disaster recovery" and "With pandemic alert, firms urged to review disaster recovery." ]
Rick Calmes from the Air Force Institute of Technology relayed an experience from a while back, but one that left an indelible mark on his disaster recovery mindset. He reported:
We were decommissioning an old NetApps device that was using multiple arrays of SCSI drives. We discovered that the array would physically fit into the DEC Alpha. So upon further investigation we scrounged a SCSI card that would fit into the DEC Alpha as well. (Being PC admins, we did not realize how proprietary the DEC machines were.) So we power it all down, install all the hardware and cabling, and hit the power button.
Well, the machine goes through its POST and we wait, and then the dreaded screen: no OS found. Not sure what had happened, but the result was that we no longer had an operating system or a mail store. So we powered back down, removed all the hardware, card, and cabling, but it was too late: the damage had already been done, as we discovered when we tried to restart the Alpha again. Major oops!