By definition, enterprise-class primary storage is rock solid. It'd better be -- without bulletproof storage, the rest of the application infrastructure built on top of it can't hope to be reliable. Applications have enough problems of their own without flaky storage making matters worse. That's why enterprises spend huge portions of their IT budgets to buy the best, most reliable storage infrastructure they can afford.
Redundant disks, redundant controllers, mirrored cache, and redundant storage networking fabrics go a long way toward delivering the kind of fault-tolerant storage infrastructures we've come to expect in mission-critical environments. But even the most highly redundant storage architecture out there can't protect itself from one threat: you.
Lest you be offended, "you" includes me, too. Of the huge number of enterprise storage devices I've laid hands on over the years, only one ever crashed catastrophically due to hardware failure. On the other hand, I've lost count of the outages I've seen caused by bad documentation, bad tech support advice, insufficient training, and software or firmware that somehow overlooked the fact that it might be used by actual human beings one day.
As if to underscore this phenomenon, I've witnessed two primary storage environments crumble to the ground within the past month. Cue the ominous music.
In the enterprise storage infrastructure, users are supported by two separate yet equally important groups: the devices that store their data and the people who manage them. These are their stories.