I've been working in IT for close to 20 years, and the average experience of my team's members is 15 years. So with over a quarter of a century's worth of combined experience, you would think business managers would trust our decisions, or at least take our input into account. Unfortunately, the reality is that sometimes we get ignored.
Our IT department is in the United States and supports all our remote sites. Each site is overseen by a manager, and we are required to get the manager's approval before we implement any major changes. Usually, the managers listen to us and voice any concerns, then we make the changes together.
However, one of the managers continually fights us. For example, we have a remote site in EMEA with an aging infrastructure. The file server is more than 7 years old, the domain controller is 8 years old, and both are out of maintenance. Here's why.
About three years ago, our IT team proposed using a caching appliance that would let us keep the primary copy of the data at our core datacenter. That way we could use ILM (information lifecycle management) tools to migrate old data to cheaper storage, centralize backups, and still deliver good performance at the remote site. Before rolling out any changes to the sites, we tested the appliance between our main engineering site and the datacenter over an OC3 with very low latency, then deployed one to APAC, which has an E1 with very high latency.
When we got ready to deploy to the EMEA site, the site manager put the brakes on. "I'm not convinced you tested it enough," he said. We explained the testing we had done, the phased approach, and the backout plan, but to no avail. He wasn't going to let us do it. Since he's a supply chain ops guy with no real IT background, we figured he didn't understand the risks of not deploying to the site, so we took time to explain them in detail and gave him time to ask questions and think it over. Nothing doing. Our manager talked to him, but he remained adamant that the changes would cause major problems for his operations.
So we upgraded the other remote sites' infrastructure to use the caching model, with the data stored and managed centrally but cached locally. Then, a year ago, the caching appliance company was bought by someone else, and the new owner killed the product we were using. We switched tactics and installed an ESX cluster with local iSCSI storage. It could survive a single hardware failure with no impact, and since everything was local, performance rocked. Backups were done using replication back to the core datacenter.
Unfortunately, the site manager in EMEA was concerned about performance on the virtual machines. We explained that performance was actually better on the other 100 machines we had virtualized, since the new hardware was much faster, and that even after consolidating the 12 servers to 3, he would see much better performance and resiliency. Apparently, he had read an article somewhere that contradicted what we were saying, and he quashed the plan again. We moved on to other sites with the intention of convincing him later.
Of course, anyone with more than a few months in IT can see where this train wreck was heading. This morning, the EMEA site suffered a failure on the RAID array, which caused data loss. We are currently rebuilding the array and starting to spin tape to recover the data, but of course his question now is, "Why don't we have redundancy for these critical services?"
I looked it up. I can catch a flight this afternoon and be there in the morning to yell at him in person. I think the cost would be worth it.