March 29, 2009

Guess who finally quit smoking

An IT team saves the day when a sales department's key server calls it quits

The motherboard had failed in a glorious fashion -- all of its smoke had escaped, leaving a hole where the circuit board had once existed. The disk array was moved over to another system to check the state of the database. Apparently, as the CPU was going through its death throes, it had just enough energy to reach out and corrupt the database.

Bummer.

OK, just pull from the last backup. But this was basically a sales organization. And that problem with daily backups taking over 24 hours to complete? It turned out that the solution sales devised was to stop doing backups. The last full backup the business itself had was from six months earlier. Fortunately, IT had taken a full backup at the end of April. The phone servers still had their data, so the month of May was still available. But we had to act fast to ensure that no data was lost. Additionally, the sales organization was running blind without its operational reporting.

The following had to be completed within 48 hours:

  • set up the server hardware
  • load the operating system, database software, and security apps
  • restore the database from the April backup
  • copy data from the call center servers to the new server
  • update firewall settings so the proper people, applications, and servers could connect (the IT datacenter was on a different domain)
  • update the connecting systems and applications to use the new server

Given the sales group's track record of poor tech decisions and blundered execution, things did not look promising.

Have you ever heard a piano played by a young student, and then played by the master teacher? Or seen shop tools handled by a middle school student, and then handled by a master craftsman? What a difference the same tools make in the proper hands.

Our IT group had the master craftsmen needed for the task at hand. This group completed the system resurrection -- and completed it 12 hours ahead of schedule! It boiled down to their preparation, experience, communication, and professionalism.

Preparation: Fortunately we had an IT group that had been regularly migrating servers for the past six months and had the process down to a science. They knew how to quickly and efficiently move data, check data integrity, verify permissions, resolve firewall settings, etc.

Experience: They knew which areas posed the greatest risks and had developed methods and procedures to handle those areas. When permission and firewall issues were discovered, they were typically resolved in under 5 minutes.

Communication: During the conversion period there were checkpoint conference calls to see if we were on schedule and ready to proceed to the next step. Users were informed and involved to verify system functionality as early in the process as possible. It was these checkpoint calls that helped us move ahead of schedule.

Professionalism: The IT group had done this so many times that they could remain calm in a situation that others considered a crisis. This set the tone to help everyone work the problem and not panic.

To this day I am still amazed and impressed with the IT group's handling of our tech emergency.


©1994-2009 Infoworld, Inc.