Ops to dev: It's your fault, and here's proof

Operations is tired of fixing failures, and development denies it has a quality control issue -- but the numbers don't lie


Fortunately, it was easy to get the numbers: Back then, a change request had to be submitted on a three-part carbon-paper form and filed. One change request could be as trivial as a small tweak in batch JCL or as large as dozens of program updates, but every single one had to be documented.

My boss hit upon a very simple and effective metric that he added to the bottom of this daily report -- with no advance warning. It reported three numbers for the prior week:

  1. Number of change requests submitted by development
  2. Number of batch job failures
  3. Change-to-failure ratio (batch job failures as a percentage of change requests)

The metric was crude, but very effective. When initially published, the change-to-failure ratio was over 25 percent. Without having to explain anything to anyone, this said one out of four changes submitted by the development teams failed.
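The arithmetic behind that third number is small enough to sketch. The Python below is only an illustration, assuming the ratio is batch job failures divided by change requests, which is what the "one out of four" reading implies. The function name and the weekly counts are hypothetical; the original report's actual counts are not given.

```python
def change_to_failure_ratio(change_requests: int, batch_failures: int) -> float:
    """Return batch job failures as a percentage of change requests."""
    if change_requests <= 0:
        # A week with no change requests has no meaningful ratio.
        raise ValueError("need at least one change request to compute a ratio")
    return 100.0 * batch_failures / change_requests


# Hypothetical counts for one week, chosen only to land "over 25 percent".
changes_submitted = 40
job_failures = 11

ratio = change_to_failure_ratio(changes_submitted, job_failures)
print(f"Change requests submitted: {changes_submitted}")
print(f"Batch job failures:        {job_failures}")
print(f"Change-to-failure ratio:   {ratio:.1f}%")
```

Note that the calculation never ties a particular failure to a particular change. It only compares two weekly totals, which is why the metric is crude.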

The development team managers were furious! They sputtered to explain that operations could not say a specific change caused a specific batch job failure. My boss patiently explained he wasn't saying that -- he was just publishing hard numbers, which spoke for themselves. My boss's argument: "If nothing is changed, we have a lot fewer failures, so overall we know that changes cause failures."

Despite the complaints from development, my boss's boss (the data center manager) refused to drop that metric from the daily report. Development's management realized they needed to get serious about quality control.

To their credit, in a bit over three months the change-to-failure ratio dropped to about 10 percent -- exactly the type of improvement my boss had been hoping for. As a result, the people on our team had extra time to work on more fun projects than fixing ABENDs.

Quality control was important to development's managers up to a point, but there was always pressure to move on to the next project or change request from the business. What my boss did caused them to realize that their lack of more thorough quality control was creating problems that could be easily avoided. When we had fewer batch job ABENDs, we had fewer instances of the DDA system missing its service-level agreement for uptime.

This, in turn, made for much happier tellers, not to mention more satisfied customers and executives.


This story, "Ops to dev: It's your fault, and here's proof," was originally published at InfoWorld.com.
