The futility of developer productivity metrics

Code analysis and similar metrics provide little insight into what really makes an effective software development team

Good programmers are hard to find. Worse, qualified candidates can be expensive. Little wonder, then, that software project managers want to be sure they're getting their money's worth. They want metrics to demonstrate how each developer's output compares to that of the others. But while this sounds good in theory, is it really possible to quantitatively measure something as nebulous as developer productivity?

IBM thinks it is. Earlier this month, Pat Howard, vice president and cloud leader for IBM Global Services, explained how Big Blue had developed a scorecard system that awards points to developers based on a number of quantitative performance metrics. The developers with the highest scores, he says, gain the best reputations around the company. The trouble is, such rating systems are seldom as meaningful as they appear.


The oldest and most obvious metric for software development is to count lines of code: How many lines has an individual developer or team produced, and how much time did it take? In the PBS documentary "Triumph of the Nerds," Microsoft CEO Steve Ballmer observed that in the 1980s IBM seemed to have made "a religion" out of this metric.

But lines of code is also the metric that's easiest to debunk. In virtually any programming language, it's possible to write the same algorithm a number of different ways, using various syntactic constructs. Some methods will inevitably be more compact than others. On the other hand, some languages require more bookkeeping and boilerplate code than others -- lines that are essentially wasted space. Most important of all, the length of a program's source code tells you virtually nothing about its quality.
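To make that concrete, here's a toy sketch in Python (the language and example are mine, chosen only for illustration): two functionally identical routines, one roughly four times longer than the other. A line-counting metric would rank the verbose version higher, even though both behave exactly the same.

```python
# Two functionally identical ways to sum the even numbers in a list.
# A lines-of-code metric rewards the first version, yet the second
# does the same job in a single statement.

def sum_evens_verbose(numbers):
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total = total + n
    return total

def sum_evens_compact(numbers):
    return sum(n for n in numbers if n % 2 == 0)

# Both produce the same result for any input.
assert sum_evens_verbose([1, 2, 3, 4]) == sum_evens_compact([1, 2, 3, 4]) == 6
```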

Fortunately, IBM's current metrics are somewhat more sophisticated. Big Blue uses a source code analysis tool from Cast to compare code produced by IBM developers with known industry best practices. The code is rated on performance, security, and complexity, and the developers whose code fares best on those measures earn the top scores.
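IBM hasn't published exactly how Cast's scoring works, so the following is only a rough illustration of the general idea, not Cast's methodology: static analysis tools typically reduce source code to numbers such as cyclomatic complexity, approximated here (crudely) as one plus the number of branch points in Python's abstract syntax tree. The function name and the list of branch node types are my own assumptions for the sketch.

```python
# Minimal sketch of one number a static analysis tool might compute:
# cyclomatic complexity, approximated as 1 + the number of branch
# points (if/for/while/boolean operators/exception handlers) found
# by walking the abstract syntax tree.
import ast

BRANCH_NODES = (ast.If, ast.For, ast.While, ast.BoolOp,
                ast.ExceptHandler, ast.IfExp)

def cyclomatic_complexity(source: str) -> int:
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))

print(cyclomatic_complexity("def f(x):\n    return x + 1"))  # 1: no branches
print(cyclomatic_complexity(
    "def g(x):\n"
    "    if x > 0:\n"
    "        return x\n"
    "    return -x"
))  # 2: one branch
```

Even a counter this simple hints at why such scores can be gamed: splitting one branchy function into several small ones lowers the per-function number without changing the program's behavior at all.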

It certainly sounds empirical. And yet, no matter what methods you use to evaluate programmers' code, you're still missing the broader picture of what real-world developers actually spend their time doing.

Development by the numbers
Code metrics are fine if all you care about is raw code production. But what happens to all that code once it's written? Do you just ship it and move on? Hardly -- in fact, many developers spend far more of their time maintaining code than adding to it. Do your metrics take into account time spent refactoring or documenting existing code? Is it even possible to devise metrics for these activities? (Counting lines of documentation makes even less sense than counting lines of code.)
