August 23, 2006

Software sleuthing in the field

Software should make it easier for users to track down the source of bugs and errors

While I was reading Ellen Ullman’s novel The Bug last month, life imitated art. The protagonist in that story is a programmer who grapples with a fiendish bug. It strikes intermittently and, to add insult to injury, the testers can never manage to capture the core dump that might yield the clue as to why.

Meanwhile, in real life, an application that I use routinely to create videos, and that has always worked reliably, began failing mysteriously -- and in my case reproducibly -- whenever I tried to produce QuickTime files. In the grand tradition of cryptic error messages, the only clue was an unhelpful -50.

Because I have a relationship with the vendor, and because I’m a fairly technical guy, we were able to collaborate on solving the problem in a way that wouldn’t work for a typical customer. The lead developer wound up sending me a specially built version of the program, one in which verbose logging was enabled, and I used that to capture a trace that led him to the solution.

As is so often true, it was a silly little thing. At some point I’d switched from using an absolute path in the Save As dialog box (c:\jon) to a relative path (\jon) that the QuickTime encoder won’t accept. The application should have caught that before calling QuickTime, but didn’t. Now, it does.
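The kind of check the application was missing can be sketched in a few lines. This is a minimal illustration in Python, not the vendor's actual fix: the function name check_output_path and the wording of the message are hypothetical, and it assumes the encoder needs a Windows path with both a drive letter and a root.

```python
from pathlib import PureWindowsPath


def check_output_path(raw_path: str) -> PureWindowsPath:
    """Reject Windows paths that lack a drive letter or a root before encoding."""
    path = PureWindowsPath(raw_path)
    # is_absolute() is True only when the path has both a drive and a root,
    # so a path with a root but no drive, or with neither, is rejected.
    if not path.is_absolute():
        raise ValueError(
            f"Output path {raw_path!r} is not absolute; "
            "include a drive letter, for example C:\\videos\\clip.mov"
        )
    return path
```

Calling this before handing the path to the encoder turns a bare -50 into a message that names the offending setting.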

Ideally, of course, a unit test would have flushed out this problem before it ever got to me. But things will always slip through the safety net. When they do, they’re often fiendishly difficult to diagnose, and it’s worth considering the reasons why.

There are many, but in my post-mortem analysis of this incident I zeroed in on two in particular: provenance and configuration.

By provenance, in this case, I mean the origin of the error code. I was running a Windows application that used a QuickTime component. From which domain did the -50 arise? Windows itself? The application? The QuickTime component? I guessed QuickTime, but searching for “quicktime error -50” yielded no insight.

As I later discovered, -50 is the generic Mac OS “Error in user parameter list”. Connecting that to the relative path in my dialog box would, admittedly, still have been a long shot. But there’s a chance it would have prompted me to reconsider the parameters I was sending to QuickTime.

Applications increasingly rely on components and services that can turn up in unexpected contexts. I wasn’t expecting to see a Mac OS error code on my Windows box, but in a mix-and-match world, that happens. Reporting the provenance of error codes would be a helpful best practice.
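One way to report provenance is to wrap a component's raw code in an error that names the domain it came from. The sketch below is illustrative Python, not any real QuickTime or Windows API: the ComponentError class, the KNOWN_ERRORS table, and the domain label are all hypothetical. Only the meaning of -50 comes from the incident described above.

```python
# Hypothetical lookup table keyed by (domain, code). The -50 entry is the
# Mac OS meaning cited above; a real table would be much larger.
KNOWN_ERRORS = {
    ("Mac OS / QuickTime", -50): "Error in user parameter list",
}


class ComponentError(Exception):
    """An error code tagged with the domain that produced it."""

    def __init__(self, domain: str, code: int, operation: str) -> None:
        self.domain = domain        # which layer produced the code
        self.code = code            # the raw numeric code, unchanged
        self.operation = operation  # what the application was doing
        meaning = KNOWN_ERRORS.get((domain, code), "no description available")
        super().__init__(
            f"{operation} failed: error {code} from {domain} ({meaning})"
        )
```

With this in place, a dialog could show "QuickTime export failed: error -50 from Mac OS / QuickTime (Error in user parameter list)" instead of the bare number, which gives the user both a search term and a hint about where to look.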

Enabling users to visualize configuration change would be even more helpful. The default path remembered in that dialog box is part of the application’s configuration. When the problem arose, I asked myself the obvious question: “What changed?” But there was no way to compare the state of the application before and after.

In principle it’s easy to do this. If applications recorded snapshots whenever their settings were changed -- in plain text, or even better, in XML -- we’d have audit trails that would help solve a lot of these kinds of problems.
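A rough sketch of such an audit trail follows, in Python. It assumes settings fit in a flat dictionary and uses JSON as the plain-text format (XML would work equally well). The function names, file naming scheme, and setting keys are hypothetical.

```python
import difflib
import json
from datetime import datetime, timezone
from pathlib import Path


def save_snapshot(settings: dict, directory: Path) -> Path:
    """Write the current settings to a new numbered, timestamped text file."""
    directory.mkdir(parents=True, exist_ok=True)
    # A sequence number keeps names unique and sortable in creation order.
    seq = len(list(directory.glob("settings-*.json"))) + 1
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = directory / f"settings-{seq:06d}-{stamp}.json"
    # sort_keys keeps line order stable so diffs show only real changes.
    text = json.dumps(settings, indent=2, sort_keys=True) + "\n"
    target.write_text(text, encoding="utf-8")
    return target


def diff_snapshots(before: Path, after: Path) -> str:
    """Return a unified diff answering "what changed?" between two snapshots."""
    old = before.read_text(encoding="utf-8").splitlines(keepends=True)
    new = after.read_text(encoding="utf-8").splitlines(keepends=True)
    return "".join(difflib.unified_diff(old, new, before.name, after.name))
```

If the application called save_snapshot whenever a setting changed, comparing the last snapshot taken before the failure with the first one taken after it would have shown the Save As path changing from c:\jon to \jon.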

The snapshots needn’t be stored locally, by the way. Network storage could be an add-on service opportunity for the vendor, and an interesting way for customers to explore and learn from one another’s patterns of use.

[For more on this topic, visit Jon Udell's blog]

©1994-2010 Infoworld, Inc.