Large-scale software systems are staggeringly complex works of engineering. Bugs inevitably come with the territory, and for decades the software profession has looked for ways to fight them. We may not see perfect source code in our lifetime, but we are seeing much better analysis tools and promising new approaches to remedy the problem.
TDD (test-driven development) is one increasingly popular approach to finding bugs. The overhead can be substantial, however, because the test framework that ensures a program’s correctness may require as many lines of code as the program itself. Run-time checking is another popular approach. By injecting special instrumentation into programs or by intercepting API calls, tools such as IBM’s Rational Purify and Compuware’s BoundsChecker can find problems such as memory corruption, resource leakage, and incorrect use of operating system services. TDD and run-time checking are both useful techniques and are complementary. But ultimately, all errors reside in the program’s source code. Although it’s always important for programmers to review their own code (and one another’s), comprehensive analysis demands automation.
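To make the idea of intercepting API calls concrete, here is a minimal sketch of leak detection by wrapping a pair of acquire and release functions. It is an illustration only, not how Purify or BoundsChecker is implemented; `LeakTracker`, `open_resource`, and `close_resource` are hypothetical names invented for this example.

```python
import itertools
import traceback


class LeakTracker:
    """Wraps an acquire/release pair and records handles that are still live."""

    def __init__(self, acquire, release):
        self._acquire = acquire
        self._release = release
        self.live = {}  # handle -> stack summary captured at acquisition

    def acquire(self, *args):
        handle = self._acquire(*args)
        # Remember where the handle came from so a leak can be traced back.
        self.live[handle] = traceback.extract_stack(limit=3)
        return handle

    def release(self, handle):
        if handle not in self.live:
            raise RuntimeError(f"release of unknown or already released handle: {handle!r}")
        del self.live[handle]
        self._release(handle)

    def leaks(self):
        """Return the handles that were acquired but never released."""
        return list(self.live)


# Hypothetical stand-ins for a real resource API (assumption for this sketch).
_ids = itertools.count(1)


def open_resource(name):
    return next(_ids)


def close_resource(handle):
    pass


tracker = LeakTracker(open_resource, close_resource)
a = tracker.acquire("config")
b = tracker.acquire("log")
tracker.release(a)
print("leaked handles:", tracker.leaks())
```

The wrapper can only report what the program actually does while it runs, which is why the article treats run-time checking as a complement to source code analysis rather than a substitute for it.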
One compelling demonstration of the power of automated source code analysis is Coverity’s Linux bugs database. Viewable online, its April 2004 snapshot pinpointed hundreds of bugs in the Linux 2.6 source code. Coverity’s analyzer, called SWAT (Software Analysis Toolset), grew out of research by Stanford professor Dawson Engler, who is now on leave from the university to serve as Coverity’s chief scientist.
In the Windows world, a static source code analyzer called PREfast, which has been used internally at Microsoft for years, will be included in Microsoft Visual Studio 2005 Team System. PREfast is a streamlined version of a powerful analyzer called PREfix, a commercial product sold in the late 1990s by a company called Intrinsa. Microsoft acquired Intrinsa in 1999 and brought the technology into its Programmer Productivity Research Center.
The idea behind today’s source code analyzers is not new. Their lineage traces back to a tool called lint, which, from the late 1970s, enabled programmers to find common errors in C programs. Experts offer several possible explanations for the renewed interest in this venerable art.
Brian Chess, chief scientist at Fortify Software, whose analyzer specializes in detection of security vulnerabilities in C, C++, Java, JSP, and PL/SQL source code, thinks security concerns make analysis even more imperative. Many of the errors in C and C++ that are hard for humans to spot, but relatively easy for automated analyzers to find, involve memory management. Before computers were pervasively interconnected, a buffer overflow was an inconvenience but not necessarily a disaster. Now, such errors are routinely exploited by attackers.
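As a rough illustration of why such errors are tractable for automation, the sketch below flags C library calls that copy data without a length bound. It is a lint-style pattern match written for this example, not the method used by Fortify, Coverity, or PREfast, whose analyzers reason about program paths and data flow. The rule list and the function name `find_unbounded_calls` are assumptions.

```python
import re

# Hypothetical rule set: C library calls that copy without a length bound.
UNBOUNDED_CALLS = ("gets", "strcpy", "strcat", "sprintf")
CALL_PATTERN = re.compile(r"\b(" + "|".join(UNBOUNDED_CALLS) + r")\s*\(")


def find_unbounded_calls(c_source):
    """Return a (line_number, function_name) pair for each flagged call."""
    findings = []
    for number, line in enumerate(c_source.splitlines(), start=1):
        # Crude comment handling: drop everything after "//".
        # This would also truncate a string literal containing "//".
        code = line.split("//", 1)[0]
        for match in CALL_PATTERN.finditer(code):
            findings.append((number, match.group(1)))
    return findings


SAMPLE = '''\
void greet(const char *name) {
    char buf[16];
    strcpy(buf, name);              // overflows if name has 16 or more characters
    snprintf(buf, sizeof buf, "%s", name);
}
'''

for number, function in find_unbounded_calls(SAMPLE):
    print(f"line {number}: call to {function}() has no length bound")
```

A textual check like this produces false positives, because it cannot tell whether the destination buffer is in fact large enough. Establishing that requires tracking buffer sizes and values through the program, which is the harder problem the commercial analyzers take on.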
However, Engler thinks the security explanation should be taken with a grain of salt. His research in the late 1990s aimed to improve the reliability of software. Security analysis was part of the story, he says, but “basically, we just didn’t want stuff to crash.”