Coverity and Klocwork code analyzers drill deeper

Industry leaders show remarkable scalability and prowess but differ in design philosophy

Remarkable increases in hardware performance are enabling the design and creation of tools that were simply not possible years ago. With two processor cores tearing through 3 billion instructions per second, it's now possible to devise tools that perform rich, very thorough analyses very quickly.

Coverity Prevent and Klocwork K7 are two such tools; they analyze source code for bugs and defects using a variety of techniques, including stepping through all possible execution paths. As a result, they detect defects that testing misses and that manual code inspections do not necessarily catch. These products are especially valuable to sites with very large code bases, particularly if those applications are important or mission-critical. My review showed Klocwork K7 holding an edge over Coverity Prevent. However, the products are close enough in many respects that both warrant consideration before any purchase is finalized.

Going to the source

Both tools transcend the traditional concept of static code analysis, so understanding them requires something of a mind shift. The most widely known static code analyzer is lint, the code scanner bundled with Unix for decades. Lint looks for infelicities and suspicious constructs in C and C++ code. For example, it flags suspicious indentation, possible truncation when a large value is copied to a smaller variable, and a host of other possible bugs. Good lint utilities often generate hundreds of warnings, and using lint often means learning how to reduce the number of warnings generated by items of no real concern to you -- aka false positives.
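To make the truncation case concrete, here is a minimal sketch in C of the kind of assignment a lint-style checker typically warns about, alongside one conventional fix. The function names are hypothetical and chosen only for illustration; the exact wording of any warning depends on the tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Silent narrowing: a 32-bit value is copied into a 16-bit variable.
   Unsigned conversion is well defined in C (the value is reduced modulo
   65536), so the code is legal, but the upper bits are quietly lost. */
uint16_t narrow_unchecked(uint32_t wide)
{
    uint16_t narrow = wide;   /* possible truncation */
    return narrow;
}

/* One conventional fix: check the range first and report failure.
   Returns 1 and stores the value on success, 0 if it does not fit. */
int narrow_checked(uint32_t wide, uint16_t *out)
{
    assert(out != NULL);
    if (wide > UINT16_MAX)
        return 0;
    *out = (uint16_t)wide;
    return 1;
}
```

Passing 70000 to narrow_unchecked yields 4464 (70000 - 65536), which is exactly the sort of quiet data loss the warning is meant to surface.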

K7 and Prevent do not traffic in this kind of bug sniffing. They relegate those bugs to lint and look for more insidious defects. For example, both specialize in stepping through every possible executable path through the code base. They search for functions that are called incorrectly or with invalid values and then report on the specific path through the code that leads to the undesired result. They also look for other dangerous items, such as references to variables whose memory has been de-allocated by code in other functions, and so on. Essentially, any defects that arise from cross-functional code errors are their stock in trade.
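The following C sketch illustrates the general shape of such a defect: memory released in one function and read in another, but only along one path. All names are hypothetical, and the review does not establish how either product would report this particular snippet. It is meant only to show why a path-based analysis can find what a test suite that exercises the common path would miss.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct session {
    char *name;               /* heap-allocated copy of the user name */
};

/* Allocates the name. Returns 1 on success, 0 on allocation failure. */
int session_open(struct session *s, const char *name)
{
    assert(s != NULL && name != NULL);
    size_t n = strlen(name) + 1;
    s->name = malloc(n);
    if (s->name == NULL)
        return 0;
    memcpy(s->name, name, n);
    return 1;
}

/* Releases the name but leaves s->name pointing at the freed memory. */
void session_close(struct session *s)
{
    free(s->name);
}

size_t session_name_length(const struct session *s)
{
    return strlen(s->name);
}

/* DELIBERATE DEFECT: when close_first is nonzero, the name is freed in
   session_close() and then read in session_name_length(). Neither
   function looks wrong on its own; the bug exists only on this path. */
size_t handle_request(struct session *s, int close_first)
{
    if (close_first)
        session_close(s);
    return session_name_length(s);
}
```

A test that only ever calls handle_request with close_first set to 0 passes cleanly, which is why this class of defect tends to survive functional testing.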

The intended beneficiaries of this analysis are sites with large code bases -- typically 500,000 or more lines of code. With that much code, sites simply cannot use other means to traverse all the code paths to ensure everything lines up as it should. So, having tools that automate this analysis and work backward through hundreds of function calls, if need be, to track down a possibly corrupted value is a valuable resource, especially in handling edge cases that might escape typical functional testing. In preparing this review, I looked at small to midsize code bases -- the largest being 80,000 lines -- and I spoke with customers of both vendors, some of whom used the products on projects that exceeded 20 million lines of code.

Much in common

Because they perform similar tasks, the two products have much in common. They are driven by the same makefiles or project configuration files that drive a compiler. They build the code base using the compiler and watch the commands issued to the compiler, log them, and then generate a translated equivalent for their own analyzers. In this way, the analyzer is looking through exactly the same code base and files as the compiler. The tools then read through the code as the compiler would and perform the analysis. In both cases, the principal display mechanism for the results is HTML, which is made available via an embedded Web browser. Both products enable developers to make changes to one or more code files and post these changes to the central defects repository. The analysis engine will then comb through the changes and update the defect list, removing references to bugs that have now been remediated. In this way, incremental updates to the defect list are possible.
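The idea of watching and logging compiler commands can be sketched as a small wrapper that records each invocation before the real compiler runs. This is a hypothetical illustration in C, not either vendor's actual mechanism, which the vendors do not document here; the function names and the log format are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Joins argv into one space-separated line in out.
   Returns the line length, or -1 if the buffer is too small. */
int format_command(int argc, char *const argv[], char *out, size_t cap)
{
    assert(out != NULL && cap > 0);
    size_t used = 0;
    out[0] = '\0';
    for (int i = 0; i < argc; i++) {
        int n = snprintf(out + used, cap - used, "%s%s",
                         i ? " " : "", argv[i]);
        if (n < 0 || (size_t)n >= cap - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}

/* Appends one captured compiler command to a log file.
   Returns 0 on success, -1 on any failure. */
int log_command(const char *path, int argc, char *const argv[])
{
    char line[4096];
    if (format_command(argc, argv, line, sizeof line) < 0)
        return -1;
    FILE *f = fopen(path, "a");
    if (f == NULL)
        return -1;
    fprintf(f, "%s\n", line);
    return fclose(f) == 0 ? 0 : -1;
}
```

A wrapper built this way would call log_command with its own arguments and then hand off to the real compiler, leaving a replayable record of exactly which files were compiled with which flags.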

The display of bugs is highly customizable. Due to the likelihood of many defects, the tools can constrain defect lists via a wide variety of filters. Managers can make comments about individual defects, and these comments will follow the defect anywhere it's displayed.

Both products have extensions oriented toward scanning for security holes. However, these extensions are new and lack the maturity of existing stand-alone security checkers, such as those from Fortify and Secure Software.

Despite these similarities, the products diverged in significant areas.

Klocwork K7 v.

Klocwork is a company formed 10 years ago by development managers at Nortel Networks to design programming tools that could handle the massive code bases used in telephone switches. Because these switches are made up of many millions of lines of code, K7 has robustness and scalability built in. Not only does the analyzer scale easily across oceans of code, but the package contains extensive tools for managing the many results. The central project console has remarkably cogent visual representations of the code base and of the exact status of the defect database as it has changed over time. Extensive drill-down capabilities enable managers to view the entire project on one screen or view the status by project components, modules, files, even down to individual lines of code.

A separate utility presents extraordinary pictorial analysis of the complex relationships between files and functions. This tool is by a wide margin the most impressive code navigation tool I have seen. But beyond the navigational aspects, it can identify odd relationships that would indicate bugs, such as a library of functions making calls to an application -- a definite no-no. This relationship would never be flagged by the analyzer as suspect, because it cannot access this higher-level view.

K7 also has fine reporting capabilities. One click in the management console can generate an extensive PDF file (filters enable managers to include or exclude a wide variety of data), exportable text, or XML files. One option enables defects to be exported to the open source Bugzilla bug-tracking tool.

A key differentiator is that K7 can analyze C, C++, and Java, whereas Coverity's product works only on C and C++. K7 can perform analysis based on Java source code and bytecodes, the latter being Java's form of executable file. If the bytecodes contain debug information, K7 can trace defects back to specific lines of code. If not, it can simply identify that a certain type of bug has been found. This option enables sites that rely on third-party Java components to screen them for possible defects before use and to identify the type of defect to the vendor.

Overall, this is a comprehensive and very impressive package, made available at a remarkably low price.

Coverity Prevent 2.2.2

The Coverity tool emerged from academia; in many ways, it retains the feel of that environment. Whereas Klocwork K7 provides comprehensive analysis tools and a well-designed set of supporting utilities, Coverity Prevent is a pure analyzer with a simple interface. It has no management console. The only way to see what has changed between runs of the analyzer is to run diff, a programming utility from Unix that identifies what has changed in a source file; whatever differences it reports are what's new. Dashboards or other displays of project status are nonexistent.

Because Coverity is limited to C and C++, it has good representation in embedded contexts. As a result, it works on a very wide variety of platforms and with an enormous number of different compilers -- far more than K7.

Coverity's Unix-like aspect is visible in how it does configuration. For example, to limit the number of false positives, it enables you to provide detailed configuration files that are then compiled with the project, or stub functions that redirect Coverity's checkers, or annotations that are placed as comments directly in the source code. (This last option is of doubtful value. Few sites will change large code bases to accommodate a static-analysis tool.)

The Unix scripting approach is also evident in how the code scanner works. And in this respect, the products are distinctly different. In all tests, K7 found more defects than Coverity. Stripping out false positives still left K7 ahead in total bug counts. However, some defects reported by K7 are close in nature to the items lint reports, whereas Coverity kept far away from reporting these issues. If I removed those items from the bug counts, the products had comparable defect counts.

An important question is, Which approach makes more sense? Personally, I think that if a product finds an undeniable bug, it should be reported -- regardless of whether it seems like a bug for lint or not. This is the path that K7 wisely chose. If for some reason you don't want those results, you can filter them out of the report or display, but you always keep the option of seeing them. Coverity does not include such defects at all. If you want them found, you must script your own extensions to the analyzer. Coverity provides samples of such scripts, but it does not build them into the product. This approach reflects the Unix orientation, where anything can be done by writing scripts or using little languages. This view seems valid for Unix, but it's hard to accept in an enterprise-level bug-sniffing tool. And certainly at Coverity's price, you should reasonably expect every bug to be reported without writing, testing, and implementing your own extensions.

Going head-to-head

Both packages are large and have many features, so installation and configuration take time. These are both true enterprise tools, so evaluations should be done with deliberation and careful consultation with sales engineers from the respective vendors. Fortunately, trial licenses are available along with considerable assistance in performing evaluations.

Both products are admirably effective at detecting hard-to-find bugs, especially cross-functional defects. Their results are comparable, so this measure should not serve as the primary basis for comparison. I prefer Klocwork K7 because it is a more complete tool and is less expensive. Not only does K7 cover more languages, but it has a superb console/dashboard for managing analytical runs and their numerous generated results. In addition, I believe Klocwork's approach to bug identification is superior. In counterpoint, Coverity's strengths are its great flexibility and its capability of running on numerous platforms.

InfoWorld Scorecard

Criterion (weight)           Coverity Prevent 2.2.2   Klocwork K7 v.
Defect discovery (40.0%)     8.0                      8.0
Value (10.0%)                7.0                      9.0
Configurability (15.0%)      8.0                      7.0
Interoperability (10.0%)     8.0                      8.0
Defect management (25.0%)    5.0                      9.0
Overall Score (100%)         7.2                      8.2
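
The overall scores appear to be weighted sums of the category scores. As a check, the short C sketch below recomputes them under that assumption; the function and array names are illustrative and not part of the scorecard.

```c
#include <assert.h>
#include <stddef.h>

/* Weighted sum of scores; weights are fractions expected to total 1.0. */
double weighted_score(const double *scores, const double *weights, size_t n)
{
    assert(scores != NULL && weights != NULL && n > 0);
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += scores[i] * weights[i];
    return total;
}

/* Order: defect discovery, value, configurability, interoperability,
   defect management (weights as listed in the scorecard). */
const double SCORECARD_WEIGHTS[5] = {0.40, 0.10, 0.15, 0.10, 0.25};
const double COVERITY_SCORES[5]   = {8.0, 7.0, 8.0, 8.0, 5.0};
const double KLOCWORK_SCORES[5]   = {8.0, 9.0, 7.0, 8.0, 9.0};
```

Under this assumption the sums come to 7.15 for Coverity Prevent and 8.20 for Klocwork K7, which match the published 7.2 and 8.2 after rounding to one decimal place.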

Copyright © 2006 IDG Communications, Inc.