Monday, May 24, 2010

Defect Database Bugs Versus Static Source Code Analysis Bugs

Static analysis has been around for quite some time.  Recently, these tools have been pointed at the very practical application of finding bugs in source code.  Many modern tools do a very good job of uncovering genuine, high-value bugs -- the big value being that the tools find them early in the development cycle.  Because the bugs are found in code, the reports point specifically to where in the source the problem manifests.  Easy to find, easy to fix.

But there are some subtle differences that can greatly affect the expectations that users have of the tool and the process workflow that is needed to support these tools well.

A Dose of Reality
Contrary to what every developer desires, static source code analysis tools do not ONLY find critical, crash-causing defects.  Like any other discovery or analysis tool, they will find a mix of great stuff, good stuff, so-so stuff and stuff that's plain wrong.  In industry parlance, these are problems categorized as critical, high, medium and low priority, plus "false positives."  In static analysis there is an additional state commonly used, called "intentional": the tool correctly found something that looks like a problem, but the programmer deliberately coded it that way (usually because of an assumption the static analysis tool is not privy to, such as assumptions about the environment the code operates in).  Modern tools have improved their analysis significantly to reduce the lower-value reports and raise the likelihood of uncovering higher-value ones.
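
As a minimal, hypothetical illustration of an "intentional" report (the function is invented for this post, not taken from any particular tool), consider a deliberate switch fall-through in C.  Many checkers flag this as a likely missing "break" -- the report is accurate about the code but wrong about the intent, so it gets marked "intentional" rather than "false positive":

    /* A checker typically flags the fall-throughs below as missing "break"
     * statements.  Here the cascading multiplication is deliberate. */
    long units_to_bytes(int unit, long value)
    {
        switch (unit) {
        case 2:                 /* megabytes: fall through to kilobyte scaling */
            value *= 1024;
        case 1:                 /* kilobytes: fall through to byte count */
            value *= 1024;
        case 0:                 /* already bytes */
        default:
            break;
        }
        return value;
    }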

That being said, every organization has a different definition of priority, but for the vast majority of software development organizations, priorities are mapped to a stage in the process.  Often, priorities such as "critical" and "high" are tied to specific requirements, such as being part of the acceptance criteria for release.  Mediums and lows are relegated to "if you have time, fix it" or "we'll never get to it."  That's fine -- rarely does an organization have the resources to address every issue (safety-critical code being the notable exception).  Defects reported by static analysis tools should be treated no differently.  Some are important to fix and others are less important to fix.  When we go over static analysis defects with groups of developers, the whole room will invariably agree that a particular segment of code is poorly written and "should be rewritten."  However, resource constraints don't always make that the best business decision.  Most developers know that the ordinary low-priority defects in their bug tracking database should be fixed, yet those defects will likely never be touched again.  Everyone wants to fix them but nobody has the time.  Lower-priority results from static analysis tools fit that pattern too, with the one caveat that the chance for opportunistic fixes is greater because they are discovered during coding.

When static analysis tools are introduced, developers should come in with the mindset that the tool is not going to find every critical problem, and that the problems it does find are not all going to be critical.  Expecting either is unrealistic for any tool.

Checks and Balances
Another key difference with static analysis defect handling is that the defects are reported by a tool, not by a customer, support or QA.  Thus the prioritization decision is not handled by a party "outside" engineering.  Some organizations let the developers decide on defects, which is okay as long as the developers are properly trained and all share the same minimum standard for what is acceptable.  Many developers without the right expectations and training end up incorrectly marking bugs as false positives, or perform unnatural acts in the code just to make the tool "shut up."  To mitigate these problems, organizations can set up a separate review/prioritization team or add an audit process at the tail end to ensure accuracy and build learning into the organization.

Making Lemonade Out of Lemons
One might think that a "false positive" or "intentional" is a terrible waste of time and should go straight to the wastebasket without further ado.  However, these reports do provide information that can be useful to the organization:
  • False positives show where the tool reported something that is not actually a problem.  This is often a case where tuning the analysis can improve the results.  By configuring the analysis to understand your code better, you can significantly lower false positives and increase the valid bug count in future analyses.  In that case, some false positives should be assigned to a static analysis expert who retunes the analysis to improve the results.  Handling false positives this way is significantly more efficient than having developers wade through incorrect results.  False-positive "fatigue" is also a frequent cause of miscategorization.
  • False positives are often a signal of poor coding practices.  If the tool is confused by the code, it is more likely that a new developer inheriting the code will be confused too.  False positives tend to cluster around code that could be improved through refactoring.
  • Intentionals are also an interesting area to look at.  The most common reason for dismissing a bug report as intentional is that the tool did not have access to an assumption.  Of course, the root of many errors is an assumption that later changes and causes problems down the road.  Intentionals often signal an opportunity for defensive coding, as in the sketch after this list.
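
As a hypothetical sketch (the function and field names are invented for illustration), here is how an "intentional" null-dereference report can be turned into defensive code: the unstated assumption becomes an explicit check, which both resolves the report and protects against the assumption changing later.

    #include <assert.h>
    #include <stddef.h>

    struct config { int verbose; };

    /* A checker reports a possible NULL dereference of "cfg".  The developer
     * "knows" cfg is never NULL because initialization runs first -- a classic
     * "intentional".  Making the assumption explicit documents it and defends
     * against the day the initialization order changes. */
    int get_verbosity(const struct config *cfg)
    {
        assert(cfg != NULL);        /* state the assumption in debug builds */
        if (cfg == NULL)            /* and defend against it in release builds */
            return 0;               /* safe default */
        return cfg->verbose;
    }
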
Process and Alignment
Static analysis tools often come with their own defect tracking, which is usually quite useful because they also include a code browser that helps developers debug a report more quickly.  And yet, having another database and UI to manage is extra work.

Borrowing as much of the existing bug workflow as possible is key.  Developers don't want to learn an additional process on top of an already busy day.  Borrow what you can, recognize the differences, and institute unobtrusive additional processes to handle them.  Every solution we've instituted follows the same general principles but varies in implementation, because everyone's process is different.
