Wednesday, September 9, 2009

Are Static Source Code Analysis Solutions One Size Fits All?

Automated source code analysis solutions worth their salt are complex products that perform heavy-duty analysis on complex codebases. They churn through thousands to millions of lines of code written by teams of developers, each with their own skill level and style. Millions of paths are typically analyzed in a single run, producing what the tool believes are valid defects to be examined.

On one level, you'll hear people say that code is the same whether it is written in Java, C++, C, or C#. After all, some universities give computer science homework assignments that can be written in your language of choice. On the other hand, there is a high degree of variability in the customer base that a static analysis vendor must serve. Here are just some of the significant differences:
  • Industry - different industries care about different things. A router company is going to care more about memory leaks because its code may be in operation for months or years at a time, while a desktop application vendor may not care about memory leaks at all because it knows the application will be opened and closed frequently (see the first sketch after this list). You can imagine that a military aerospace or medical devices company would have vastly different priorities than a SaaS vendor, whose system can be controlled much more easily. Some organizations want to squeeze out every last defect the tool can find, at the cost of some additional noise; others are looking for a good "bang for the buck" and fix bugs opportunistically without having to wade through false positives.
  • Development environment - the software development tools space is highly fragmented. There are hundreds of vendors providing source control systems, compilers, bug tracking systems, build and continuous integration tools, automated test tools, etc. On top of that, there may be third-party libraries in use and/or code generation tools that supplement the hand-written codebase. Software applications are almost always built in a custom way, and it's hard for a third-party tool to get an accurate picture of what the shipped code actually is without a significant amount of tweaking and integration. A static analysis tool may simply not understand a custom memory allocator or a source/sink for a data flow analysis (see the second sketch after this list), and it may also have compiler compatibility issues. The result is that you don't get 100% coverage, and tools typically "fail" silently or make it non-obvious how to find out what your true coverage is.
  • Development process - like the tooling landscape, software development processes vary widely, and organizations are often chaotic. Some follow the traditional waterfall methodology and others follow Agile processes; some are global, with offshore development teams making significant contributions. All of these differences place demands on how a source code analysis tool is deployed and used.
  • Platforms - software development organizations typically commit to a single development platform (usually a specific Unix flavor, Linux, Windows, Mac OS X, etc.). However, they may be building software that runs on many different target platforms, each of which pulls in different libraries and may have significant functionality differences.
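
To make the router-versus-desktop point concrete, here is a minimal sketch of the defect class in question: a leak on an error path. The names are hypothetical and the code is not from any particular product. In a desktop application that is closed after a few minutes, this leak is practically invisible; in a router process that runs for months, it slowly exhausts memory.

```c
#include <stdlib.h>
#include <string.h>

struct packet {
    size_t len;
    char *payload;
};

/* Stub consumer; a real system would put the packet on a work queue. */
static void enqueue(struct packet *p)
{
    free(p->payload);
    free(p);
}

/* Returns 0 on success, -1 on failure. */
static int queue_packet(const char *data, size_t len)
{
    struct packet *p = malloc(sizeof(*p));
    if (p == NULL)
        return -1;

    p->payload = malloc(len);
    if (p->payload == NULL)
        return -1;          /* BUG: 'p' is leaked on this error path */

    memcpy(p->payload, data, len);
    p->len = len;
    enqueue(p);             /* ownership transfers to the consumer */
    return 0;
}

int main(void)
{
    return queue_packet("hello", 5) == 0 ? 0 : 1;
}
```

Whether that report is a must-fix or noise depends entirely on the industry and the deployment model, which is exactly the kind of priority difference a tool's default settings cannot anticipate.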
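
The custom-allocator problem from the development environment bullet looks like this. The wrapper names (mem_alloc, mem_free) are hypothetical; many analyzers can be taught that such wrappers have malloc/free semantics, but the mechanism varies by vendor, and out of the box the tool may not make the connection, so defects that flow through the wrapper can go unreported.

```c
#include <stdlib.h>

/* Hypothetical in-house wrapper around malloc(); real projects add
 * statistics, pooling, or out-of-memory policies here. */
void *mem_alloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        abort();            /* project policy: never return NULL */
    return p;
}

void mem_free(void *p)
{
    free(p);
}

int compute(void)
{
    int *scratch = mem_alloc(64 * sizeof(int));
    scratch[0] = 42;
    if (scratch[0] > 0)
        return scratch[0];  /* leak through the wrapper: easy to miss
                               unless the tool models mem_alloc()/mem_free() */
    mem_free(scratch);
    return 0;
}

int main(void)
{
    return compute() == 42 ? 0 : 1;
}
```
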
What's a static analysis tool to do to satisfy everybody? Should the default settings be conservative or aggressive? Are they designed to give a good demo or trial, or tuned for a specific industry? The clear answer is that no tool comes out of the box optimized for your specific codebase; there is too much variability.

What's important is to build or obtain the expertise to optimize the tool for your codebase. After all, you are paying the license fee for the full benefit of the tool; if you don't make the extra effort to ensure it is optimized for your code, you are overpaying for it.

We've worked with many organizations that have taken different approaches. Some have built their expertise in-house with only the standard training as a starting point; this can take years, but the competency stays in-house. Others take a hybrid approach, allocating the right internal resources and bringing in an expert to mentor them through the process. Still others use consultants to get everything set up and running, then use in-house staff to manage and administer the tool. And some outsource everything related to the tool to an outside expert so they can focus on their core competencies.
