Making source code analysis part of the software development process - Embedded.com

Making source code analysis part of the software development process

Modern source code analysis tools (sometimes called static analysis or SCA tools) analyze software programs at the earliest stage of development. SCA tools analyze a program to calculate metrics and find potential flaws and defects in the code.

Unlike tools of the past, which tended to do simple pattern matching, modern SCA tools (Figure 1, below) perform sophisticated path and data flow analysis and can find surprisingly meaningful bugs with good accuracy.

Figure 1: Source code analysis detects problems in the earliest part of development

From a business perspective, SCA tools hold a lot of promise. By uncovering problems in the earliest part of the development process, SCA tools can dramatically lower the cost of quality and security for a product.

The effort required to get to value is relatively low. For most organizations, just a few hours of analysis will uncover hundreds or even thousands of potential defects. No test cases are required, and each reported defect points to the line or lines of code where a problem can occur.

While SCA tools are an easy sell for most organizations, in practice users often struggle with them. Unrealistic expectations coupled with underinvestment result in failed or suboptimal deployments.

Instituting any toolchain change in an organization requires more than just installing the tool and sending out the URL to the developers. Real change requires effective planning and execution. Instituting SCA is no different.

Everyone has opinions about and experience with SCM tools and bug tracking systems. Far fewer have experience with SCA tools. Because modern SCA tools are relatively new, the industry does not have a plethora of reference architectures to fall back on.

In this article, I hope to convey some of the hard lessons we've learned by working with a number of companies instituting SCA tools, so that you can avoid the same mistakes when instituting change in your environment.

Identify Goals
This is where most organizations are set up for failure. Most of the time, the goal is not even stated or agreed upon. What defines success and what defines failure? With a clear idea in mind, all of the process, organizational structure and supporting technical infrastructure can fall into place.

While everyone will agree that SCA tools will “improve quality by finding bugs earlier in the process,” realistic goals need to be instituted to make clear what is expected. Some of the ones we've seen are:

* 100% clean software before release
* No new defects being introduced for each Agile development sprint

One often forgotten mantra is that the value in the tools is not in the number of bugs you find, but in the number (and quality) of the ones you fix. Identifying and addressing the ones worth fixing is the focus.

Identifying Realistic Goals
Most organizations have a significant amount of legacy code. Modern SCA tools do whole program analysis and thus can report hundreds, thousands and even tens of thousands of defects on their first analysis run.

Software development organizations are notoriously resource constrained and thus must put together realistic strategies to address SCA results. Fixing thousands of defects within a 4-week sprint is unrealistic with existing resources.

How can you make life easier while accomplishing the larger goal? A few simple strategies can turn an impossible task into a manageable task:

1) Not all bug categories are created equal. Identify which classes of defects are critical and which are less valuable.

2) While every defect report in every important category should be reviewed, not all have to be fixed. Just as in a bug database, there are acceptance criteria and fixing lower priority items can often be deferred to a later time.

Evangelizing too hard on the quality of results will burn your credibility later down the line. The fact is that intermixed with the good results are a significant number of “never going to happen” or “this isn't really a bug” results. Being transparent with the acceptance criteria is critical to buy-in.

3) Separate the process for dealing with newly introduced defects from the process for “backlog defects” which have been sitting in the code for a while. Typically, defects reported in new code are more problematic than defects found in old code that has already gone through some maturation.

4) Have a separate review team prioritize the defects so developers can focus on only the issues they need to fix rather than on reviewing reams of defects (Figure 2, below).

Put another way, what would actually stop a release? What are the acceptance criteria that management will support?

Figure 2: Which issues must get addressed before release?
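The strategies above can be sketched as a small triage step. This is a minimal illustration, not the behavior of any particular SCA tool: the Defect record, the fingerprint field, the baseline set and the list of critical categories are all hypothetical, and each organization would substitute its own acceptance criteria.

```python
from dataclasses import dataclass

# Hypothetical list of categories this organization treats as critical.
CRITICAL_CATEGORIES = {"buffer_overflow", "null_dereference", "resource_leak"}


@dataclass(frozen=True)
class Defect:
    fingerprint: str  # assumed stable ID for a finding across analysis runs
    category: str


def triage(defects, baseline):
    """Bucket findings by age (new vs. backlog) and priority (critical vs. other).

    `baseline` is the set of fingerprints already reported in an earlier run,
    for example the first analysis of the legacy code.
    """
    buckets = {
        "new_critical": [],
        "new_other": [],
        "backlog_critical": [],
        "backlog_other": [],
    }
    for d in defects:
        age = "backlog" if d.fingerprint in baseline else "new"
        priority = "critical" if d.category in CRITICAL_CATEGORIES else "other"
        buckets[f"{age}_{priority}"].append(d)
    return buckets


baseline = {"a1", "a2"}
report = [
    Defect("a1", "null_dereference"),  # old finding, critical category
    Defect("a2", "unused_variable"),   # old finding, low priority
    Defect("b7", "buffer_overflow"),   # introduced since the baseline
    Defect("b8", "unused_variable"),   # introduced since the baseline
]
buckets = triage(report, baseline)
for name, items in buckets.items():
    print(name, [d.fingerprint for d in items])
```

In a scheme like this, the review team would look at "new_critical" first, while the backlog buckets can be scheduled separately instead of blocking the current sprint.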

Help the Tool Help You
SCA tools require tuning. Every customer has the same “factory settings” and so a tool must do its best to cover a wide range of codebase types and goals (Figure 3, below).

For instance, a medical device company may wish to find as many problems as possible within the 8,000 lines of code in its embedded device, whereas a maker of operating systems with 10 million lines of code may want to find only the surest of bugs.

The “factory settings” represent the best compromise a tool can make to cover these bases. Expert analysis tuning for coverage and depth can make a huge difference in the quality of the results returned and should be a part of every deployment.

Figure 3: Static analysis must apply to a wide variety of applications, from a few thousand lines of code (KLOC) embedded application to operating systems that are tens of millions of lines of code (MLOC). Tuning must be performed to understand the specific codebase.
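One way to picture this trade-off is as a reporting threshold that differs per codebase. The sketch below is purely illustrative: real tools expose tuning through their own settings, and the confidence score, the profile names and the threshold values here are assumptions made for the example.

```python
# Hypothetical tuning profiles; the thresholds are illustrative, not vendor defaults.
PROFILES = {
    "small_safety_critical": 0.2,  # report almost everything, accept more noise
    "large_codebase": 0.9,         # report only the surest findings
}


def filter_findings(findings, profile):
    """Keep only findings whose (assumed) confidence meets the profile threshold."""
    threshold = PROFILES[profile]
    return [f for f in findings if f["confidence"] >= threshold]


findings = [
    {"id": "F1", "confidence": 0.95},
    {"id": "F2", "confidence": 0.50},
    {"id": "F3", "confidence": 0.25},
]

small = filter_findings(findings, "small_safety_critical")
large = filter_findings(findings, "large_codebase")
print([f["id"] for f in small])  # ['F1', 'F2', 'F3']
print([f["id"] for f in large])  # ['F1']
```

The same three findings produce a long list for the small safety-critical codebase and a short one for the large codebase, which is the kind of difference tuning is meant to produce.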

Define Process, Integrate and Optimize
Once clear goals have been defined, designing a process to support them becomes easier. Some of the best ways we've seen to integrate the SCA tool into the process are:

1) Enable developers to run a fast analysis locally so that they can find, fix and re-verify their code prior to check-in. The quality of the checked-in code improves significantly as a result.

2) Institute an SCA run for every continuous integration build. Because SCA tools are typically interprocedural (meaning they perform a whole program analysis), integration type problems can be quickly rooted out and fixed. Integrating with the continuous integration build works well with Agile development environments.

3) Institute an SCA run for every nightly build of nearly every trunk. On the main trunk, issues should be addressed immediately. On a developer branch, it might be a convenience for the team to have a nightly analysis so that they can more easily meet a zero defect static analysis requirement. The code review process might be a convenient location for such activities.

4) Institute an SCA run before every branch collapse and release. When development branches collapse into integration branches, which in turn collapse into main, acceptance criteria should be met before each merge is allowed. Addressing SCA issues should be part of these criteria.
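A merge or release gate of this kind can be sketched in a few lines. This is a minimal illustration under stated assumptions: the finding records (dicts with "category" and "status" keys) and the set of blocking categories are hypothetical, not the export format or policy of any specific tool.

```python
# Hypothetical acceptance criteria: categories that block a merge outright.
BLOCKING_CATEGORIES = {"buffer_overflow", "null_dereference"}


def merge_blockers(findings, blocking_categories=BLOCKING_CATEGORIES):
    """Return the open findings that must be resolved before a branch may merge."""
    return [
        f for f in findings
        if f["status"] == "open" and f["category"] in blocking_categories
    ]


def gate_exit_code(findings):
    """Map the gate decision to an exit code: 0 allows the merge, 1 blocks it."""
    return 1 if merge_blockers(findings) else 0


findings = [
    {"id": 1, "category": "null_dereference", "status": "open"},
    {"id": 2, "category": "null_dereference", "status": "fixed"},
    {"id": 3, "category": "unused_variable", "status": "open"},
]
print([f["id"] for f in merge_blockers(findings)])  # [1]
print(gate_exit_code(findings))                     # 1
```

A continuous integration job could run a check like this after the analysis step and fail the build on a nonzero result, so the acceptance criteria are enforced the same way for every branch.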

Detailed processes need to be well defined. Throwing the tool over the fence without training and documentation specific to the organization often results in misunderstanding and suboptimal results. Pre-canned queries, a defined turnkey workflow, and transparency and visibility into what is required let all actors in the process be on the same page.

Setting up the right infrastructure to support these processes can be significant work but it almost always pays off through years of improved productivity.

Integration with the existing toolchain, such as bug tracking systems and SCM systems, works very well to lower the cost of using the tool. Automation features, either built-in or added on, can increase efficiency. Developers don't want additional steps impeding their progress on aggressive deadlines. Tools are supposed to make life easier, not harder.

Performance tuning is another area for optimization. SCA tools are fast but a large codebase still takes time. Millions of lines of code can take hours. In some cases, complex codebases can take days. Smart tuning can significantly reduce the time, sometimes by ten times or more.

One size does not fit all. In fact, most software development organizations have unique requirements that require customization.

Don't Sweat the Small Stuff
Change is hard. Effecting change in an organization requires a proper rollout and maintenance plan. Here are the most common issues we've seen at companies worldwide:

* Ownership of SCA tools is often given to the central tools group. They may do a decent job of installing the tool and making it available but don't have the responsibility for its actual usage. They may perceive that their job is done when the tool is up and running. Somewhere, someone has to care that defects are being addressed. QA, development managers, senior management, product management, architects and even governance teams should have a stake in the tool's success.

* Rollouts need to be done well. Showing developers how to look at results and how to triage them properly is critical to success. Developers left to their own devices will often severely underutilize the tool by misinterpreting results or not looking hard enough to track down the real problem. Group triage sessions led by knowledgeable team members will significantly improve consistency among the developers. They also help set the right expectations on what the tool can and can't do.

* Put the right checks and balances in place. Letting each developer examine their own results and prioritize them without review will result in missed bugs. Any organization has a wide range of skills and standards. The best processes have either a review or audit of the results in order to ensure a minimum standard has been met consistently across the user base. Consistency in usage across developers and across groups is important for meeting goals.

* Simplify the process as much as possible. Power user features are wonderful but will confuse the average developer, whose core competence is coding, not source code analysis. Assuming that all developers want to tinker with and learn another tool, and have the time to do so, leads to inefficiency and error. We've seen several organizations provide only standard training, with no company-specific guidance as to what the goals are and how to accomplish them.

* Run a pilot before rolling out SCA tools to the entire team. Skipping the pilot increases risk, and you only have one chance to make a first impression. Keep in mind that a 2-hour training session with 100 developers represents 5 person-weeks of development time.

Having an un-tuned system that produces inaccurate results or takes too much time to analyze can waste half an hour of a developer's time every day. Gains you receive from using static analysis may be erased altogether. Developer time is precious and should be allocated wisely. Plan carefully before rolling the tool out.
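The arithmetic behind these figures is easy to check. The short calculation below assumes a 40-hour work week, and it applies the half hour of daily waste to the same 100-developer team, which is an assumption made here for illustration.

```python
HOURS_PER_WEEK = 40  # assumption: one person-week is 40 working hours


def person_weeks(hours_per_developer, developers, hours_per_week=HOURS_PER_WEEK):
    """Convert time spent by each developer into total person-weeks."""
    return hours_per_developer * developers / hours_per_week


# One 2-hour training session for 100 developers.
training_cost = person_weeks(2, 100)
print(training_cost)  # 5.0

# Half an hour lost per developer per day, for the same 100 developers.
daily_waste = person_weeks(0.5, 100)
print(daily_waste)  # 1.25
```

Under these assumptions, an un-tuned deployment costs 1.25 person-weeks every working day, so it consumes as much time as the whole training session in four days.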

It's Time
Infrastructure improvements like instituting SCA can easily be pushed aside in favor of market-winning features. But a healthy balance of short-term and long-term improvements will help you win not just a few of the battles, but the war. With SCA, there is an upfront cost to rolling it out successfully. It's not huge, but it's also not insignificant.

Andrew Yang is a Managing Director and Co-Founder of Code Integrity Solutions, a consulting firm that specializes in source code analysis and build automation tools. Andy can be reached at andy@codeintegritysolutions.com.
