Q&A: Improving Software Quality

Moving beyond software testing to understand and measure the structural quality of the entire software system is the key to preventing software glitches.

Horror stories of software glitches abound, but Jitendra Subramanyam, director of product strategy and research at CAST Software, says they can be prevented if you take the right steps. We recently discussed these best practices with Jitendra to learn what steps IT must take to improve the quality of its development work.

Enterprise Strategies: What are some recent examples of IT problems caused by poor software quality?

Jitendra Subramanyam: Pointing out IT wrecks has become a cottage industry. I can easily think of a handful of embarrassing IT failures due to poor software quality just in the last month.

I'm sure you've heard of the New Hampshire man who was charged $23 quadrillion (that's more than 2000 times the size of the U.S. national debt) due to what Bank of America called a "programming error."

Then there was a "software glitch" that delayed the delivery of 18,000 cases of liquor a day in Washington State -- the glitch took four weeks to resolve.

Finally, if your Fourth of July travels took you through O'Hare airport in Chicago, you no doubt experienced the software failure that grounded all United Airlines planes for several hours. The problem affected the boarding pass system, leaving pilots with no information about the number of passengers on board -- the information they required to determine how much fuel they needed.

Some of these stories are funny, but they do point to a serious problem: there's software in almost everything you touch, so these kinds of "malfunctions" have the potential to wreak a lot of havoc. These are just a few examples of highly publicized failures of business-critical systems that can severely damage an organization's hard-earned credibility.

What are some of the root causes of poor software quality?

The primary root cause is a lack of end-to-end visibility over the entire system. Enterprises don't have this visibility because software systems of any size are multi-tier, multi-platform, multi-language, and multi-sourced. It's like the pieces of the Airbus A380 -- there are lots of them, and they all have to fit together and work together once assembled. Not having X-ray vision over the entire structure of the system means you really don't know what you don't know.

Other causes of poor software quality include a lack of knowledge on developers' part about the business domain served by their applications; development schedules that are so tight that they force developers to compromise sound practices; badly engineered software that is hard to modify without introducing mistakes; and acquisition practices that get software from distributed teams but have no real control over the software they're receiving.

How do application developers currently ensure software quality? What's wrong with this approach?

The current approach is through unit testing, integration testing, and performance testing. Enterprises need to do all this, but it's not enough. For example, performance testing only scratches the surface. Here's why:

  • It's usually done too late to be meaningful and is quickly curtailed due to schedule pressure. It's often too costly to replicate a true production environment.
  • Even when plenty of time is available, doing it right is more art than science. What to test, how to replicate the production environment, knowing how much is enough -- these are all difficult issues that require a considerable amount of expertise. It is very difficult to make it repeatable.
  • Even if it is repeatable, performance testing doesn't tell you how something will perform in a few months when production conditions have inevitably changed. It won't tell you how easy it will be to modify the software to meet pressing business needs. It won't tell you how difficult it will be to transfer the maintenance of the software to another team (internal or external).

What best practices can you recommend that will help enterprises fix the root causes and improve software quality?

It comes back to fixing the root causes of poor software quality.

First, get the right visibility into your large-scale systems. Most critical problems hide in cracks between technology tiers and interfaces.

Second, measure the right things. For example, you can measure performance in terms of latency and availability, but how quickly a system can be changed to meet a pressing business need, or how well it resists hackers -- these characteristics are not that easy to quantify but are still relevant to the quality of a system. Quantify them, measure them, and track these indicators of quality.

CAST (the firm I work for) applies these principles to practical situations using our best-practices rules engine and our sophisticated language parsers.

What are the benefits of testing software quality at the source-code level?

The source code is where the buck stops. It's the DNA of the software system. It is the tangible thing that brings to life the abstract idea of an application.

The source code level is the root cause level -- it's the place you have to go to fix the root causes of quality problems once and for all.

When you test the quality of the system, it's very hard to trace problems all the way back to the source code, so most testing problems are fixed by Band-Aids that cure the symptoms but not the causes.

It's the difference between taking an aspirin and having a CAT scan -- an aspirin will stop the pain (the symptom) but not the underlying cause (a pinched nerve).

What does CAST mean by application intelligence? How does it help improve software quality?

Application intelligence comes from being able to define and precisely measure the engineering quality (or as we like to call it, application software quality) of large-scale IT software systems.

Application software quality describes, among other attributes, the soundness of an application's architectural design and the extent to which its implementation follows proven coding best practices. Application software quality is not measured by passing test cases that were mostly designed to verify the functional correctness of an application. Rather, measures of application software quality are measures of the internal structure and engineering of its code.

Through extensive research and industrial experience, CAST has identified five areas of application software quality that most affect business outcomes and total cost of ownership (TCO). These five areas, or "health factors," are similar but not identical to the high-level software quality measures defined in ISO 9126. Each of these five areas can be assessed by measuring numerous attributes of the software, then aggregating the results into a summary health factor for that area. These health factors summarize application software quality at a level that can be related to business outcomes and value.
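To make the aggregation idea concrete, here is a minimal sketch of how per-rule measurements might roll up into summary health-factor scores. The factor names, rule names, and weights are illustrative assumptions for this example only, not CAST's actual model.

```python
# Illustrative weights: each health factor is a weighted mix of rule-violation
# rates (0.0 = rule never violated, 1.0 = violated everywhere).
HEALTH_FACTOR_RULES = {
    "Robustness":  {"empty_catch_blocks": 0.5, "unchecked_returns": 0.5},
    "Performance": {"queries_in_loops": 0.7, "unindexed_lookups": 0.3},
}

def health_factor_scores(violation_rates):
    """Aggregate per-rule violation rates into a 0-100 score per health
    factor (higher is healthier). Rules with no measurement count as clean."""
    scores = {}
    for factor, weights in HEALTH_FACTOR_RULES.items():
        penalty = sum(w * violation_rates.get(rule, 0.0)
                      for rule, w in weights.items())
        scores[factor] = round(100 * (1 - penalty), 1)
    return scores

print(health_factor_scores({"empty_catch_blocks": 0.2, "queries_in_loops": 0.1}))
# → {'Robustness': 90.0, 'Performance': 93.0}
```

The point is the shape of the calculation: many low-level code measurements collapse into a handful of numbers that can be tracked over time and related to business outcomes.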

What products and services does CAST provide? How do you differentiate your company from your competitors?

CAST's mission is to enable the world's best enterprises to achieve significantly more business productivity from their complex application software. We do this through the power and automation of our Application Intelligence Platform.

CAST is unique in being able to precisely define and measure application software quality -- key characteristics of an IT software system that have significant impact on IT costs and business productivity. At CAST, we define software quality by determining the degree to which software source code complies with 850 or so best practices in software engineering. These best practices are not just something we made up -- they come from industry standards set by ISO, SEI, MITRE, and a number of other standards groups. We keep track of evolving standards and incorporate the changes into each release.
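To illustrate what checking source code against an engineering best practice can look like, here is a toy single-rule checker that flags empty catch blocks (a rule found in many industry coding standards). This regex-based scan is a simplified sketch, not how CAST's language parsers actually work.

```python
import re

# Matches a catch clause whose body is empty, e.g. "catch (IOException e) { }".
EMPTY_CATCH = re.compile(r"catch\s*\([^)]*\)\s*\{\s*\}")

def find_empty_catches(source: str):
    """Return the 1-based line numbers of empty catch blocks
    in Java-like source code."""
    return [i for i, line in enumerate(source.splitlines(), start=1)
            if EMPTY_CATCH.search(line)]

sample = """try { risky(); }
catch (IOException e) { }
catch (Exception e) { log(e); }"""

print(find_empty_catches(sample))
# → [2]
```

A production-grade analyzer would parse the code into a syntax tree rather than pattern-match text, but the workflow is the same: encode a best practice as a check, run it over the code base, and report every violation with its location.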

Out of the box, we cover more than 30 languages and all the major technologies out there. Our deep analysis of multi-tier, multi-language software systems finds problems that can drain business performance and finds problems that are likely to occur as the code base (inevitably) evolves.

In addition to precise measures of software quality, CAST also generates precise measures of software size and complexity. These are used to accurately estimate effort and benchmark productivity.