September / October 2003 Feature Article

NIST Pickings

Introduction

This article introduces and discusses the NIST May 2002 report, "The Economic Impacts of Inadequate Infrastructure for Software Testing".

A report prepared for the National Institute of Standards and Technology in the US Department of Commerce has established an economic model. This was used to estimate that the costs of having an inadequate infrastructure for software testing in the United States are between USD 22.2 billion and USD 59.5 billion. This works out to approximately 0.2 to 0.6 percent of the USD 10 trillion Gross Domestic Product (GDP).
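As a quick check on the percentages quoted above, the arithmetic can be sketched in a few lines of Python. The figures are those stated in this article; the variable and function names are my own and purely illustrative.

```python
# Figures as quoted in this article, in USD billions.
cost_low = 22.2
cost_high = 59.5
gdp = 10_000.0  # USD 10 trillion


def share_of_gdp(cost, gdp_total):
    """Return a cost as a percentage of GDP."""
    return cost / gdp_total * 100


low_pct = share_of_gdp(cost_low, gdp)
high_pct = share_of_gdp(cost_high, gdp)
print(f"{low_pct:.1f}% to {high_pct:.1f}% of GDP")  # prints "0.2% to 0.6% of GDP"
```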

According to the NIST report, about sixty percent of these costs are borne by users in the form of error avoidance and mitigation activities. Software suppliers bear the remaining forty percent of costs in the form of additional testing resources consumed due to inadequate testing tools and methods (inadequate infrastructure).

The NIST report has come under a good deal of criticism from well-known figures in the United States testing fraternity, who point out several weaknesses in the report's survey processes and several dangerous assumptions that the report makes.

The Report Table of Contents

Executive Summary

  • Introduction to Software Quality and Testing
  • Software Testing Methods and Tools
  • Inadequate Infrastructure for Software Testing: Overview and Conceptual Model
  • Taxonomy for Software Testing Costs
  • Measuring the Economic Impacts of an Inadequate Infrastructure for Software Testing
  • Transportation Manufacturing Sector
  • Financial Services Sector
  • National Impact Estimates
References
Appendices

The full report with appendices is quite lengthy at 279 pages, and can be downloaded at http://www.nist.gov/director/prog-ofc/report02-3.pdf.

Before I discuss some of the criticisms, I will highlight some important aspects that the report raises, and then introduce the basic arguments and logic of the report.

Important Aspects of the NIST Report

The first point to realise is that the department of commerce of a major country (the USA) is sufficiently aware of the strategic importance of software testing to commission, or at least to permit, a study of the current status of software testing in the country and how it might be improved.

The report states in its executive summary that, "Software has become an intrinsic part of business over the last decade. Virtually every business in the U.S. in every sector depends on it to aid in the development, production, marketing, and support of its products and services". It goes on to mention that, "In 2000, total sales of software reached approximately $180 billion", and that, "Rapid growth has created a significant and high-paid workforce, with 697,000 employed as software engineers and an additional 585,000 as computer programmers".

I've often verbalised the above points by saying that almost every business nowadays depends on innovation and service delivery for success. Service delivery is strongly related to IT capacity and success. IT capacity and success are ensured by the implementation of good quality and testing principles.

The NIST report goes on to state that, "Reducing the cost of software development and improving software quality are important objectives of the U.S. software industry".

These objectives, I dare say, would be as important to the South African software industry, and by extension to all South African business and the Government.

The report states what we also know and experience locally, that, "Software non-performance and failure are expensive. The media is full of reports of the catastrophic impact of software failure". Our own Business Day newspaper often runs reports of financial transaction failures, security breaches, performance problems, and operational errors relating to software.

Further nuggets from the report are found in the software suppliers' comments that follow:

  • "…software testing is still more of an art than a science…" (my comment: Ouch!).
  • "…(we) spend significant resources responding to software errors (mitigation costs) and lowering the probability and potential impact of software errors (avoidance cost)."
  • "All developers of financial services software agreed that an improved system of testing was needed."

The NIST report also highlights that costs fall both on suppliers, as increased supply costs, and on users, as absorbed costs such as loss of productivity because of poor or failed software.

Basic Arguments and Logic of the Report

Section 1 of the NIST report introduces software quality and testing principles, and suggests that improved testing and measurement can reduce the costs of developing software of a given quality. The report suggests that developing standard testing tools and metrics could go a long way toward addressing some of the testing problems that plague the industry.

Testing is defined as, "the dynamic execution of software and the comparison of the results of that execution against a set of pre-determined criteria."

The impact of inadequate testing is grouped into four general categories:

  1. Increased failures due to poor quality,
  2. Increased software development costs,
  3. Increased time to market due to inefficient testing, and
  4. Increased market transaction costs.

The report also shows how the cost of repairing an error increases the later in the software development life cycle it is fixed.

Section 2 of the NIST report discusses software testing methods and tools. Tools support the test phases or stages, and underlying the whole testing effort (stages and tools) is a set of standardised software testing technologies (in effect, a preview of the infrastructure intended to improve current testing).

Section 3 of the report develops a conceptual mathematical model for estimating the economic impact of improved software quality represented by reduced error numbers compared to current error numbers.

Section 4 of the report presents a taxonomy for software testing costs.

Software suppliers (developers) have the following costs:

Pre-release Costs

  • Pre-release labour costs to support testing
  • Hardware costs that support testing
  • Software costs that support testing
  • External testing (by external testing consultants and companies) costs

Post-release costs

  • After-sales service costs

Software users have the following costs:

Pre-purchase costs

  • Purchase decision labour costs and delayed decision costs
  • Delayed adoption costs

Installation costs associated with bugs

  • Labour hours of company employees
  • Labour costs of consultants
  • Lost sales, company downtime due to extended installation

Post-purchase costs associated with bugs

Product failure and repair costs

  • Labour costs - Product failure and repair costs of own employees
  • Capital costs - Scrapping of ineffective systems
  • Consultants - Repairing of data archives
  • Sales forfeited - Company downtime attributable to lost data

Inability to fully accomplish tasks

  • Labour costs - of employees to implement "second best" practices
  • Sales forfeited - due to "second best" operating practices

Redundant systems

  • Hardware costs - Multiple hardware systems maintained in case of system failure
  • Software costs - Licensing or updating old software after a shift to new software system
  • Labour costs - Time of employees or contractors to maintain a redundant hardware or software system

Section 5 of the report assumes that the total number of errors introduced into the software process will be constant, regardless of the types of tools available for software testing. The number of errors found at the various software development life cycle phases is considered together with the cost of repairing them at the phase in which they are detected. The better testing infrastructure is then assumed to find more errors earlier, and also to reduce the cost of repair at any given life-cycle phase.
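The logic of this section can be illustrated with a small sketch. Every number below is hypothetical and chosen only to show the mechanics; none is taken from the NIST report. The phase names, error count, detection shares, and repair hours are all my own assumptions.

```python
# Hypothetical illustration only: none of these figures come from the NIST report.
PHASES = ["requirements", "coding", "integration", "beta test", "post-release"]

TOTAL_ERRORS = 1000  # held constant in both scenarios, as the report assumes

# Share of all errors detected in each phase (each list sums to 1.0).
current_detection = [0.05, 0.20, 0.30, 0.25, 0.20]
improved_detection = [0.10, 0.35, 0.30, 0.15, 0.10]  # more errors found earlier

# Hours needed to repair one error, by the phase in which it is detected.
current_hours = [1, 2, 5, 10, 20]
improved_hours = [1, 2, 4, 8, 16]  # repair is also cheaper in the later phases


def total_repair_hours(total_errors, detection, hours_per_error):
    """Sum, over phases, the errors found in a phase times the repair hours per error."""
    return sum(total_errors * share * hours
               for share, hours in zip(detection, hours_per_error))


baseline = total_repair_hours(TOTAL_ERRORS, current_detection, current_hours)
improved = total_repair_hours(TOTAL_ERRORS, improved_detection, improved_hours)
saving = baseline - improved

for phase, before, after in zip(PHASES, current_detection, improved_detection):
    print(f"{phase:>13}: {before:.0%} of errors found now, {after:.0%} with better tools")
print(f"repair effort: {baseline:.0f} h now, {improved:.0f} h improved, {saving:.0f} h saved")
```

With the error count fixed, the whole saving comes from two levers: shifting detection towards the earlier, cheaper phases, and lowering the repair cost within a phase.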

This modelling guided the specific survey question designs that were then used in section 6 and section 7.

Section 6 of the report calculated absolute and feasible savings from a better testing infrastructure for the transportation manufacturing sector. The calculation was based on respondents' reports of their current status and on their estimates of the likely improvements (reduction of errors) that a better testing infrastructure would bring.

Section 7 of the report made the same calculation of absolute and feasible savings for the financial services sector, again based on respondents' reports of their current status and on their estimates of the likely improvements (reduction of errors) that a better testing infrastructure would bring.

Section 8 of the NIST report finally extrapolated the findings from the two sectors to the commercial economy of the whole country (excluding the government and family sectors).

Some of the Criticisms of the Report

James Bach criticises the report for relying on a fundamentally flawed survey, and questions the use of respondents' largely qualitative answers to derive quantitative results. Admittedly, the report writers do suggest that the results are only estimates.

My initial reading of the report led me to believe that the results could be out on the low side by a multiple of about 5 from what could be the real saving, if known testing and quality techniques were rigorously applied throughout the software industry.

Bret Pettichord questions the implied argument that better testing tools will produce such strong results. He suggests that better training, better languages, better development practices, or alternative testing processes to those mentioned in the report might be as relevant to quality improvements.

Cem Kaner has various concerns but also suggests that testing alone is not the sole producer of quality as measured by bug reduction.

He states that, "the reason that Juran's distinction is important is this. If I gave a project manager an extra $100,000 and said, 'Go forth and improve the quality of the product', I suspect that we might get a distribution of expenditure as follows:

  • $20,000 on more design reviews and inspections
  • $40,000 on more features (satisfiers)
  • $5,000 on better documentation
  • $2,500 on developing better processes
  • $10,000 on management overhead
  • $5,000 on equipment to support the expansion
  • $7,500 on better tech support
  • $10,000 on more testing

The percentages are totally fictitious but you get the idea. Each of those expenditures might in fact improve the quality of the product. The best allocation of money will depend on lots of project-specific details, but it is unlikely to go 100% or even 50% toward more testing".
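Kaner's list gives dollar amounts but speaks of percentages. Converting one to the other is simple arithmetic, sketched below. The amounts are those in the quotation; the shortened labels and variable names are my own.

```python
# Kaner's illustrative (and, in his words, fictitious) split of an extra USD 100,000.
allocation = {
    "design reviews and inspections": 20_000,
    "more features (satisfiers)": 40_000,
    "better documentation": 5_000,
    "better processes": 2_500,
    "management overhead": 10_000,
    "equipment": 5_000,
    "better tech support": 7_500,
    "more testing": 10_000,
}

total = sum(allocation.values())
shares = {item: amount / total * 100 for item, amount in allocation.items()}

for item, pct in shares.items():
    print(f"{pct:5.1f}%  {item}")
```

On these figures, testing receives one tenth of the extra budget, while new features receive four times as much.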

Other serious testing practitioners have been drawn into strong debate about the NIST report. I think this is good and, as I've heard it said of public relations, "any news is good news, as long as we're in the media".

The critiques are good, and generally these testers have had good and bad to report about NIST. I believe the debate will educate and inform the non-testing public as to some of the real issues in testing and software quality.

Some of my criticisms of the NIST report

"Most bugs are introduced at the unit stage (Section 4, Page 4-3, paragraph 4.1.3)". Capers Jones suggests that most defects, 60 percent or more, are introduced at the requirements stage. I believe Capers Jones, and have seen many errors of fact caught by highlighting peoples thinking in writing that is then subjected to formalised software inspections (not mentioned in the NIST report).

Standardised software-testing technologies; such as standard reference data, reference implementations, test procedures, metrics, measures, test scripts, and test cases (both manual and automated); are seen as the only, or the most significant means of error reduction and thus quality improvement. If this were so easy, I expect that most tool vendors would have made breakthroughs by now. Despite huge research and development investments, and incentives to improve tools, we still "have what we have". I believe improved tools will come, but I'm not holding my breath in the next three years.

The NIST user survey results contradict credible studies, some of which are quoted in the NIST report. The cost escalations of defect repair ratios with lateness of stage of the development cycle in which they are repaired are inconsistent with user responses. The user responses to estimates of hours of repair time (pages 6-10, 6-11, 7-12, 7-13) contradict (underestimate) the measured studies of Boehm (1976) and Bazuik (1995) as reported on page 1-13 of the NIST report.

The distribution of bugs based on introduction point in the financial services sector attributes only 30 percent of bugs introduced in the requirements phase (page 7-11). This is lower than Capers Jones average statistics that suggest 60 percent or more bugs are introduced in the requirements phase. It is also at variance with my experience of the South African financial services sector. A very low 15.6 percent of bugs are attributed to the requirements phase in the transportation-manufacturing sector (page 6-10).

A Cigarette Box Calculation of My Own

Without the benefit of hours of research, I would like to table my own perception of what software quality improvement savings could do for the U.S. software industry and percentage GDP, and also for South Africa by implication.

In 2000 the USA had 1,282,000 software engineers and programmers (NIST report).

Their hourly salary was $65.50, and I assume they each worked 1800 hours (one conservative person-year). Raytheon (a U.S. development supplier with about 1500 IT staff in 1990) has reported savings of 35% on software rework (much of this rework is internal) after applying formalised software inspections as a key technique amongst other quality interventions. I vaguely recall that their return on investment in one year was about seven times their quality cost investment.

This gives me a rough maximum potential cost saving from similar quality improvements of

- 1,282,000 * USD 65.50 * 1800 * 0.35 * 6/7 = USD 45,344,340,000

That is about USD 45 billion.
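The same back-of-the-envelope figure can be reproduced in a few lines of Python. This is only a sketch using the inputs stated above. The 6/7 factor is read here as the saving net of the quality investment (at a seven-fold return, the investment is one seventh of the gross saving), which is an assumption that the text above does not spell out. The variable names are illustrative.

```python
from fractions import Fraction

# Inputs as stated in this article.
engineers_and_programmers = 697_000 + 585_000  # NIST report figures for 2000
hourly_rate = Fraction("65.50")                # USD per hour
hours_per_year = 1800                          # one conservative person-year
rework_saving = Fraction(35, 100)              # Raytheon's reported 35% saving
net_of_investment = Fraction(6, 7)             # assumed reading: saving less a 1/7 investment

labour_bill = engineers_and_programmers * hourly_rate * hours_per_year
supplier_saving = labour_bill * rework_saving * net_of_investment

print(f"USD {int(supplier_saving):,}")  # prints "USD 45,344,340,000"
```

Exact fractions are used so that the result matches the hand calculation to the dollar. Note that 0.35 * 6/7 is exactly 0.3, so the net saving amounts to 30 percent of the total labour bill.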

If we now add the estimated user savings from maximum reduction of errors (NIST report), we get:

- Approximately USD 45 billion + 35 billion = USD 80 billion

This is the savings potential in the U.S. from applying known scientific testing techniques. These techniques do not seem common amongst NIST respondents, who suggested testing is still largely an art.

This exceeds the NIST feasibility estimate of USD 22.2 billion by a multiple of 3.6, and does so with the existing testing infrastructure.

This also represents 0.8% of the U.S. GDP, and I'm sure in South Africa we have similar room for improvement.
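For readers who want to check the totals, the closing arithmetic can be sketched as follows. The inputs are the rounded figures given in this article, in USD billions; the variable names are my own.

```python
# Rounded figures from this article, in USD billions.
supplier_saving = 45.0  # from the rough calculation above
user_saving = 35.0      # user savings figure quoted above from the NIST report
nist_feasible = 22.2    # NIST feasibility estimate
gdp = 10_000.0          # USD 10 trillion

total_saving = supplier_saving + user_saving
multiple = total_saving / nist_feasible
gdp_share = total_saving / gdp * 100

print(f"total USD {total_saving:.0f} billion, "
      f"{multiple:.1f} times the NIST estimate, "
      f"{gdp_share:.1f}% of GDP")
# prints "total USD 80 billion, 3.6 times the NIST estimate, 0.8% of GDP"
```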

A Challenge to Our Software Industry and Government and You

Let us apply formalised software inspections, better testing techniques, existing (and improving) testing infrastructure, better test training, better testing standards, better development techniques, and the whole concert of forces necessary (and possible) to pick up our GDP here in South Africa.


Wayne Mallinson
waynem@testdata.co.za

 
