7 March 2012 -- CSSE keynote_SEAN

Understanding
Frequent Root Causes of
System-development Failure
7 March 2012
Neil Siegel
Vice-President & Chief Engineer
Failure is not Uncommon
• The record indicates that the development of large-scale
systems remains an endeavor that often fails.
– Requiring significantly more money &/or time to complete than originally
planned
– Under-delivery of specified functionality
– Lack of suitability of the delivered system for the actual intended use
– Cancellation of the development project before a useful product has been
delivered
• For example, (Glass 2001) cites data indicating that only
about 16% of system development projects that he
examined were listed as successful by their own
developers.
• Analyses of root causes* tend to focus on factors such as
incomplete requirements, changing requirements, and so
forth.
– These are sometimes symptoms, and not causes.
– I offer four candidate root causes, and discuss how to address each.
* For example, (Boehm 5-2006)
Four Root Causes for Failures
• “More Precision than Accuracy”
• “Effective but not Suitable”
• 90-90 Failures
• Too Late / Too Expensive to be
useful
More Precision than Accuracy
• We may have a great system specification, but the “wicked”
nature of the problem prevents us from actually achieving
consensus on what the system needs to do, even if we
think we have already done so. Wicked problems are:
– Ill-defined
– Involve many stakeholders with strong and opposing views.
– Have conditions that change midstream.
– Are misunderstood until a solution is in hand.*
• In many large-scale endeavors, the social factors must be
addressed in step with the technical problems.
– So our specification (and contract, statement-of-work, design
baseline, etc.) is likely of little real value in reaching a
satisfactory conclusion to the project.
* Quoted from Steve Nixon, “Wicked Problems,” November 2011. Used with permission.
Recognizing Wicked Problems
• Every time we discuss it with the users, we get important
new insights about what the problem actually is that we are
trying to solve.
• We don’t seem actually to know who are all of the
stakeholders – we keep finding new ones.
• The problem seems actually to change.
• We can’t get the stakeholders to agree.
• “Everything should talk to everything” – we can’t seem
to bound the problem.
Adapted from Steve Nixon, “Wicked Problems,” November 2011. Used with permission.
Solving Wicked Problems
• Collaboration
• Experimentation
• Social complexity from integrated networks is a key driver.
• Traditional linear solution styles are not well-suited.
Adapted from Steve Nixon, “Wicked Problems,” November 2011. Used with permission.
“Effective but not Suitable”
• 95%+ of our specifications describe desired functionality,
but experience suggests:
– That while the resulting systems may be effective (in the sense that
they provide the specified functionality), they are not suitable (in the
sense that they fail to operate appropriately within the intended
environment, falling short in areas such as reliability, response time,
ease of use, resistance to configuration-driven errors, and so forth).
– There are many systems that are considered failures ... even after
being shown to meet their specification!
• What to do:
– Achieve far higher reliability in software-based systems.
– Design to stay within the capability and interest-level of the intended
user.
– etc.
“90-90” Failures
• Example scenario:
– We have decomposed our system into a set of small
components, each of which has been implemented.
– When we start putting the system together, however, all sorts
of failures and difficulties arise, performance is unacceptable,
and the schedule and cost estimates are repeatedly
exceeded.
• The problem is often unplanned dynamic
behavior.
• What can we do better:
– “Design for integration”
Too Late / Too Expensive to be Useful
• Example scenario:
– The amount of time (or money, or both) required to build
the capability makes it no longer of interest.
– Due to repeated breaches of cost and schedule
estimates, the development team has lost credibility with
the funders &/or users.
• What can we do:
– Agile methods
– Radical reduction in SLOC counts
Summary
• Cost increases of 2x, 3x, even 10x are signals of
something other than “requirements creep”
– Attributing failure to “lack of complete requirements” could be
interpreted as passing the blame to someone else
– I believe that we in the development community need to take more
responsibility for achieving consistently better performance
• How:
– Recognize the social aspect of our job, and thereby, deal with the
“wicked” aspects of systems development
– Recognize that we have to deliver systems that are suitable, as
well as effective
– Deal better with projected dynamic behavior in our designs, and
thereby avoid “90-90” failures
– Create methods that will allow us to deliver systems within budgets
and schedules that are of interest
Q&A
Questions?
NORTHROP GRUMMAN PRIVATE / PROPRIETARY LEVEL 1
