Review of the Mutation Article

An Analysis and Survey of the Development
of Mutation Testing
Yue Jia and Mark Harmon
A Quick Summary
For SWE6673
Introductory Comments
• Mutation testing is a fault-based technique (testing to show existence or
absence of specific faults)of developing “mutants” to be tested by a set of
test cases.
– The type of faults are mostly ---- syntax based faults
• Test case is ran against several mutant programs. The result is kept and a
“Mutation Adequacy Score” is kept:
Mutation Adequacy Score = (# of defect found)/( total # of seeded defects)
= (# of mutants killed)/(# of seeded non-equivalent mutants)
≤ 1
• In a sense, Mutation Test is evaluating how good is the set of test cases
• An area that was started in 1970’s by Lipton, DeMillo & Hamlet
Mutation Testing (Theory & Process)
• 2 Underpinning Hypotheses (assumptions):
Programmers are competent and make simple errors
Errors and faults (defects) are coupled – detection of simple faults lead to
detection of many complex faults
• Mutation Analysis & Execution Process:
For a program P, generate (develop) mutation P’ of P
Run the original program P with test cases T, fixing all the bugs in P.
Run the mutation program P’ with test cases T
• Consider P’ “killed” (mutation detected) if results of running T against P and P’ are different
D. Continue running all the P’s and score the “killed” P’s versus all the
developed P’s ----- this “score” gives us the Mutation Adequacy Score.
• But --- Mutation Testing has Problems/Weaknesses:
– There may be a “high number” of mutants and the cost of running them all.
– High amount of “human effort” required in the analysis of mutants is costly.
Mutation Test Case Problem: Mutants Reduction
• Given a set of Mutants, M and a set of test
cases T
– let MST(M) stand for mutation score of M with T
– Then the problem is to find a subset of mutants
M’ from M where:
MST(M) ≈ MST(M’)
(# of killed M/non-equivalent M)≈ (# of killed M’/non-equivalent M’)
What do you think ---? Is this really the same?
Mutation Test Case Problem: Mutants Reduction
• Many different approaches have been tried to reduce
the number of mutants for testing:
– Random Sampling : found that randomly selecting 10% of
mutants only reduces the effectiveness 16%
– Mutation clustering: grouping mutants into clusters by
traditional clustering techniques such as K-means and selecting
representative mutants from each cluster
– Selective Mutation: reducing the number of mutation operators
to reduce the number of mutants generated without significant
loss of effectiveness.
– High Order Mutation: use higher order mutants (via multiple
applications of mutation operators) to replace number of first
order mutants
Techniques for Reducing Execution (Running)
time and effort
1. Consider Strong, Weak, and Firm ways to analyze
the killing of mutants during the execution
Strong: execute the whole program and if the mutant
results differently, then it is “killed”
Weak: execute up to and include the part that is mutant
and check the result
Firm: execute somewhere between Strong and Weak
Techniques for Reducing Execution (Running)
time and effort (cont.)
2. Use different tools :
interpreter (expensive)
compiler (fast)
Compiler based (compile the whole original program P, but
only compile the mutation part for each program from M
byte code translation to different platforms (for platform
aspect oriented mutation
generate & compile different mutants
3. Use multiple platform such as distributed processors
to simultaneously execute the mutants
Detection of “Equivalent Mutant” Problem
• Detecting that a program P and one of its mutant is
equivalent is theoretically undecidable:
Program P
for (int i=0; i<5; i++)
{ no change to
value of i within loop
“Equivalent” Mutant P’
for (int i=0; i != 5; i++)
{ no change to
value of i within loop
• P’ is a mutant of program P if P’ is syntactically different
from P but is functionally same as P (e.g. produces same
• Mutation Score based on non-equivalent mutants without
complete detection of “equivalent” mutants implies that
we can never get to 100% on mutation score.
How much does this bother you ---?
Mutation Testing May be Applied:
to Various Artifacts
• To Program Source Code in following languages :
FROTRAN (22 mutant operators)
C (77 mutant operators)
JAVA (include OO mutant operators)
C++; SQL; PHP; AOP; COBOL; Spreadsheet Language
• To Specifications in :
Finite State Machine
Star Chart
Petri Net
Web services in XML
Description of runtime environments
Mutation Testing May also be Applied to ---• Constraint-based test data generation (found that 75%
of the mutants can be killed with automatically generated
test data.) [e.g. conditions in which mutants will die are
written as algebraic constraints on test cases and then
generate the test cases]
• Regression testing where we reuse the test data:
– Reschedule the test sequence based on the mutant
killing score
– Minimize the test cases that need to be re-run for
regression based on the mutant killing score of the test
Some Empirical Results
• Most of the earlier work dealt with code size of < 50 loc and
increased in size as non-academic code was used
• Mutation testing developed test cases, when compared
against “all-use” data flow test cases, actually subsumes the all
use-test cases. (16% more defects were found with mutation
generated test cases than all-use data flow)
• “Real world” software errors were compared against mutants,
and found 85% of mutants were also “real world” faults.
• Human errors and mutants are different enough that both
automated and human generated faults are needed for testing
Some Concluding Remarks
• Tools for Mutation Testing have increased in
number since 2000. (30 years after its
inception ---- technology maturation takes
that long)
• Continued Barrier:
– Perception of complexity and high cost
– Decreasing (can not be solved)Equivalent
Mutants problem is gaining some momentum
Does thinking about Mutants provide us insight into defects and help us generate
better test cases? ----- Your thoughts?
Now that you have read about Mutation Testing --consider the following
pseudo – code
Integer m, n, max ;
Input m, n;
If (m ≥ n)
max = m;
max = n;
print max;
Input pairs as test cases
{5,2 }
max = 5
{10, 305} max = 305
{20, 20}
max = 20
{-23, -5}
max = -5
{ 0, -4}
max = 0
• Is this test set pretty good --- what mutant will not be
killed? ----- what type of defect was this?

similar documents