Part 3: Limitations of mutation testing
In this part I will walk through an example that shows some limitations of mutation testing.
Series Overview
- Part 1: Introduction to mutation testing
- Part 2: Setting up Stryker.Net
- Part 3: Limitations of mutation testing
You can find all code examples in a repository on GitHub.
Example where mutation testing struggles
Unfortunately, mutation testing is not a silver bullet that solves all our testing woes. It is a tool that can help us detect untested functionality in some cases, but in others it falls flat on its face.
I will present one such example using a .NET-specific mutator, the one that mutates FirstOrDefault() to First() and vice versa.
You can follow along with my example repository starting with commit aa3c0aa.
The premise here is that we have a scoreboard which tracks the score for each student. When adding a new score, we check whether we already have an entry for the student and, if so, add the value to the existing score. If it is the student's first score, we create a new entry. If the student has no entry and we try to get their score, we simply return zero.
In our tests we check the behaviour with two scores and with none.
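To make the premise concrete, here is a minimal sketch of what such a scoreboard could look like. The real class lives in the example repository; the names ScoreEntry, AddScore and GetScore are assumptions made for illustration, and only the Any() condition and the FirstOrDefault() call are taken from the text of this post.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical reconstruction for illustration, not the repository code.
public class ScoreEntry
{
    public string Student { get; set; } = "";
    public int Score { get; set; }
}

public class Scoreboard
{
    public List<ScoreEntry> Scores { get; } = new List<ScoreEntry>();

    public void AddScore(string student, int score)
    {
        // First score for this student: create a new entry and leave the method.
        if (!Scores.Any(_ => _.Student.Equals(student)))
        {
            Scores.Add(new ScoreEntry { Student = student, Score = score });
            return;
        }

        // Only reached when an entry already exists.
        // This is the call the Linq mutator switches to First().
        var entry = Scores.FirstOrDefault(_ => _.Student.Equals(student));
        entry.Score += score;
    }

    public int GetScore(string student)
    {
        // A student without an entry simply has a score of zero.
        var entry = Scores.FirstOrDefault(_ => _.Student.Equals(student));
        return entry?.Score ?? 0;
    }
}
```

With this sketch, adding 3 and then 4 for the same student yields a score of 7, while asking for a student who was never added yields 0. These are the two cases the tests cover.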
Execute dotnet stryker --solution-path MutationTesting.sln and then take a look at the new report.
We see that mutant two survives: our tests do not catch the switch to First().
Looking at the code, we find if (!Scores.Any(_ => _.Student.Equals(student))) a few lines before our mutant.
That condition seems familiar, and it is.
It checks with the predicate _ => _.Student.Equals(student) whether we already have an entry for this student.
We use the exact same predicate for our mutation target FirstOrDefault().
Following the branch for when there is no entry yet, we see that a new entry is created and the method is then left via return.
This means the FirstOrDefault() call is only reached when there already is an entry for the student.
Since we always have an entry at that point, the default part is actually useless and we should probably switch to calling First().
We switch the call to First() and are now at commit c76f6c8.
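The reason the tests cannot tell the two calls apart is that First() and FirstOrDefault() only differ when nothing matches the predicate. The following sketch illustrates this with a hypothetical helper (BehaveTheSame is not part of the example repository): it runs both calls and reports whether they agree.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstVersusFirstOrDefault
{
    // Hypothetical helper: returns true when First() and FirstOrDefault()
    // yield the same element, false when First() throws because nothing matches.
    public static bool BehaveTheSame(IEnumerable<string> items, Func<string, bool> predicate)
    {
        var viaDefault = items.FirstOrDefault(predicate);
        try
        {
            var viaFirst = items.First(predicate);
            return viaFirst == viaDefault;
        }
        catch (InvalidOperationException)
        {
            // First() throws when no element matches,
            // whereas FirstOrDefault() returned the default (null) instead.
            return false;
        }
    }
}
```

For a list containing "Alice" and "Bob", searching for "Alice" gives the same result with both calls, while searching for "Carol" does not. Because the preceding Any() check guarantees a match, our code never reaches the second situation, so no test can kill the mutant.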
Call dotnet stryker --solution-path MutationTesting.sln and check the new report.
Oh no, we still have a surviving mutant.
Expanding mutant two shows that it now does the reverse and switches First() to FirstOrDefault().
But we just learned that this makes no sense here!
Well, we know that, but as of now Stryker doesn’t.
I’d say this case could be avoided with static analysis.
If we know the result of a call to Any(), then we know whether FirstOrDefault() will always return the default or will always return a value (provided nothing changes the collection in the meantime).
Sadly, this is quite a complex feature and not implemented in Stryker.Net. In the future it may catch such issues, but for now we just have to ignore them.
Ignore Mechanisms
Stryker offers a few mechanisms to ignore or rather exclude code parts.
Using the mutate option we can include or exclude folders, files or even only certain spans of text. Globbing and wildcards are supported. Note that by default all files are included and that relative paths start from the project folder, not the root.
Another option is to ignore methods, or rather their parameters.
You can specify either just the Methodname or Type.Methodname.
Additionally, wildcards are supported and you can use ctor for constructors.
Note that effectively only the expressions between the brackets are ignored; if they are accessed before or after the method call, they can still be mutated.
Lastly, we can also control which mutations are used. The mutation level specifies which mutations are active, and we can also exclude individual mutations.
We can run the following command to exclude our wrong mutation.
dotnet stryker --solution-path MutationTesting.sln --mutate "['!Scoreboard.cs{867..932}']"
Our report now has a mutation score of 100%.
We specify that we want to ignore the characters 867 to 932 in Scoreboard.cs. This means that as soon as someone writes some code before that part or refactors something, the exclusion is going to break. Personally, I’d rather have the option to write a comment on the line before to ignore that mutation.
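To see why a character span is so fragile, consider this small sketch. It uses a hypothetical helper, OffsetOf, which has nothing to do with Stryker itself and only computes where a snippet starts inside a source text. I am assuming here that the span is counted in characters from the start of the file, as described above.

```csharp
using System;

public static class SpanDemo
{
    // Hypothetical helper: returns the zero-based character offset at which
    // snippet starts in source, or -1 when it does not occur.
    public static int OffsetOf(string source, string snippet)
        => source.IndexOf(snippet, StringComparison.Ordinal);
}
```

If we take a source text and prepend a single comment line to it, OffsetOf returns a value that is larger by exactly the length of that comment line. A span that was hard-coded on the command line would now point at different code, and the mutant we meant to exclude would show up in the report again.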
Equivalent Mutants
A common problem in mutation testing is so-called equivalent mutants. These are mutants that behave exactly like the original code and thus cannot be detected by any test.
Looking at our Scoreboard class, an example would be switching First() with Last().
Since we only have one entry per student, both calls would return the same result, because that one entry is both the first and the last element for that student.
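The equivalence can be checked with a short sketch. FirstEqualsLast is a hypothetical helper and not a Stryker mutator: it compares the position found by First() with the one found by Last() for the same predicate.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EquivalentMutantDemo
{
    // Hypothetical helper: true when First() and the imagined Last() mutant
    // pick the very same element for the given student.
    public static bool FirstEqualsLast(IEnumerable<string> students, string student)
    {
        // Remember each element's position so duplicates can be told apart.
        var indexed = students.Select((name, index) => new { name, index }).ToList();
        int first = indexed.First(e => e.name == student).index;
        int last = indexed.Last(e => e.name == student).index;
        return first == last;
    }
}
```

As long as every student appears only once, both calls pick the same element and no test could ever kill such a mutant. The equivalence rests entirely on that invariant: with a list in which the same student appears twice, the two calls would diverge.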
Some equivalent mutants can be avoided by thoughtful mutator design. My example is not an actual mutator, probably for exactly that reason. Others cannot be avoided and will have to be ignored.