Video Summary and Transcription
Mutation testing is a method to improve test quality by inserting bugs into code to check whether the tests can detect them. Mutation testing frameworks like StrykerJS can perform many kinds of mutations. Mutation testing provides a mutation score that is a better tool than code coverage for measuring test quality. It can help identify missing tests or bugs in existing tests. Stryker is recommended for JavaScript and TypeScript mutation testing.
1. Introduction to Mutation Testing
Hi, I'm Simon, a software engineer at InfoSupport. Let's talk about how mutation testing can improve test quality. Code coverage is a bad metric for measuring test quality. Mutation testing inserts bugs into code to check whether tests can detect them. It generates mutants and runs the tests against each one to see whether any test fails. Hundreds or thousands of mutations are performed to generate a comprehensive report. An example of a mutation is changing a greater-than-or-equal sign to a less-than sign.
Hi, I'm Simon. I'm a software engineer working at InfoSupport, and today I want to talk to you about how mutation testing can help to improve the quality of your tests. But before we dive into mutation testing, I first want to talk a bit about code coverage.
Code coverage is often used by developers as a means to measure the quality of their tests. In my eyes, it's a bad metric to measure this. The only thing code coverage actually tests is whether or not code gets executed. The reason I write unit tests is to verify that my code works right now and to make sure it will keep working in the future. And if something happens and my code behaves differently, I will get a failing test. And again, code coverage does not measure this, it only measures if your code gets executed.
So, how do we actually measure the quality of your tests? Well, that brings me to the topic of this talk, mutation testing. Mutation testing is a way to insert bugs into your source code to actually test if your tests can pick them up. How does this work? Well, a mutation testing framework will start with your source code, which is all just happy and fine and nothing is wrong with it. Then it will make one small mutation to your source code, and this will generate a mutant. For example, A plus B could be mutated into A minus B. This will result in a different outcome for your source code. And because your source code has changed, it should cause a failing test. So, the mutation testing framework will run your tests and there are two possible outcomes. Either the mutant has been detected because of a failing test, in which case the mutant has been killed, or none of your tests failed and the mutant was able to survive. In mutation testing, we want to have killed mutants. So, we want tests to actually fail because we have inserted a bug. A mutation testing framework will do this hundreds or thousands of times for your source code, and in the end, it will combine it all together and generate a nice report.
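As a rough illustration of the kill-or-survive loop described above, here is a minimal TypeScript sketch. It is not how any particular framework is implemented: the types, the names (`runMutant`, `additionMutants`, and so on) and the second mutant (`a * b`) are assumptions made up for this example. Only the `a - b` mutation comes from the talk.

```typescript
// The code under test and each test are modelled as plain functions.
type BinaryOp = (a: number, b: number) => number;
type OpTest = (op: BinaryOp) => boolean; // true means the test passes

interface Mutant {
  description: string;
  implementation: BinaryOp;
}

type Outcome = "killed" | "survived";

// Run the whole test suite against one mutant.
// A single failing test is enough to kill the mutant.
function runMutant(mutant: Mutant, suite: OpTest[]): Outcome {
  const allPass = suite.every((test) => test(mutant.implementation));
  return allPass ? "survived" : "killed";
}

// Hypothetical mutants of the original `a + b`.
const additionMutants: Mutant[] = [
  { description: "a + b -> a - b", implementation: (a, b) => a - b },
  { description: "a + b -> a * b", implementation: (a, b) => a * b },
];

// A single (deliberately weak) test: 2 + 2 should equal 4.
const additionTests: OpTest[] = [(add) => add(2, 2) === 4];

// One entry per mutant, similar in spirit to a framework's report.
const additionReport = additionMutants.map((mutant) => ({
  mutant: mutant.description,
  outcome: runMutant(mutant, additionTests),
}));
// additionReport[0].outcome is "killed"   (2 - 2 is 0, so the test fails)
// additionReport[1].outcome is "survived" (2 * 2 is 4, so the test still passes)
```

The surviving second mutant shows why a single test with convenient inputs may not be enough: a test such as `add(2, 3) === 5` would kill both mutants.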
So, how does this mutating work in actual code? Well, here I have a small JavaScript function on the top to check if a customer is allowed to buy alcohol. This customer is allowed to buy alcohol in this country if the age is at least 18. And below I have a test for Professor X, who is age 96 and is thus allowed to buy alcohol. A mutation testing framework will look at the source code above and make a small change. In this case, the greater-than-or-equal sign will be changed into a less-than sign, thus flipping the check around. If we make this change in our source code ourselves, the test will fail, thus the mutant has been killed. This proves that we have a test for this specific case and this bug cannot be inserted into our code without us knowing it. We can also change it in another way. For example, we can change it into a greater-than sign.
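The slide code is not part of this transcript, so the following TypeScript sketch is only a guess at what it might look like. The names `Customer`, `isAllowedToBuyAlcohol` and `testProfessorX` are assumptions, and the mutant is written out by hand as a second function to make the effect visible.

```typescript
interface Customer {
  name: string;
  age: number;
}

// Original check: a customer may buy alcohol from age 18.
function isAllowedToBuyAlcohol(customer: Customer): boolean {
  return customer.age >= 18;
}

// Hand-written mutant: `>=` has been changed into `<`.
function isAllowedMutantLessThan(customer: Customer): boolean {
  return customer.age < 18;
}

const professorX: Customer = { name: "Professor X", age: 96 };

// The only test: Professor X (age 96) should be allowed to buy alcohol.
// It takes the check as a parameter so it can run against the mutant too.
function testProfessorX(check: (customer: Customer) => boolean): boolean {
  return check(professorX) === true;
}

const passesOnOriginal = testProfessorX(isAllowedToBuyAlcohol); // true
const passesOnMutant = testProfessorX(isAllowedMutantLessThan); // false, so the mutant is killed
```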
2. Mutation Testing and Frameworks
We started with greater than or equal, now it's greater than. Changing the entire return statement to always return true also lets the tests pass. Mutation testing frameworks allow for various mutations like changing signs, emptying strings and arrays, and flipping operations. StrykerJS is recommended for JavaScript and TypeScript, but there are options for other languages as well. A small demo of StrykerJS is available on our website. Running mutation tests can be slower for large applications due to the number of mutations being tested. The demo application generated 126 mutants from 12 source files.
We started with greater than or equal, now it's greater than. Since we only have one test case for someone aged 96, all our tests pass and the mutant has survived. We can also change the entire return statement and just simply always return true. In our case, the test still passes and we have a surviving mutant.
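To make the two survivors concrete, here is a small TypeScript sketch. The age check is reduced to a function on a number, and the two extra tests (for ages 18 and 17) are hypothetical additions that are not part of the talk. They show one way the surviving mutants could be killed.

```typescript
type AgeCheck = (age: number) => boolean;
type AgeTest = (check: AgeCheck) => boolean; // true means the test passes

const originalCheck: AgeCheck = (age) => age >= 18;
const mutantGreaterThan: AgeCheck = (age) => age > 18; // `>=` changed to `>`
const mutantAlwaysTrue: AgeCheck = () => true; // whole return value replaced

// The existing test from the talk: someone aged 96 is allowed.
const existingTest: AgeTest = (check) => check(96) === true;

// Hypothetical extra tests around the boundary.
const boundaryTest: AgeTest = (check) => check(18) === true;
const underageTest: AgeTest = (check) => check(17) === false;

// A mutant survives when every test in the suite still passes.
function survives(mutant: AgeCheck, suite: AgeTest[]): boolean {
  return suite.every((test) => test(mutant));
}

const suiteBefore: AgeTest[] = [existingTest];
const suiteAfter: AgeTest[] = [existingTest, boundaryTest, underageTest];

// With only the age-96 test, both mutants survive.
const greaterThanSurvivesBefore = survives(mutantGreaterThan, suiteBefore); // true
const alwaysTrueSurvivesBefore = survives(mutantAlwaysTrue, suiteBefore); // true

// With the boundary tests added, both mutants are killed.
const greaterThanSurvivesAfter = survives(mutantGreaterThan, suiteAfter); // false
const alwaysTrueSurvivesAfter = survives(mutantAlwaysTrue, suiteAfter); // false
```

Note that the single age-96 test already executes every line of the check, so line coverage alone would not reveal the gap.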
This is very simple code. Most of our code does not look like this and most of our tests are also more complicated. While someone looking at this code might be able to say, hey, obviously you're missing some test cases, in the production code that we're writing, it's often a lot more difficult.
So, what kind of mutations can you expect from mutation testing frameworks? Well, the definitive list depends on the framework and the language that you're using. But in most cases, mutations like this are possible. For example, changing the plus sign into a minus sign, emptying strings, emptying arrays, flipping operations around, you can do a lot of things as a very small change in your code. And only one of these changes will be active at a time to ensure we know which mutation is causing a test to fail.
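The "one change at a time" idea can be sketched with a toy mutant generator in TypeScript. This is a deliberately naive, text-based illustration: real frameworks typically work on a parsed representation of the code rather than on raw strings, and the list of replacements here is an assumption chosen for the example.

```typescript
interface Replacement {
  from: string;
  to: string;
}

// A few illustrative mutation operators (not any framework's real list).
const replacements: Replacement[] = [
  { from: ">=", to: "<" },
  { from: "+", to: "-" },
  { from: "&&", to: "||" },
];

// Produce one mutant per occurrence of each operator.
// Every mutant differs from the original source in exactly one place.
function generateMutants(source: string): string[] {
  const result: string[] = [];
  for (const { from, to } of replacements) {
    let index = source.indexOf(from);
    while (index !== -1) {
      const mutated =
        source.slice(0, index) + to + source.slice(index + from.length);
      result.push(mutated);
      index = source.indexOf(from, index + from.length);
    }
  }
  return result;
}

const exampleSource = "return a + b >= c + d;";
const generatedMutants = generateMutants(exampleSource);
// generatedMutants contains three mutants:
//   "return a + b < c + d;"
//   "return a - b >= c + d;"
//   "return a + b >= c - d;"
```

Because each mutant carries a single change, a failing test can be attributed to exactly one mutation.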
Well, mutation testing frameworks are available for a lot of different languages. And since we're at a JavaScript conference, I would recommend StrykerJS as the framework to use for JavaScript and TypeScript. But mutation testing is available for pretty much every language that you know. If it's not on this list, simply Google it for your language and you will probably find something.
So, for StrykerJS, I have a small demo. The demo application that we have is also available on our website. The link will be at the end of the slides, with information on how you can set this up yourself. I can start Stryker from the command line. And it will start running our tests. Since the demo application is a small application, it's quite fast. If you have a large enterprise application, you will most likely notice that mutation testing is quite a bit slower than running the unit tests, because there are thousands of mutations being made and each and every one of them has to be tested. So, my laptop is starting up right now. It's testing. And in about a second or two, it will actually be done. For our application, it has 12 source files to mutate and it was able to generate 126 mutants. So, it's done. We get a nice table here as an output. But we also have an HTML report.
3. Importance of Mutation Testing
Mutation testing provides a mutation score based on the number of mutants killed. It is a better tool than code coverage for measuring test quality. Looking at specific mutations can help identify missing tests or bugs in existing tests. I recommend trying mutation testing with Stryker for JavaScript and TypeScript.
And if we look at this report, we can see that we have a mutation score. That's the share of mutants we were able to kill. The more mutants you kill, the higher your score. And just as with code coverage, you probably want a high score.
In this case, we have a score of about 85% for our application. Well, if we look at code coverage for this application, we would have 100% code coverage. So, if we only used code coverage as a metric, we would be fooled into thinking we tested this properly.
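The score itself is simple arithmetic. The TypeScript sketch below assumes the simplest definition, killed mutants as a percentage of all mutants. A real framework may count further categories (for example mutants that time out), and the numbers used here are hypothetical, not the actual counts from the demo.

```typescript
interface MutationCounts {
  killed: number;
  survived: number;
}

// Mutation score as a percentage: killed mutants divided by all mutants.
// Returns 0 when there are no mutants at all (an arbitrary choice for this sketch).
function mutationScore({ killed, survived }: MutationCounts): number {
  const total = killed + survived;
  if (total === 0) {
    return 0;
  }
  return (killed * 100) / total;
}

// Hypothetical counts: 17 killed and 3 survived gives a score of 85.
const exampleScore = mutationScore({ killed: 17, survived: 3 });
```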
Now if we look at specific files, we can also look at specific mutations, for example, the age check here, and we notice that we have an equality operator mutant that survives. To me this indicates that we are missing a test because we were able to change the greater than sign to greater than equals. So, in a report like this for your source code, you can actually figure out what kind of tests you are missing or if you have any bugs in your tests.
So, to summarize, mutation testing in my opinion is a far better tool to measure the quality of your tests. I would highly recommend you try it out. For JavaScript and TypeScript, I recommend Stryker. There's an example available on our website.