What we talk about when we talk about unit testing – Kevlin Henney

When we talk about testing, often the first thing that people think about is finding bugs. Small things like a crashing app, or catastrophic failures:

[Image: 2015-04-23 11.06.18]

[Image: 2015-04-23 11.07.45]

The second one is interesting, because we know exactly which code caused it.

[Image: 2015-04-23 11.08.02]

This is “adatran” according to Kevlin: Ada written by Fortran programmers. The code was written for Ariane 4, but executed on Ariane 5. There was no threshold check, because exceeding the value “would never happen”. Except it did, on another machine. To make it worse, this code was not even needed: it was dead code.

Obviously, we can battle these problems with testing, but what exactly is testing? The problem is that some people think of the act of writing tests, while others think of running them, and in the latter case there is a further difference between running automated tests and having a (human) tester test things.

As Michael Feathers put it, testing can have a much broader scope than just finding bugs when a test fails. It gets you into the mindset of thinking about the code from an outside perspective.

[Image: 2015-04-23 11.35.55]

You are now thinking about your code as a user, which helps guide API design and higher-level structure. Unfortunately, this effect is hard to quantify, whereas finding bugs with tests helps you make the case for testing within your company: look, this test has found a bug. That is a sort of proof that testing works, and it could help you get more time for it. The same effect happens on a personal level, where your motivation to write tests will drop if all of them fail, and you might not even notice this.

It takes GUTs

Many people say they use TDD, but what they very often mean is just having good unit tests (GUTs). What contributes to good tests? One factor is good organization and good names for tests. There will often be many test cases for one function. Names like Year_divisible_by_4_should_be_leap_year are fine if they get you into the right mindset of stepping away from a one-to-one relationship between function and test, but Kevlin suggests an alternative:

[Image: 2015-04-23 11.59.31]

If your test framework supports anything like nesting, use it! Nest your test cases and give the higher-level concepts good names. In a sense, you should factor out commonalities in tests, much like you do with regular code.
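
To make the nesting idea concrete, here is a minimal sketch in Python. It is not taken from Kevlin's slides: the is_leap_year function and all class and test names are made up for illustration, and it assumes a runner that collects nested test classes (as pytest does). The shared phrase ("a year is a leap year", "a year is not a leap year") moves up into the enclosing class, so each test name only states what is specific to it.

```python
def is_leap_year(year: int) -> bool:
    """Hypothetical function under test (Gregorian leap year rule)."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class TestLeapYear:
    """Outer group: everything about the leap year rule."""

    class TestAYearIsALeapYear:
        # The common part of the name lives in the class,
        # the specific condition lives in the method.
        def test_if_divisible_by_4_but_not_by_100(self):
            assert is_leap_year(2016)

        def test_if_divisible_by_400(self):
            assert is_leap_year(2000)

    class TestAYearIsNotALeapYear:
        def test_if_not_divisible_by_4(self):
            assert not is_leap_year(2015)

        def test_if_divisible_by_100_but_not_by_400(self):
            assert not is_leap_year(1900)
```

Read from the outside in, the names form sentences such as "a year is not a leap year if divisible by 100 but not by 400", which is close to how a spec would put it.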

[Image: 2015-04-23 12.03.53]

See how this better structure also shortens the names of tests, and makes the tests read almost like the spec. Kevlin showed a few other great examples of how to name and organize tests. More can be found in his book 97 Things Every Programmer Should Know, where Gerard Meszaros writes that tests should act as documentation for the code they are testing.

Some people strongly oppose the idea that a test is allowed to have multiple assertions, but Kevlin does not agree. What is important here is that you test one logical concept per test. For example, you put one item into an empty list. You want to test that the list has only that thing in it. This could be expressed in several assertions: the list has one item, the list contains exactly what we put in, and no exception is thrown in the process. Think of it like the single responsibility principle. Like other pieces of code, a test should not change for more than one reason.
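
The list example could look roughly like this in Python. This is only a sketch: the test name and the "apple" value are made up for illustration, and a plain built-in list stands in for whatever collection is under test.

```python
def test_adding_an_item_to_an_empty_list_gives_a_list_with_only_that_item():
    items = []

    # If append raised an exception, the test would fail right here,
    # so "no exception is thrown" needs no assertion of its own.
    items.append("apple")

    # Two assertions, one logical concept: the list holds exactly
    # the item we put in, and nothing else.
    assert len(items) == 1
    assert items[0] == "apple"
```

Both assertions would change for the same reason (a change in what adding to an empty list means), which is why they can live in one test.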

So what still counts as a ‘unit test’? There are multiple definitions, but Kevlin’s is that a unit test’s passing or failing should depend entirely on the test and the code under test. So accessing the file system or the network is not allowed in unit tests, because a failure can then be due to the outside world. There is an interesting category of code that is unit testable in theory, but not in practice. If this is the case (say you want to test the addition of two numbers, but you can’t because your code needs to access the database to do so), that is probably a smell, a sign that some things are too tightly coupled.
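
One possible way out of that coupling, sketched in Python under made-up names (total, customer_total and load_amounts are hypothetical, not from the talk): keep the arithmetic in a pure function and pass the data source in from outside, so the tests can run without a database.

```python
from typing import Callable, Sequence


def total(amounts: Sequence[int]) -> int:
    """Pure logic: no database, file system, or network involved."""
    return sum(amounts)


def customer_total(
    customer_id: int,
    load_amounts: Callable[[int], Sequence[int]],
) -> int:
    """Thin wrapper: the data source is supplied by the caller
    instead of being hard-wired to a database."""
    return total(load_amounts(customer_id))


def test_total_adds_two_numbers():
    # Depends only on the test and the code under test.
    assert total([2, 3]) == 5


def test_customer_total_sums_whatever_the_loader_returns():
    # An in-memory stand-in replaces the real database lookup.
    def fake_loader(customer_id: int) -> Sequence[int]:
        return [10, 20]

    assert customer_total(42, fake_loader) == 30
```

In production code the caller would pass a function that really queries the database; a test exercising that function touches the outside world and would no longer be a unit test by the definition above.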