False positives vs. flaky tests · Vladimir Khorikov

Here is another great comment about false positives, on a point I wish I had clarified when I was writing the Unit Testing book. I even remember thinking at the time that a clarification was probably needed here, but it slipped through the cracks.

The comment is about this phrase I used when describing end-to-end tests and their relation to false positives (as you might remember, a false positive is a false alarm, aka false failure):

End-to-end tests are immune to false positives.

Andrew, the commenter, writes:

I feel the opposite is true. End-to-end tests have unreliable external dependencies, involve test code with higher cognitive complexity, and are verified more unreliably.

What Andrew describes is usually referred to as flaky tests (or test flakiness) — when a test fails due to unstable out-of-process dependencies. This could be considered a false positive too — after all, these are also false failures.

In the book, though, I separate false positives by their source.

If a test fails for external reasons (such as a lost Internet connection, the DB residing in an invalid state, etc.), the issue falls under the 4th component of a good test, maintainability: such failures increase maintenance costs.

If a test fails after a refactoring, even though the application's observable behavior hasn't changed, that means the test has poor resistance to refactoring (the 2nd component).

Here are all 4 components:

1. Protection against regressions: lack of false negatives, aka lack of false passes, aka lack of bugs
2. Resistance to refactoring: lack of false positives, aka lack of false failures
3. Fast feedback
4. Maintainability

The notion of false positives is directly related to the 2nd attribute of a good test:

Resistance to refactoring is the degree to which a test can sustain a refactoring of the underlying application code without turning red (failing).

On the other hand, flaky tests are about maintainability, which is a combination of the following 2 sub-components:

How hard it is to understand the test — The fewer lines of code in the test, the more readable the test is.
How hard it is to run the test — If the test works with out-of-process dependencies, you have to spend time keeping those dependencies operational: reboot the database server, resolve network connectivity issues, and so on.
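
The second sub-component can be sketched the same way. The example below is an illustration under stated assumptions, not code from the book: `FakeDatabase`, its `available` flag, and `greeting` are hypothetical stand-ins for a real out-of-process dependency and the code that uses it. The application code never changes between the two runs; only the state of the dependency does.

```python
class FakeDatabase:
    """Hypothetical stand-in for an out-of-process database."""

    def __init__(self, available=True):
        # Simulates whether the database server can be reached right now.
        self.available = available
        self._rows = {1: "Alice"}

    def fetch_name(self, user_id):
        if not self.available:
            raise ConnectionError("database is unreachable")
        return self._rows[user_id]


def greeting(db, user_id):
    """Application code under test; it is correct and stays unchanged."""
    return f"Hello, {db.fetch_name(user_id)}!"


def end_to_end_test(db):
    """Report 'green' or 'red' for a test that goes through the database."""
    try:
        return "green" if greeting(db, 1) == "Hello, Alice!" else "red"
    except ConnectionError:
        # The failure comes from the environment, not from the code under test.
        return "red"


print(end_to_end_test(FakeDatabase(available=True)))
print(end_to_end_test(FakeDatabase(available=False)))
```

The second run is red purely because of the environment. In the book's terms this is a maintainability cost (someone has to get the dependency working again), not a lack of resistance to refactoring.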

And so when the book mentions a false positive, it specifically means a test failure that is caused by a prior refactoring.

Hence the phrasing "End-to-end tests are also immune to false positives". What it means is that end-to-end tests are immune to false positives caused by a refactoring.

I didn’t talk much about flaky tests in the book. I think the sources of those are understood quite well at this point and don’t require deep discussion.

In any case, this is a great comment and something I should have clarified in that chapter.

--Vlad

https://enterprisecraftsmanship.com/
