Stop worrying and write tests

You're probably familiar with the idea of the test pyramid, which tells you the ratio between different types of tests in your codebase. At the foundation of the pyramid are unit tests: small, focused, fast, and checking all branching and special cases. One floor up you get integration tests, which check how the individual units interoperate, making sure your system is in fact a system rather than a box of disjointed parts. And finally, at the very top, we have acceptance and end-to-end tests, the purpose of which is to find out if your system does what it's expected to do. There's stuff you can add there, but that's the overall idea.

And this picture makes sense. It can be sanity checked by comparing it to the physical world, like that of car design and manufacturing. You test the materials and individual parts themselves, then you test your engine and suspension in isolation (on a dyno, using mules, and so on), until you finally drive the car on a test track to see if it works as a car.

But there's also a problem with this picture, which I found to be a major hindrance to the adoption of TDD, as well as to the ease of writing tests in general. Before we get there, let's first talk about another famous pyramid, Maslow's hierarchy of needs, because it has a similar problem.

Pyramid of needs

Everyone's heard of Maslow's pyramid and seen images of it, so I'm not going to waste your time introducing either him or it. The important thing here is that… it's kind of bullshit. Or at the very least, a gross oversimplification.

And don't take my word for it. You can listen to Scott Barry Kaufman, who's an expert on both the published and unpublished works of Maslow, pointing out that:

It turns out that Maslow never drew a pyramid. (…) It’s incorrect how it’s been taught the past 60 years.

It so happens that the truth is, as you may have guessed, a lot more complicated and doesn't fit nicely into an infographic. An important lesson for sharing memes on the Internet (or on buses... #Brexit), I suppose...

While the idea that water, food, and shelter take precedence over money and self-actualisation sounds reasonable, the actual structure of the hierarchy of needs is dynamic in all kinds of ways.

It changes from society to society, from person to person, but also from time to time. It's different in wartime than in peacetime. It's different for a European and a Middle Easterner. It may be different for a 90-year-old than it was when they were 9, 19, or 49.

And it's not even that difficult to demonstrate. Take Greta Thunberg vs your average CEO of a fossil fuel company. It may be a stretch, but I'd argue their hierarchies would look strikingly different when mapped. At least if they were actually honest... Greta's pyramid would likely look more like the standard picture, with "having a habitable planet" at the bottom, accounting for most of the surface area. Our oil company CEO's pyramid would probably have that at the top, meaning "I might eventually get to that, once I make enough money or it becomes a nuisance".

Now you might argue (rightfully) that this betrays my political biases, but even if you're more sympathetic to the CEO than to Greta, you would likely come to the same general conclusion regarding the difference itself. It would just differ differently.

Similarly, I have a colleague whose life situation is not that different from mine, objectively speaking. Similar age, similar income, and more similarities left to list. And yet, while they've recently decided to move into the mountains and limit their cost of living to an absolute minimum, I have recently visited a Lexus dealership to ponder the idea of getting a cushion on wheels, which I definitely don't need and which makes far less sense than the hatchback I currently drive. I would guess that if we plotted our hierarchies of needs, they would come out very different. While I value peace and security a lot in my mind, it's doubtful I value them as much as they do.

Back to the testing pyramid

It's a similar story with the testing pyramid. I started thinking about this after reading this quote from Kevlin Henney:

one programmer's unit test is another's integration test

However, that sounds a bit too personal, so I'll rephrase it for the sake of argument: "One system's unit test is another's integration test". Both of these statements are important, though, particularly since we don't even have a good definition of the different kinds of tests, especially the unit ones.

We're taught that unit tests are supposed to be testing in "isolation from the outside world", where the outside world is typically understood as everything from other functions and classes, through the database, all the way to stuff like PayPal.

This typically leads to conflating the idea of "units" with "classes" and "functions". And that is where problems start...

Simon Brown often talks about people cargo culting clean architecture, ports and adapters, and other incarnations of this architectural approach, leading to inflating systems' complexity, and I have definitely seen it happen. I think this is related to how we understand units and, by extension, isolation.

Let's say you have a system in Django or Rails. It's possible to add an additional layer into that system, usually referred to as a Service Layer. Its purpose is to isolate your business logic from the framework, and transitively from the database and the web. And that's all well and good, unless your entire system is basically a tiny API for the database and the actual business logic more or less lives inside the database.

A service layer in a system like this would basically be encapsulating vacuum. If that's the case, what good does it do to test anything without the database? You may, obviously, want to test things like validation without it, for performance reasons, but the framework will allow you to do that without any extra fuss. There's no need to build an additional layer into the system for that reason alone.

Disclaimer: it depends

Now... just to put a disclaimer here, because I don't want anyone to get the wrong impression.

I'm not saying that more complex approaches to architecture and design are pointless. What I'm saying is that while they help reduce the complexity of inherently complex systems, they increase the complexity of simple systems. I like to compare that to using an SR-71 to get your groceries. It’s an amazing machine, but… no…

There are a couple of prerequisites to be met before you get into anything more complex than MVC, like the system's life expectancy, the complexity of the domain and its importance for the business. Greg Young talks a lot about finding the right spot to do things like DDD and event sourcing, amongst other things. That spot is likely not the CMS.

Find your own units

But then how do you write unit tests? If I touch the database, there's a whole lot of stuff I depend on and it's certainly not deterministic, so I'm not testing units... right? Well, that's exactly the point: it really depends on your definition of a unit.

What I found very useful when thinking about this is the idea of user stories and Allen Holub's approach to them. I've recently come across this article from an IT team at adidas which has a great quote summing up my thoughts better than I could:

Our goal with tasks is not to slice our stories into same-sized units like “takes 1 hour of work”. Instead, we use our tasks to break stories down into their natural, smallest units. (...) A task is a logical unit of the story, can be worked on independently and is actionable

The above is a marvellous recipe for slicing stories and making estimations easier (or eliminating them, for that matter), but similar thinking should be used when trying to find units in your system. Don't think of them as building blocks of code, but rather building blocks of business logic.

Another way of putting it: think of a flow through the system which has a well-defined input and output, and which would make sense to put into a single function, except that you're not doing it for reasons of convenience, readability, and maintainability. That's what I found to be the first step to decoupling the idea of units from the idea of functions and classes, without necessarily completely losing that connection to code structure.

Another useful step is to learn Behaviour Driven Development, which again shifts the focus from "units of code" to "units of business logic". Note that while we don't always think of stuff like error handling as something entangled with the business logic, it actually is, so testing for exceptions is still something that falls very well into this approach.
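To make that a bit more concrete, here's a minimal sketch of what treating a flow as the unit could look like. Everything in it is made up for illustration: the withdraw function, its two private helpers, the InsufficientFunds exception, and the idea of amounts being whole numbers. The point is only that the tests talk to the flow through its single entry point, describe behaviour (including the error cases), and never touch the helpers directly.

```python
import unittest


class InsufficientFunds(Exception):
    """Hypothetical business error: the withdrawal would overdraw the account."""


def _validate_amount(amount):
    # Implementation detail of the flow; deliberately not tested on its own.
    if amount <= 0:
        raise ValueError("amount must be positive")


def _total_cost(amount, fee):
    # Implementation detail of the flow; deliberately not tested on its own.
    return amount + fee


def withdraw(balance, amount, fee=0):
    """The unit of business logic: one flow with a clear input and output."""
    _validate_amount(amount)
    total = _total_cost(amount, fee)
    if total > balance:
        raise InsufficientFunds(f"need {total}, have {balance}")
    return balance - total


class WithdrawalBehaviour(unittest.TestCase):
    def test_given_enough_funds_then_balance_drops_by_amount_plus_fee(self):
        self.assertEqual(withdraw(balance=100, amount=30, fee=2), 68)

    def test_given_too_little_funds_then_withdrawal_is_refused(self):
        with self.assertRaises(InsufficientFunds):
            withdraw(balance=10, amount=30, fee=2)

    def test_given_non_positive_amount_then_request_is_rejected(self):
        with self.assertRaises(ValueError):
            withdraw(balance=100, amount=0)
```

Saved as a module, this can be run with python -m unittest. If the helpers get merged, renamed, or split later on, none of the tests should need to change, which is roughly the property I'm after.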

The point here is that your units should be indivisible from the logical perspective, but don't necessarily have to be from the code perspective, or even from the container (in the C4 meaning) perspective.

Going back to the original example, if your system is nothing but a thin client over a database and does nothing without it, any unit you can think of, which will actually be covering any kind of logic and hence be worth testing, will involve the database. It's inescapable.
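As a rough sketch of what such a unit might look like, assume a hypothetical products table where the rule "stock can never go negative" lives in the schema as a CHECK constraint, and a hypothetical reserve_stock function that is little more than a wrapper around an UPDATE. I'm using an in-memory SQLite database from Python's standard library here purely to keep the example self-contained; in a real project you'd run this against whatever database actually holds the logic.

```python
import sqlite3
import unittest

# Hypothetical schema: the business rule is enforced by the database itself.
SCHEMA = """
CREATE TABLE products (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL UNIQUE,
    stock INTEGER NOT NULL CHECK (stock >= 0)
);
"""


def reserve_stock(conn, product_id, quantity):
    """Try to reserve stock; return True on success, False if refused."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "UPDATE products SET stock = stock - ? WHERE id = ?",
                (quantity, product_id),
            )
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the update.
        return False
    return cur.rowcount == 1  # False when the product does not exist


class ReserveStockTest(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database per test keeps the tests independent.
        self.conn = sqlite3.connect(":memory:")
        self.conn.executescript(SCHEMA)
        self.conn.execute(
            "INSERT INTO products (id, name, stock) VALUES (1, 'widget', 5)"
        )
        self.conn.commit()

    def tearDown(self):
        self.conn.close()

    def stock(self):
        row = self.conn.execute("SELECT stock FROM products WHERE id = 1")
        return row.fetchone()[0]

    def test_reserving_available_stock_reduces_it(self):
        self.assertTrue(reserve_stock(self.conn, 1, 3))
        self.assertEqual(self.stock(), 2)

    def test_reserving_more_than_available_is_refused(self):
        self.assertFalse(reserve_stock(self.conn, 1, 6))
        self.assertEqual(self.stock(), 5)

    def test_reserving_for_unknown_product_is_refused(self):
        self.assertFalse(reserve_stock(self.conn, 99, 1))
```

Mocking the connection here would leave nothing to test except the mock, because the only interesting behaviour is the constraint firing inside the database.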

Which is why I tend to think of tests in terms of their usefulness, rather than in terms of the testing pyramid. I keep the latter in mind as a sanity check, along with all the definitions of units I've heard over the years, but just that. In the past I found it paralysing to worry about whether the test I'm about to write conforms to this or that definition and where it lands on the pyramid. It's limiting and it's a waste of both time and brain power.

But isn't that slow?

Yes. But it really depends on what you mean by slow. There was a thread on Reddit which talked about how it takes tens of hours for Oracle DB tests to run, and you have to run all of them after each minuscule change, because the code is so intertwined that you never know what's gonna break and where.

There's no way you're doing TDD in these circumstances, but it raises a question... is the problem that the whole test suite takes days, or is it that you have to run the whole thing regardless of what change you're making?

Greg Young has this great quote which also rings "UNIX philosophy" to me:

I don't believe in big systems. I believe in lots of small systems.

Think of all the apps you have on your computer right now. From the tiny little commands like "ls" and "top" all the way to the IDE and the web browser. How long would it take to run tests for all of them and how would that test suite grow if all of these applications were actually interdependent? At best, it would be long hours, if not days or weeks.

See, there's more to separation of concerns than keeping your units from touching the database at the cost of testing nothing but the quality of your mocking. The more important bit is to make sure that you can confidently make changes without the fear of the Butterfly Effect.

Sure, it's great if you can run the whole test suite in a matter of milliseconds, and it's worth pursuing, but that may or may not be feasible or sensible. What's always possible is to make sure that the impact of your changes is isolated, so you don't have to do that. And it may or may not have anything to do with decoupling from the DB or from the framework.
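One way to picture "isolated impact" is a sketch like the one below. It assumes a made-up layout where every top-level module (billing, catalog, reports) owns its tests and where you've written down which modules depend on which. None of these names come from a real tool; it's only meant to show that once the boundaries are explicit, working out which tests a change can possibly affect becomes a small graph walk rather than "run everything".

```python
from pathlib import PurePosixPath

# Hypothetical map: module -> modules that depend on it directly.
DEPENDANTS = {
    "billing": {"reports"},
    "catalog": {"billing", "reports"},
    "reports": set(),
}


def affected_modules(changed_files):
    """Return the modules whose tests should run for the given changed files."""
    pending = [PurePosixPath(path).parts[0] for path in changed_files]
    affected = set()
    while pending:
        module = pending.pop()
        if module not in DEPENDANTS:
            # Unknown territory (shared config, docs, ...): play it safe
            # and run the tests of every module.
            return set(DEPENDANTS)
        if module in affected:
            continue
        affected.add(module)
        # Anything depending on a changed module may be affected too.
        pending.extend(DEPENDANTS[module])
    return affected
```

With this map, a change to reports/views.py only requires the reports tests, while a change to catalog/models.py pulls in all three modules. The fewer arrows there are in that map, the smaller the part of the suite any single change forces you to run.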

And it has that nice side effect of limiting the size of the (sub)system you have to fit into your head when working on a piece of code. Which is another way of saying that it limits the chance of making a mistake because you missed a critical bit of information. Just for the record, I'm not talking about microservices here... More about the "modular monolith", aka "monolith done right".

Now, you may be thinking that introducing a service layer and repositories and so on is a way to achieve that separation of concerns, but it’s really not. It may be a part of it, but I’ve seen it make things worse when misapplied often enough to know that it actually takes more consideration…

I know I’m being vague here, but I’ll be sure to revisit this in the future. In the meantime, you can watch Simon Brown, Axel Fontaine, and (especially) Greg Young to have something to think about in this regard.

Where does that lead us?

Writing tests can be hard. Learning to do it is harder. TDD makes both easier, but if you get trapped into thinking in terms of tests following your functions, methods, and classes, it becomes all too easy to fall into what David Heinemeier Hansson calls "test-induced design damage".

You end up locking the implementation details down by doing silly things like writing tests for classes or functions you want to extract, just because you want to "test first". I say it's silly because I've done it and it very much was.

The conclusion here is simply that just as Maslow's pyramid is a myth and a simplification of a very complicated subject, so is the testing pyramid and so is the common understanding of unit testing. They're the Newtonian mechanics (if not the spherical cow...) of their respective fields. They represent a form of common sense but, as Sean Carroll likes to say, "the real world is messy" and it's good to keep that in mind when trying to apply them to the real-world problems we're solving.