Words, meanings, and incentives

Words. They're kind of important. We use them to convey meaning, to communicate with each other. One of the most important steps in building a computer system with a remote chance of standing the test of (change over) time is establishing a ubiquitous language: one that everyone working on the product agrees on. Language and words should matter to programmers most of all, because our trade is precisely building stuff out of words. Which is why it's so baffling how bad with language we typically are. And I'm not even talking about sloppy, lazy naming like data or utils. I'm talking about much more high-level stuff... And there's no better illustration of it than...

CI/CD

Now... I know that language evolves, I get that. Words lose and gain meanings all the time. Incorrect usages of words become accepted over time. And while it literally pains me that "literally" can now be legitimately used figuratively, I try to accept that. But again... as people who build stuff out of words, we should know better. And more importantly, we should know that some of those misnomers and misconceptions are actively harmful. CI/CD is one of those.

There's no such thing as "CI/CD". What people almost universally mean when they say "we have CI/CD" is a pipeline. And having a pipeline in place does not imply either continuous integration or continuous delivery, so neither should the term we use to name it. See, I wouldn't have a problem with it if they said "we have Steve", because while it also makes no sense, Steve doesn't imply CI - CI/CD does.

Continuous Integration, or CI, is the act of keeping your code integrated continuously. Which means there are no divergent changes sitting in the repo and evolving separately for prolonged periods of time.

People often quote Kent Beck and Dave Farley about how CI is "merging once a day", but that's a misunderstanding. CI is merging at the very least once a day. It's the bare minimum to even qualify, not the universal speed limit. It's as if you were a race driver, qualified with the slowest time and then just called it a day, because you've qualified - no!
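To make the "bare minimum" concrete, here's a minimal sketch of a check that flags anything that has gone more than a day without being integrated. The function name, the one-day threshold constant and the sample branch names are all made up for illustration; how you'd actually collect the timestamps is left out.

```python
from datetime import datetime, timedelta

# Assumption: the "at the very least once a day" bar, expressed as a threshold.
MAX_DIVERGENCE = timedelta(days=1)


def stale_branches(last_integrated, now, max_age=MAX_DIVERGENCE):
    """Return the sorted names of branches not integrated within max_age.

    last_integrated maps a branch name to the datetime of its last merge
    into trunk. A branch exactly max_age old still counts as fine.
    """
    return sorted(
        name
        for name, merged_at in last_integrated.items()
        if now - merged_at > max_age
    )


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0)
    branches = {
        "fix-typo": datetime(2024, 1, 10, 9, 0),      # integrated 3 hours ago
        "new-checkout": datetime(2024, 1, 3, 12, 0),  # diverging for a week
    }
    print(stale_branches(branches, now))  # prints ['new-checkout']
```

An empty result only means you've qualified. It says nothing about how far below a day you could go.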

What is the end goal? When can you truly say that you've committed to, and are actually practicing, continuous integration in a way that creates the right incentives? Well...

Continuous is more often than you think — Mike Roberts

At this stage and in the age of Git, the end goal is not branching. Literally, in the non-figurative sense.

You have a single branch. Call it trunk, main, whatever — the point is there's one. And everyone on the team commits and pushes to that branch, without going through feature branches. That's the closest you can get to "continuous" with Git without mob programming (1 commit per team, instead of 1 commit per person). I'm looking very closely at Sturdy in this space, as it could make a huge difference, and I suggest you do that as well.

So CI is about continuous integration... as the name implies. What about CD?

Continuous Delivery, or CD, is the practice of keeping your code operational at all times. Not broken. Without regression. It may have unknown defects (although, contrary to popular belief, not all code has bugs), but it can't have known ones — these are to be eliminated ASAP, and they always take priority over everything else. If a test fails, you fix the problem indicated by that failure before moving on.

Because CD is built on top of CI, the codebase will contain incomplete features. However, the whole codebase, including these half-baked bits, is always in a deployable state — it's always delivered. The ultimate goal here is to be able to hit "deploy" at an arbitrary moment and then close your laptop and go into the woods for a month without a worry in the world. That's what you're striving for with CD. And when I say "arbitrary" I mean it — it's not continuous delivery if you can't deploy right now.
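How do half-baked bits sit in a deployable codebase? One common answer, which I haven't named above, is a feature flag: the unfinished code is integrated and deployed, but switched off. A minimal sketch, where the flag store, the flag name and the checkout functions are all hypothetical:

```python
# Assumption: a plain dict stands in for whatever config or flag service
# a real system would use.
FLAGS = {"new_checkout": False}


def is_enabled(flag, flags=FLAGS):
    """Unknown flags count as off, so unfinished code stays dark by default."""
    return flags.get(flag, False)


def legacy_checkout(cart):
    """The released behaviour that users see today."""
    return {"total": sum(cart), "flow": "legacy"}


def new_checkout(cart):
    """Half-baked: integrated into trunk and deployed, but not released."""
    return {"total": sum(cart), "flow": "new"}


def checkout(cart, flags=FLAGS):
    """Route to the new flow only when its flag has been switched on."""
    if is_enabled("new_checkout", flags):
        return new_checkout(cart)
    return legacy_checkout(cart)


if __name__ == "__main__":
    print(checkout([5, 7]))                          # legacy flow, flag is off
    print(checkout([5, 7], {"new_checkout": True}))  # new flow, flag is on
```

Deploying this changes nothing for users. Releasing the feature is a separate decision: flipping the flag.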

You may have noticed something... I never said anything about pipelines, let alone specific tools, like Travis or Jenkins. Why? Because CI and CD are practices not tools. They're something you do, not something you use. They're mindsets of building quality into every stage of the software development process.

Pipelines, on the other hand, are tools to achieve that. However, just like the Andon Cord in GM factories, they make no difference without the underlying philosophy they're supposed to support.

CI and CD are well known to be good practices in the industry, which is why almost all teams around the world claim to adhere to them. But then you look at how they work and you see a very different picture:

  1. There are feature branches

  2. These feature branches live for days and weeks and months

  3. They're huge, these feature branches

  4. Merging is a Sisyphean task

  5. There's a pipeline, but there are barely any tests

  6. The few tests that there are are red most of the time, and nobody cares

  7. Going through the pipeline takes way upwards of 5 minutes

  8. Deployments are coupled with releases

  9. Deployments are thus eventful, instead of mundane

  10. The act of deploying is ceremonious and everyone's afraid of its repercussions. They can't deploy right now.

These teams call their pipeline "CI/CD", not realising that all they have is, in fact, a pipeline... without CI or CD. And without the benefits of either.

Trunk-based development

This one's actually funny, because it's one of those terms that pop into existence to fill the void left by another term after it has been deprived of meaning, and end up not only doing a worse job conveying the point but also watering down the original concept.

TBD came into existence, I believe, because CI has lost its original meaning and started to mean "pipeline". Now... TBD is not a bad name per se, but Continuous Integration is better precisely because it contains the word "continuous". I just wish people paid more attention to that.

Worse, however, TBD allows for the idea of "short-lived feature branches", which is the watering down I mentioned before. Feature branches are wrong. It doesn't matter if they're long- or short-lived, they introduce the wrong incentives!

The incentives created by a process are often overlooked, yet they greatly impact the results you're getting from that process.

Feature branches are, in most teams, tightly coupled with the idea of gatekeeping code review. Meaning, before a change is integrated into main it has to be examined by another human being or two. That human leaves a couple of useless comments about code formatting or initiates pointless discussions around the code-to-comment ratio, plus perhaps leaves a single useful comment about naming, and then we pat each other on the back and consider quality assurance done. But it isn't... It's theatre.

Code review itself is a great practice: periodic visual inspection is a great idea! And decoupling that from making changes is good, as long as you do it with a clear head and dedicated focus time. Gatekeeping code review, however, is... dangerous. It's dangerous because it fools us into thinking we're doing a good job and thus de-incentivises building quality into writing code with these dramatic new ideas from the late 90s:

  • Ensemble (pair, group) programming

  • Continuous Integration

  • Continuous Delivery

  • Test Driven Development

  • Acceptance Test Driven Development

  • or, you know, training people

As well as removing the opportunity for pointless flamewars by using automated code formatters, linters, etc.

The whole point of Continuous Integration is to build an environment in which people can commit directly into trunk or main and be certain (to the level allowed by the current state of art and tools) that if they break something, the checks in place will let them know. As well as limiting the chance that they'll break something in the first place.
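What might "the checks in place" look like at their very simplest? Here's a rough sketch of a script that runs a list of checks and refuses to go further on the first failure. The commands in CHECKS are placeholders I made up so the sketch runs anywhere; you'd swap in your project's real formatter, linter and test commands.

```python
import subprocess
import sys

# Assumption: placeholder commands. Replace each argv list with a real check.
CHECKS = [
    [sys.executable, "-c", "print('format check ok')"],
    [sys.executable, "-c", "print('tests ok')"],
]


def run_checks(checks):
    """Run each command in order; return the first one that fails, or None."""
    for command in checks:
        result = subprocess.run(command)
        if result.returncode != 0:
            return command
    return None


if __name__ == "__main__":
    failed = run_checks(CHECKS)
    if failed is not None:
        print("check failed, not pushing:", " ".join(failed))
        sys.exit(1)
    print("all checks passed")
```

A script like this could be wired in as a Git pre-push hook, where a non-zero exit status aborts the push. That's only the tool, though. It earns its keep when the checks are fast, trusted, and fixed the moment they go red.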

We're deliberately setting out to do something that, intuitively, feels risky because such an approach creates an incentive to reduce those risks. And the reason we're doing it is that we realise those risks are there and we recognise that creating an illusion of having dealt with them increases the danger.

Which brings us to the point about trust, and whether all team members are equally trustworthy, which is often brought up when I suggest continuous integration.

I've been the senior developer gatekeeping junior devs' code from entering main until it's magically passed the bar of adequacy. And I say magically because all they had to work on were my comments, which (over time...) typically started containing more and more actual code... Because, you know, it's quicker that way.

I eventually learned that, in cases like these, pair programming is simply more efficient, both at getting shit done and at teaching. Feature branches, even the short-lived ones, augmented with pull requests de-incentivise pair programming in favour of what's basically asynchronous "email exchange" within GitHub.

So there you go... Martin Fowler famously said "if it hurts, do it more often". I'd like to add: if it feels risky, confront it and reduce the risks with training and automation, instead of sweeping them under a rug. And there's no worse form of sweeping under a rug than taking terms that once meant a deliberate effort, using them to mean mindlessly plugging in a tool, and then patting yourself on the back for doing a good job.