The Cycle of Continuous Improvement

Feedback

noun

1. information about reactions to a product, a person's performance of a task, etc. which is used as a basis for improvement.

2. the modification or control of a process or system by its results or effects, for example in a biochemical pathway or behavioral response.

— Oxford Languages

Everyone is intuitively familiar with the idea of feedback, and yet you will find me beating that particular drum repeatedly. Why? Because while we have a vague understanding of the idea, we're usually terrible at putting it into practice. Individuals are terrible at it; collectives (teams, organizations, corporations) are even worse.

Probably the most important thing to remember about feedback is that size matters and small is beautiful — the shorter your feedback loop becomes, the quicker you'll learn. The idea is to always do enough stuff (whatever that may be) to learn, but not more. And yes, striking that balance is hard, but we have a natural tendency to go for more, which means you'll want to err on the side of less. And no, that doesn't make you lazy, just reasonable.

The other thing, which conveniently forms the reason for the first one, is that you will be wrong. Almost universally so. And I'm not at all unique in saying that. At this point I could quote a couple of books with "Lean" in their names or make a joke about Jeremy Clarkson saying "how hard can it be", but instead let's refer to something with last-century authority and a serious name to match — the 1968 NATO Software Engineering Conference:

The design process is an iterative one. I will tell you one thing which can go wrong with it if you are not in the laboratory. In my terms design consists of:

1. Flowchart until you think you understand the problem.

2. Write code until you realize that you don’t.

3. Go back and re-do the flowchart.

4. Write some more code and iterate to what you feel is the correct solution.

— Hollis A. Kinslow

The quote above is what sparked my interest in the history of our industry, which in turn led to the realization that we’re horrible at remembering things and learning from the past, but absolutely excel at reinventing.

It describes iterative design. You must've heard of it before, but if you're anything like me, you assumed it's a relatively new invention, dating somewhere around 2001 and the Agile Manifesto. Turns out some people understood and phrased that particular notion very well at least a year before Apollo 11. Fifty years later, many companies (as well as individuals) still struggle to comprehend and embrace it, even though it's been repeatedly reinvented and shown to work.

The reason is purely psychological. While people tend to nod vigorously when presented with the concept, it is counterintuitive and simply feels risky. If you really listen to Kinslow, he seems to suggest that you'll almost always be wrong. Moreover, you're not only supposed to embrace this reality, but to actively try to prove yourself wrong.

People don't like thinking that way about their ideas — others are obviously wrong all the time, because others are idiots, but not me. So I better get it right the first time, or else I'm forced to think of myself as an idiot. And that's how paralysis by analysis happens — you never actually do stuff, because you're too busy predicting what could go wrong and preparing for it. You'll either build an over-engineered blob or absolutely nothing. I'm not sure which is worse.

But changing that mindset is one thing. It has to be followed by something else — creating a safe space for the inevitable mistakes. And I'm not just talking about people not getting fired. That goes without saying, but limiting the chance of blowing up production with your little experiments is equally important. The goal is to get to a point where you perform tens, hundreds, or thousands (depending on your scale) of those experiments a day without ever blowing anything up. Spoiler alert: you're not going to get there with a dedicated QA team doing ever more manual testing on "dev complete" code. The only way to do it is by building quality into every single step of the process. Which is our end goal here.

The Process

All the above is just half the story. The other half is the tools — processes to guide our actions. Luckily, those have been around since at least the 1950s. They're called kaizen and PDSA.

Kaizen (改善 or カイゼン) literally means "change [is] good", which is definitely not something most organizations seem to be living by. The common philosophy is more like "pour more concrete on it", where by concrete I mean bureaucracy. The kaizen idea is very simple, and it works well with all the stuff we've talked about so far. You look at your current situation, you decide on the smallest change you could make to improve it, and you make that change. No grand plans.

Let's say you want to start exercising. The Grand Plan approach would be to buy a gym membership for the whole year, use it once, skip the next visit because reasons, feel weak for skipping, and admit failure. The kaizen approach would be to do a pushup today. Like literally one. And another one tomorrow. Over time, move to two.

There are a couple of psychological hurdles you need to overcome to start living by this. Not the least of which is admitting that your willpower is limited and that, unless you're insanely privileged, you spend most of it just getting by — going to work, making money, raising kids, not giving in to road rage, etc. The other is that we can clearly imagine the outcomes of going to the gym for a whole year, but it's much harder to see what good a single pushup would do.

The kaizen philosophy is supported by a more formal process, derived from the scientific method and called PDSA ("Plan, Do, Study, Act"). It's not entirely clear (to me, at least) whether kaizen and PDSA directly influenced the development of feedback-driven methodologies in software prior to Mary Poppendieck's Lean Programming, but wrapping my head around PDSA helped me understand iterative development, agility, lean, TDD, and so many other things. It was transformative enough that I believe it will help you too.

Before we jump into the details, let's get the high-level view by running grocery shopping through the PDSA framework. This is a very simplified example, but it captures the essence of the cycle:

  • Plan: Going grocery shopping on foot with a bag will be quick.

  • Do: Execute the plan.

  • Study: Legs hurt, hands hurt, found it hard to fit all groceries in the bag.

  • Act: For medium shopping, buy a trolley; for large shopping, take the car.

  • Plan: Going by car will make large shopping less painful.

  • Do: Execute the plan.

  • Study: Found it hard to find a parking spot. Noticed the store is open until very late.

  • Act: Adjust timing.

  • Plan: Going by car at 8:30 pm will make it easier to find a parking spot.

As you can see, it forms a feedback loop in which we can identify four distinct steps. As you'll see once we get to separation of concerns, that's extremely important, and it's why making PDSA second nature is a good idea. You identify the stages in your process and you don't allow them to overlap too much. This gives you clearer focus, and it limits the tendency to fall into multitasking or to second-guess the previous stages before information from the real world is available. At the same time, keeping kaizen and the idea of short feedback loops in mind, you make a single iteration as small as possible.

Now, people sometimes think that using a framework like this, with discrete steps, would slow them down. To which I usually reply that the US Air Force uses a similar cycle: OODA, or Observe, Orient, Decide, Act. This is basically PDSA rotated by 90 degrees, with Decide standing in for Plan, and it was created in the 1970s by USAF colonel John Boyd for jet fighter pilots. It allowed them to outmaneuver and dominate their opponents, and it proved so successful that it directly influenced the design of cockpits.

Let’s examine the PDSA steps in detail now. As I said, all of this may feel painfully obvious, but making it second nature is paramount.

Plan

The first step is Plan, but it's not meant to be a Grand Design taking weeks, let alone months or years. On the contrary, it should be very constrained, preferably to a single hypothesis.

Let's get back to Kinslow's quote: "Flowchart until you think you understand the problem". He's not talking about ironing out all the details, just getting a general idea of how you'll solve the next problem. The design will get progressively more precise and complete with subsequent iterations.

Hitting the right balance is an art, but the Plan step for an entire system or a large feature should ideally be measured in hours; in limited cases, days. Weeks, let alone months, are definitely too much. For a single data flow it's the time required to write a test, which should ideally be seconds (a few minutes tops). Again, find the smallest piece of design you can learn from and focus on that — no distractions. And apply PDSA to the process itself: reflect on how long it took and whether that felt like the right amount of time. With experience, you will find these time boxes getting smaller for the most part.

Before we move to the next step, allow me to smuggle in a bit of language geekery. Personally, I find it helpful to substitute the word "hypothesize" for "plan". It helps me focus on the true nature of what I'm trying to do, which is not to create a final design, but rather to validate a well-defined assumption and learn from it. That means I need to set everything up in such a way that it's blatantly obvious whether my assumption was true or false once the cycle is done. This helps limit the scope of the experiment and makes learning easier.
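To make this concrete, here's what a plan-as-hypothesis can look like in code. The example is entirely mine: a hypothetical slugify helper that turns a title into a URL slug, with the hypothesis written down as a single pytest-style test in Python.

    # Hypothesis: lowercasing a title and joining its words with hyphens
    # is enough to produce a URL slug.
    # slugify() doesn't exist yet, so running this fails. That's fine:
    # the test makes the outcome of the experiment blatantly obvious.
    def test_slugify_joins_lowercased_words_with_hyphens():
        assert slugify("Hello World") == "hello-world"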

Do

In the Do step we execute the plan. This seems self-explanatory, but it's worth reiterating that the intention behind Do is never to produce an end result, but rather to gain knowledge. We're trying to shrink the interval between an idea (or hypothesis) and its validation against the real world.

In anticipation of the next core idea, and since we're naturally drawn to physical analogies for the intangible stuff that is software, this can be thought of as the equivalent of building scale models for wind tunnels, running computer simulations, or building mules — Frankenstein cars with the body of one car and the tech, interior, or engine of another, used to test those bits in real-world conditions. In software terms, the smallest Do step can be writing an implementation for a single test. Or creating a mockup or a prototype for a UI.
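Sticking with the hypothetical slugify sketch from the Plan step, the smallest Do would be the least code that settles the hypothesis:

    # Do: the least code that can validate (or falsify) the hypothesis.
    # Deliberately naive; we're buying knowledge, not shipping a product.
    def slugify(title):
        return "-".join(title.lower().split())

The test goes green, the hypothesis holds, and it cost us a minute or two to learn that.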

You may be thinking: "wait a second, so where does the final product come in?! You keep talking about prototypes, but there must be an end to this!". And the answer, which I will elaborate on at length in "Programming as design", is that it doesn't. Thinking of writing software as "construction" is a harmful misconception. It's much more useful to think of it as an infinite design process. But let's not get ahead of ourselves.

As another bit of linguistic trivia, there is a proposed improvement to the PDSA cycle by Lépine Kong, who replaces "Do" with "Experiment" to make that kaizen spirit more explicit.

Study

This is the most interesting step historically and linguistically, so I'll spend some more time on those aspects, especially since I believe they're important for a deep understanding of the cycle.

The PDSA cycle was formalized by William Edwards Deming, an American statistician who first worked for the US Army and then taught in Japan in the 1950s. The inspiration for the cycle came from another statistician, Walter Shewhart, which is why it's commonly known as the Deming-Shewhart cycle. The concept was quickly adopted by Japanese universities and industry, and it became the foundation for kaizen, Taiichi Ohno's Toyota Production System, and finally all things Lean. As a side note, while Deming doesn't get all the credit for "the Japanese economic miracle", he was awarded a high order in the country and has a prize named after him.

You may have encountered a variant of the cycle we’re talking about, called PDCA, where the C stands for Check. If you think about it, Check and Study can be considered synonyms but they have a different vibe to them. Check is quick. You check, you’re done. A check can be as simple as a glance: Is there a button? Yes, checked. Study, on the other hand, has a very different meaning. Study is meticulous, deliberate, slow. Study implies gaining deep understanding.

It’s the difference between checking what kind of material your pants are made of versus studying the environmental and social impact of producing that particular material in that particular color. In other words, check is how you avoid regression, but study is how you improve.

That was the intention when Deming first introduced the cycle of continuous improvement. You may find information online claiming that PDCA and PDSA are two distinct, purposefully created tools aimed at different scenarios, but that's simply not true; it's an ex post facto rationalization. According to Lépine Kong, who does a lot to promote this information, PDSA expresses Deming's original intention, while PDCA is a distortion resulting from people's tendency to cut corners, and it pissed Deming off to no end.

The fun part is that the first people using Study and Check interchangeably may have still had the original idea fresh in their minds, so it didn't make much difference. It definitely does now, though, and the cycle is widely misused and misunderstood as a result. It's an interesting lesson about passing ideas on in the form of acronyms, clever names, and mnemonics, which is something our industry does all the time. It's destined to become a game of telephone: without the background and deeper study, you lack error correction and may get the wrong idea. The greatest examples of this, from our own backyard, are TDD and DRY, both of which are powerful tools when understood correctly, but extremely detrimental when abused. We will talk more about this soon.

Act

Finally, we get to the last step, where we apply what we've learned from our experiment to improve our process and, by extension, our product. This step is the vaguest, because acting on the results could mean literally anything, but I'd like to draw your attention to a couple of important points about applying this step and PDSA in general.

When researching this, I came across multiple explanations of PDSA advocating skipping this step if "the experiment has failed". What they mean is that you run a PDSA cycle on the side, to experiment with a new approach, and if that approach doesn't produce the results you expected, you don't act on it. You don't introduce it as part of the regular process, and the process remains static. This misses the whole point of PDSA.

First of all, the only way an experiment can fail is if you learn nothing from it, which is kind of impossible… You may not have learned what you wanted to learn, but that's a valuable lesson in its own right (even if just about your PDSA proficiency). The real point of PDSA, as with any feedback-driven process, is learning. Improvement comes from learning.

Secondly, PDSA is meant to be the foundation, not a side dish. It's kind of funny, because the idea here is that quality products and profits are a side effect of sanity, learning, and constant improvement. And I just can't help but stick the Toyota looms story in here, so let's go with it…

Before Toyota started making cars, it made automated looms (and actually still does). At some point the designs for these were stolen, to which Kiichiro Toyoda responded that his company makes progress so quickly that the stolen designs were already obsolete, and that without the philosophy and failures that led to them, the thieves would never make progress. Thus, no need to worry about a sudden rise in competition. And I'm going to go as far as to say that any company which can't say the same, especially these days, is doomed.

Another important thing to remember is that technical problems, which we’re attracted to and focused on as programmers, are the easy ones. Usually, if you look close enough, you’ll find that the Act stage should be applied to people, interactions, and processes. Communication problems, lack of common language, diverging ideas, and wasteful processes are much more important issues to look into when studying each iteration than whatever happens in the code.

Fractal

Ok, so now we’ve finished our cycle and are ready to start a new one, but there’s a twist here. You see, it's not a single cycle. It's cycles within cycles ad infinitum.

PDSA is supposed to be a fractal. A step can be a single operation that requires no further granularity, but it doesn’t have to be. The example I like to use, which most programmers can relate to, is TDD cycles versus Scrum.

In Scrum you have planning, the sprint, the retrospective, and the retro's action points. That last bit makes the Act stage a bit implicit, but the other three map directly to Plan, Do, and Study. TDD, on the other hand, is usually explained as red, green, refactor. That looks one step short, but in reality refactor is Study and Act packed into one word — which, by the way, is why I tend to teach with PDSA instead of RGR. So both can be thought of in terms of PDSA, which is why understanding PDSA helps you understand both Scrum (or better yet, Extreme Programming) and TDD.
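To make that mapping tangible, here's the hypothetical slugify sketch one more time, annotated with the PDSA stages as I read them (the labels are my own, not any official notation):

    import re

    # Plan (red):  the failing test stated the hypothesis.
    # Do (green):  the naive implementation made it pass.
    # Study:       reading the green code, we notice it keeps stray
    #              characters like "?" that don't belong in a URL.
    # Act:         fold that lesson back in while keeping the test green,
    #              and let it seed the next Plan.
    def slugify(title):
        return "-".join(re.findall(r"[a-z0-9]+", title.lower()))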

A typical Scrum sprint has a scale of a week or two. TDD, on the other hand, happens within a sprint, and each iteration takes a couple of minutes. That means multiple tiny PDSA cycles within a single Do stage of a larger cycle. Refactoring is another example: while it can be a single operation (like renaming), it can also be a couple of even smaller PDSA cycles (not to be confused with a refactoring sprint!).

Plan, again

After a single iteration is finished comes the time to decide on the next plan, and this is also where kaizen and the idea of having a single priority come in. Once you've reflected and acted on the experiment, what is the new best way to use your time? What experiment should you perform next? Is there a specific pain point you should address? Most importantly, should you keep iterating on the same thing, or is it good enough?

Let's say you're doing TDD. After you've gotten a single test to pass and refactored the implementation to your liking, is there another test to write and implement for that particular feature? Maybe you've come across an obvious edge case or vulnerability you feel needs to be addressed? Maybe you feel like a single test doesn't give you enough confidence and you want to harden your implementation by adding more? Or maybe it's fine and you should move on to the next "unit".
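In the slugify sketch, that next plan could be one more hypothesis about an edge case you just spotted:

    # Next hypothesis: slugify() also survives messy, real-world input.
    # If this passes right away, we've learned the current implementation
    # already covers it; if it fails, we've found our next Do.
    def test_slugify_handles_punctuation_and_extra_spaces():
        assert slugify("  Hello,   World? ") == "hello-world"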

This need to keep track of the big picture sometimes leads to adding an extra “Observe” step at the beginning. For me, though, that’s just another manifestation of the fractal nature of the cycle. Cycles within cycles, each focused on getting feedback and learning at a different scale. After finishing a cycle, you just shift your focus to the encompassing cycle and let that guide you. That said, if you feel like having an explicit "Observe" step works for you, by all means utilize it.