Pipeline stalls and the human brain

It’s amazing how often I am stricken by how closely, in some ways, the human brain mimics the behavior of a CPU in day-to-day work and interaction with other people or organizations. The analogy works particularly well for optimizing productivity in the workplace (at least I find that to be the case as a developer).

Consider a productive day programming, being in the zone. Very likely, this involves little interaction with others, and few interruptions. In fact, it seems most programmers feel the most productive when they are able to sit down and just work continuously, with focus, for an extended period. In my analogy, this is executing a tight loop with all instructions and memory fetches in cache.

An example of the opposite is when you are blocking on an external entity. Having to interrupt your work to wait for something else (outside of your control) to complete is analogous to a pipeline stall. Examples include:

  • Waiting for the compiler to build your software.
  • Waiting for a deployment step to complete.
  • Waiting for a slow web page to load when searching for documentation.
  • Waiting for someone to answer a question for you.
  • Waiting for someone to give you access to something so that you can complete testing or deployment.
  • Waiting for someone to review something.

It is worth noting that blockages can be either technical or human or a combination thereof.

Given a fixed amount of time spent working, and assuming you wish to be productive, clearly you do not want to twiddle your thumbs while waiting for various things that you’re blocking on. For example, if you have to wait 3 hours for someone to review your code prior to merging to the production branch, you do not want to sit there idling for 3 hours. The solution to this is typically to context switch to something else (in other words, you are engaged in concurrent execution of multiple tasks). Whenever you reach a blocking point, you context switch to an appropriately non-blocked (runnable) task.

What happens when everything you are working on is blocking on something? You start working on something else, increasing concurrency. As is generally recognized, context switching is expensive for the human mind (just as it is for an operating system on a CPU). In this case, as you increase concurrency, you will either be context switching more often (each context switch carrying with it some overhead) or you’re context switching to tasks that you, on average, worked on a longer time ago. In other words, all the caches relevant to the switched-to task will be colder.

Either way, all forms of blocks are expensive in the sense that overall productivity is decreased.

A way to mitigate the problem is to predict future choices or over-commit on options (i.e., branch prediction), thereby causing would-be blocking points to block for less time or not at all once you reach them (i.e., pre-fetching).

While the analogy is fun to draw, the main point here is that blocking on external entities is very expensive and in order to remain productive such blockages should be minimized. It follows that whenever you do absolutely have to block, you want to do so for a short a time as possible. In other words, the latency of external dependencies is crucial to productivity.

Interestingly, this implies that while a single individual or team may be considered most productive when they focus only on the task at hand, reality complicates matters. Suppose you have people or teams that have dependencies on other people or teams in a working environment. Every time you interrupt your work to help someone else, you lose some productivity. But someone else is gaining because you are decreasing latency for them.

I believe that in reality, a reasonable balance has to be kept in terms of quickly servicing external requests vs. focusing for productivity. I also believe that typically, this is ignored completely when arguments are put forth that you should e.g. only check your E-Mail once per day or refuse to do anything other than your main task for an entire day.

People, scrum teams, whatever it is – don’t exist in a vacuum and the greater good has to be considered in order to maximize global productivity.