By Sander Hoogendoorn

December 2025

A short tale about dumping kerosene and building the right things right

Every once in a while, every manager in every organization gets the brilliant, yet uncanny, idea to measure the productivity of software delivery—or, to put it more precisely, to measure the individual productivity of software developers.

Not long ago, standing outside in the early spring sun, I enjoyed a pleasant conversation with the COO of one of my clients. Again, measuring productivity was the topic, as in many similar conversations I’ve had with managers over the past thirty to forty years.

Counting commits


“I think we should be able to measure who’s good and who’s sufficient, or even not so good,” the COO philosophized. “What if we count the number of lines of code they write,” and seeing my surprised reaction, “or perhaps better, the number of commits they do. That should give us a clear indication of their productivity, right?”

Well…

“There are a few things you should consider,” I replied. “First of all, the code being written only covers a small part of the time spent by software developers. Most of the time goes into discussing matters with the business, contemplating and designing a solution, which mainly takes place in their heads, and let’s not forget meetings and stand-ups.”

“Secondly, and I think even more importantly, developing software is not an individual sport, like tennis or even chess; rather, it is typically a team sport, more akin to volleyball or hockey,” I continued. “How can you measure productivity based on lines of code or commits if multiple people – sometimes two, but often even more – have collaborated on the solution?”

Dumping kerosene


But most importantly, I forgot to add that there’s this thing called Goodhart’s Law, which can be freely interpreted as: when a measure becomes a target, it ceases to be a good measure.

A good example of Goodhart’s Law was presented to me by a friend during a recent conversation. At one point, the Royal Dutch Air Force wanted to ensure that its air bases logged sufficient flight movements and decided to measure the amount of kerosene being used. More kerosene indicated more movements, which suggested a more active air base. It sounds simple, but in practice, this measure led personnel at air bases to dump kerosene instead of flying more, resulting in highly polluted grounds.

In the case of developers, let’s assume that an organization would simply count the number of commits people make and consider more commits as indicative of greater productivity. Developers might then break their code into smaller parts and check them in to appear more productive. Although smaller commits are, in fact, more productive, counting them is still not a good measure. Developers will soon begin dumping kerosene. 

The question is: can (and should) you measure software delivery productivity, and can teams improve themselves using measures, without falling prey to Goodhart’s Law?

The DORA metrics


This is where DORA comes in.

DORA is a widely recognized set of metrics that has become an industry standard. Developed by the DevOps Research and Assessment (hence DORA) team, it is based on years of surveys and studies involving thousands of teams worldwide. This research consistently demonstrates a strong correlation between high performance in software delivery and improved organizational outcomes, including increased revenue, market share, and customer satisfaction.

In other words, the more skilled you are at delivering software while balancing speed, quality, and stability, the greater the benefits for your organization.

This effect is evident and direct in my current role as CTO at iBOOD. The faster and more effectively our tech teams deliver, the higher our revenue and customer retention rates will be, and the better the quality of our business processes will be. 

DORA introduced four key metrics that distinguish high-performing teams from the others. These are:

  • Deployment Frequency. How often does a team deploy code to production?
  • Lead Time for Changes. How long does it take a commit to reach production?
  • Change Failure Rate. What percentage of deployments fail?
  • Mean Time to Recovery. How quickly can teams recover from such a failure?

Together, these four metrics clearly illustrate how quickly, safely, and reliably teams can ship changes to users.
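For the more hands-on reader, the four metrics can be computed from a simple log of deployments. The record layout below (commit time, deploy time, failure flag, recovery time) is an illustrative assumption of mine, not a standard DORA format:

```python
# Minimal sketch: computing the four DORA metrics from deployment records.
# The tuple layout (commit_time, deploy_time, failed, recovered_at) is an
# illustrative assumption, not a standard format or API.
from datetime import datetime
from statistics import mean

deployments = [
    # (commit_time, deploy_time, failed, recovered_at)
    (datetime(2025, 3, 1, 9, 0), datetime(2025, 3, 1, 9, 12), False, None),
    (datetime(2025, 3, 1, 10, 0), datetime(2025, 3, 1, 10, 14), True,
     datetime(2025, 3, 1, 10, 40)),
    (datetime(2025, 3, 2, 11, 0), datetime(2025, 3, 2, 11, 11), False, None),
]

# Deployment Frequency: deploys per day over the observed window.
days = (max(d[1] for d in deployments) - min(d[1] for d in deployments)).days + 1
deployment_frequency = len(deployments) / days

# Lead Time for Changes: minutes from commit to production, averaged.
lead_time = mean((d[1] - d[0]).total_seconds() / 60 for d in deployments)

# Change Failure Rate: fraction of deployments that failed.
failures = [d for d in deployments if d[2]]
change_failure_rate = len(failures) / len(deployments)

# Mean Time to Recovery: minutes from failed deploy to recovery, averaged.
mttr = mean((d[3] - d[1]).total_seconds() / 60 for d in failures)

print(f"{deployment_frequency:.1f} deploys/day, lead time {lead_time:.0f} min, "
      f"CFR {change_failure_rate:.0%}, MTTR {mttr:.0f} min")
```

With the toy data above, this prints roughly 1.5 deploys per day, a 12-minute lead time, a 33% change failure rate, and a 26-minute recovery time; real tooling would pull these records from the CI system rather than hard-code them.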

Let me give you an example of that. About two months ago, our tech board—the group of stakeholders from around the company that sets the priorities for the tech teams—decided that we should conduct a two-week proof of concept for building a more intelligent search for the website and app. We spent a week and a half experimenting and developing it, followed by a few days testing it with a smaller group of people. The new search worked, and we released the results to our customers. The effect was immediately visible in our revenue, although it is hard to measure precisely, as we have new deals daily.

The DORA metrics have been an excellent guide.

Over time, our working methods have significantly evolved through consistent examination of our status on the DORA metrics. We shifted towards smaller commits and automated our pipelines to the extreme. We are currently investigating ways to speed up automated unit, web, and performance tests, aiming to shorten the running time of our pipelines.

It’s essential to recognize that the DORA metrics serve as a starting point, albeit an excellent one.

Recently, a friend applied for a tech role at a government agency. When she was asked to have an introduction day with the team as part of the application process, they proudly informed her that they release new features in production once a year. Let that sink in. Once. Per. Year. And they were only building an employee portal. For teams like these, investigating DORA and understanding why these metrics matter is a significant improvement.

However, for elite teams, such as those deploying dozens of times per day, the DORA metrics plateau in value.

Now what?

In short, at iBOOD, we are building a new e-commerce platform that aligns with the organization’s goals and strategy. We are building it using a micro-everything architecture, where the entire landscape, including customer-facing software, consists of approximately 180 small repositories. We utilize trunk-based development, so we don’t branch and always commit to the main branch. Each repository includes its own pipeline, and each commit is automatically deployed to production if the pipeline passes all the automated tests.

We recently estimated that we make approximately 40 to 50 commits each day. Our pipelines run for approximately 10 to 15 minutes before reaching production. Not all of these commits are flawless; we also make mistakes. Approximately 20-25% of our commits contain imperfections. Colleagues report these issues on our #ask-tech Slack channel, and we usually investigate and fix them immediately in the next commit.

We have high-performing teams. I certainly think so.

Given our current situation, DORA no longer serves as our north star. We can’t improve much further using its metrics. Of course, we can attempt to reduce the running time of our pipelines even more—we are doing that. However, cutting our pipelines from 10 to 8 minutes will not significantly enhance our performance. So, what’s next?

Building the right things

My take is that once teams are high-performing and build things right, the next step is to investigate how we can build the right things.

Given that there will always be more topics people want from our tech teams than we can handle, and that the relevance of those topics shifts quickly and heavily over time, the most crucial question is whether and how the time we spend adds the most value to the business.

To properly discuss and compare the value of all the topics we spend our time on, we established a committee, known as the tech board. All major departments within the company are represented on this board, and we meet every two weeks.

Whenever someone introduces a new topic to the tech board, they briefly pitch it. During this pitch, it is essential to emphasize the value a topic will deliver to the company. This value can be expressed in higher revenue, better retention, more items per order, or by automating some processes more effectively, which might save work in other departments.

For every topic that comes in, we ask ourselves the following questions:

  • Does it help us get closer to our strategic goals? If not, we will not spend more time on it.
  • Can we actually do this? Do we have the skills, platform, and technology to proceed?
  • Is it small enough to build? If it is not, the topic would take too much time to develop, and during that time, the teams cannot work on other topics that might arise later and be even more relevant.
  • Do we really need this now? Is this topic the most relevant to work on now? If it is not, we’ll mark it as someday-maybe, that is, something we want, but not now.

Answering these four questions for every suggested topic triggers discussion among the tech board members, validating the added value of the topics against other items on the board. It ensures that our tech teams not only build things right but also build the right things.
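The triage described above can be sketched as a simple decision function. The topic fields and outcome labels below are hypothetical names I've chosen for illustration; the actual tech-board process is a conversation, not code:

```python
# Hypothetical sketch of the tech board's four-question triage.
# Field names (strategic_fit, feasible, small_enough, needed_now) and the
# outcome labels are illustrative assumptions, not the author's tooling.

def triage(topic: dict) -> str:
    """Classify a pitched topic using the four tech-board questions."""
    if not topic["strategic_fit"]:
        return "reject"          # doesn't move us toward strategic goals
    if not topic["feasible"]:
        return "reject"          # we lack the skills, platform, or technology
    if not topic["small_enough"]:
        return "reject"          # too big: would block more relevant work
    if not topic["needed_now"]:
        return "someday-maybe"   # something we want, but not now
    return "build now"

smarter_search = dict(strategic_fit=True, feasible=True,
                      small_enough=True, needed_now=True)
print(triage(smarter_search))  # build now
```

The ordering mirrors the list above: strategic fit and feasibility are hard gates, size keeps the teams free for more relevant work later, and timing decides between building now and the someday-maybe pile.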

In addition to continuously improving against the DORA metrics, this straightforward mechanism for jointly deciding which small topics to focus on ensures that we get the utmost value from our technology.

That is why we can build a complete and thriving e-commerce platform with a tech team of fewer than fifteen people, without many ceremonies or processes other than a solid architecture and fully automated deployment pipelines. No Scrum, no sprints, no additional roles beyond developers, very few estimates, only ballpark figures, no retrospectives, no blaming, and no managers measuring individual developers by lines of code or commits.

Delivering software is a team sport.