It's a surprisingly cool Saturday morning, and I am huddled in the living room, trying to focus on writing amidst the background noise of “Minions - The Rise of Gru”, playing on repeat for the fourth time this month. As tempting as it is to get involved, I'm sticking to the approach I've taken at work over the past few days.
This week, I've deliberately kept out of my team's hair, burying myself in research and upgrades for some vital platform systems. Aside from the fact that this needed to be done anyway, I've always believed that sometimes, the best thing managers can do is to step back and let their team get on. As I watch from afar, I can't help but notice how smoothly they're progressing on a feature they've been wrestling with for a while. It's fascinating how removing yourself from the picture can produce excellent signals about your team's capabilities.
This week, I want to discuss how we measure team productivity and success in the tech world. Our industry is obsessed with metrics—story points, velocity, lines of code—you name it, someone's probably tried to measure it.
In fact, these metrics often tell us more about our management style (and our lack of understanding of how to measure technical teams) than our team's actual productivity. Worse, they can lead us down culturally toxic paths of varying severity, one of which I want to talk about today: "The Cobra Effect."
Credit: https://unsplash.com/photos/a-brown-snake-on-the-ground-near-a-tree-sazD8ZHV_VI
The Cobra Effect: Solutions Biting Back
The term “Cobra Effect” originated from a story dating back to British colonial rule in India. The story goes that the British government were concerned about a high number of venomous cobras in Delhi, so they offered a bounty for every cobra caught and destroyed. Initially, this worked out great—people brought in dead cobras, and the cobra population decreased. Problem solved, right?
Wrong. Enterprising folks soon started breeding cobras to cash in on the bounty. When the government caught wind of this, they scrapped the programme. Story over? Not quite…
All of those cobra breeders were left with hundreds of now-worthless snakes. So what did they do? They released them.
The result? More cobras than when they started. 🤦
The Cobra Effect in Software Development
When we measure teams, we need to think carefully about what potential cobra effects lie waiting to be unleashed. That is why metrics like velocity or lines of code are considered terrible metrics: they can be easily gamed. Here are a few of the effects these metrics can produce:
Code Padding: To increase the number of lines committed, developers can simply start writing unnecessarily verbose code. Suddenly, what could have been an elegant one-liner becomes a 20-line monstrosity. This metric cuts the other way, too: tech debt that should be removed gets left in place so the line count doesn't take a hit.
Velocity Inflation: Velocity is an internal representation of a team's domain knowledge and confidence in their work. A consistent velocity means the team is stable, and the domain they’re working in is known. However, when used as an external metric, teams can feel somewhat obliged to start overestimating effort or creating many easy-to-complete tickets to artificially inflate their velocity. Additionally, it becomes a “pot” of capacity that management always aims to fill.
Feature Bloat: When teams are measured by the number of features shipped, they end up pushing out half-baked features that nobody needs or uses. The assumption that users always want new features is unfounded; as my ops director would say, “the dog should wag the tail, not the other way around.” In a recent study (by the ministry of the obvious) of UK software consumers, the vast majority ranked the reliability and consistency of existing features above a shiny new toy to play with (this was a real study, I just don't have the source to hand…).
In each of these scenarios, the underlying metric improves, but the software's actual quality and value suffer. Like those cobras in Delhi, we’ve created more problems than we solved.
A Better Way: DORA Metrics and Beyond
So what should we measure if these traditional metrics are so problematic? Well, we need to find a balance between inputs and outcomes, and keep the measurements team-bound rather than attributed to individual contributions. Enter the DORA metrics (DevOps Research and Assessment):
Deployment Frequency: How often can you deploy code to production, and how often do you actually do it? In my opinion, how often you can deploy is the better measurement to start with, as it ensures the right controls are in place.
Lead Time for Changes: How long does it take to go from code commit to code running in production? A great measure of how long changes spend sitting in quality control.
Time to Restore Service: How long does it take to roll back and recover from a failure in production? A signal that helps you maintain SLA-bound metrics such as the Recovery Time Objective (RTO, how long restoration takes) and the Recovery Point Objective (RPO, how much data loss is tolerable).
Change Failure Rate: What percentage of changes to production result in degraded service? A good signal of the quality of your testing and test coverage.
While these metrics aren’t a magic bullet, they focus on the flow of value to the customer and the stability of your systems rather than arbitrary measures of “productivity.”
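If you want to see what these look like in practice, here's a minimal sketch in Python of how the four DORA metrics might be computed from your deployment history. The Deployment record and its fields are invented for illustration; in reality you'd pull this data from your CI/CD pipeline and incident tracker.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional

@dataclass
class Deployment:
    committed_at: datetime                   # when the change was committed
    deployed_at: datetime                    # when it reached production
    failed: bool = False                     # did this change degrade service?
    restored_at: Optional[datetime] = None   # when service was restored, if it failed

def dora_summary(deployments: list[Deployment], window_days: int = 30) -> dict:
    """Summarise the four DORA metrics over a window of production deployments."""
    if not deployments:
        return {}

    # Deployment Frequency: deployments per day across the window
    frequency = len(deployments) / window_days

    # Lead Time for Changes: median hours from commit to running in production
    lead_time_hours = median(
        (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deployments
    )

    # Change Failure Rate: share of deployments that degraded service
    failures = [d for d in deployments if d.failed]
    change_failure_rate = len(failures) / len(deployments)

    # Time to Restore Service: median hours from failure to recovery
    restore_hours = [
        (d.restored_at - d.deployed_at).total_seconds() / 3600
        for d in failures
        if d.restored_at is not None
    ]

    return {
        "deployments_per_day": round(frequency, 2),
        "median_lead_time_hours": round(lead_time_hours, 1),
        "change_failure_rate": round(change_failure_rate, 2),
        "median_time_to_restore_hours": round(median(restore_hours), 1) if restore_hours else None,
    }
```

Even a rough version like this, run weekly for the team as a whole, is enough to spot trends. The point is to watch the direction of travel, not to hand anyone a number to hit.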
But it's not enough to just measure the contributions (or inputs) teams make to software. We also need to keep our ear to the ground on outcomes and validate our decisions. We may be efficient and produce quality software, but how do we know the artefacts pushed to production are effective and resilient?
Here are a few metrics we can use to validate our inputs.
Time to First Use: Once we deploy a feature, how quickly do customers start to use it? This metric signals how well-timed you are in giving customers what they want when they need it, and it's a step in the right direction towards avoiding feature bloat.
Time to Customer Value: How quickly can a new feature go from idea to delivering what customers perceive as valuable? Or, in other words, what is the Minimal Loved Product? Does the feature pass the Slick test (Simple, Loveable, and Complete)?
Time to Defect Discovery: How quickly are bugs found after a release? Depending on how you look at this metric, it can reveal both the quality of your testing and the health of your stack. Bugs found straight away by customers signal that testing needs improving; bugs that surface much later signal tech debt, as customers stumble into edge cases.
These are just a few of the outcome metrics that I love; I'd love to hear about your favourites.
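For the curious, here's a similarly rough sketch of how the first and last of those could be derived from release timestamps and event logs. The function names and data shapes are mine, purely for illustration:

```python
from datetime import datetime
from statistics import median
from typing import Optional

def time_to_first_use(released_at: datetime, usage_events: list[datetime]) -> Optional[float]:
    """Hours between a feature's release and the first customer actually using it."""
    used_after_release = [t for t in usage_events if t >= released_at]
    if not used_after_release:
        return None  # nobody has touched it yet, which is a signal in itself
    return (min(used_after_release) - released_at).total_seconds() / 3600

def defect_discovery_days(released_at: datetime, defect_reports: list[datetime]) -> list[float]:
    """Days between a release and each defect reported against it."""
    return [
        (reported - released_at).total_seconds() / 86400
        for reported in defect_reports
        if reported >= released_at
    ]

def median_discovery_days(released_at: datetime, defect_reports: list[datetime]) -> Optional[float]:
    # A cluster of defects in the first day or two points at gaps in testing;
    # a long tail weeks later points at tech debt surfacing as customers hit edge cases.
    days = defect_discovery_days(released_at, defect_reports)
    return round(median(days), 1) if days else None
```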
The Bottom Line: Metrics are Signals, Not Targets
Remember Goal → Signals → Metrics.
Signals indicate progress; metrics are measurements of those signals. They guide us towards our objectives, but they're not objectives in themselves. When a metric becomes a target, it ceases to be a good measure. Why? Because cobras always find a way into the smallest gaps.
So, rather than obsessing over numbers, create an environment where teams can do their best work. Use metrics to identify areas for improvement and to celebrate genuine progress. But always, always keep the bigger picture in mind.
Remember, at the end of the day, we’re not here to rack up story points or write the most lines of code. We’re here to create amazing software that solves real problems for real people. And no metric can fully capture that.
So, the next time someone suggests measuring your team’s productivity by counting lines of code or velocity, maybe share the story of the cobras. Then get back to doing your best work: building awesome software.