At Stepsize, helping high-growth software companies measure and manage technical debt is our business. And this means we get to spend our days with some of the best engineering minds out there.
We've listened to their stories, and analysed hundreds of thousands of lines of their code and contextual data, to identify the signals of technical debt in all the noise. Following on from our broader definition of tech debt, in this article we'll share some examples of technical debt that we've spotted in the wild, to help you identify it in your own codebase—before it catches you off-guard.
Let's imagine we have a company that's developing a new product. Because we're sensible people, we don't over-engineer anything, and we stick with tried and tested tech that we know well. After all, we need a prototype quickly, to test our product and business idea.
We build simple models in a simple SQL database and decide to use integers as IDs. We use AUTO_INCREMENT to create a new ID—so far, it's all straightforward enough.
Now, fast-forward a few years. Our MVP has become outrageously successful and business is booming. We extended the SQL database so that it could handle all the amazing features our users requested, and we still use AUTO_INCREMENT. We don't really have the time to rebuild anything, or do anything fancy, because we're growing like crazy and, like the man said, if it ain't broke, don't fiddle with it.
But one day, after we've poured hundreds of thousands of dollars into our marketing engine, new user registration takes a nosedive. The team scrambles to figure out what went wrong and there's total panic.
Then, one of our engineers comes along and sheepishly says, 'We ran out of integers.'
'Excuse me?' we say, incredulous. 'We ran out of integers?'
We've seen this play out in real life. It's true that had this tech debt been properly managed (i.e., consciously taken on and tracked), it would've been caught before it caused such mayhem. But what makes this an example of 'good' technical debt is the fact that we did not over-engineer the initial MVP by trying to accommodate a highly uncertain future.
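To make 'tracking' this debt concrete, here is a minimal sketch of a headroom check that could run on a schedule. It assumes the ID column is a signed 32-bit integer (as a default MySQL `INT` would be); the function names and the 20% alert threshold are hypothetical, not something our imaginary company necessarily had.

```python
# Largest value a signed 32-bit integer column can hold.
# Assumption: the ID column is a signed 32-bit INT.
INT32_MAX = 2**31 - 1  # 2,147,483,647


def id_headroom(current_max_id: int, limit: int = INT32_MAX) -> float:
    """Return the fraction of the ID space still unused (0.0 to 1.0)."""
    if current_max_id < 0 or current_max_id > limit:
        raise ValueError("current_max_id must be between 0 and limit")
    return (limit - current_max_id) / limit


def should_alert(current_max_id: int, threshold: float = 0.2) -> bool:
    """True when less than `threshold` of the ID space remains."""
    return id_headroom(current_max_id) < threshold
```

Feeding this the current maximum ID (for example, from a `SELECT MAX(id)` query) turns a surprise outage into a ticket that can be scheduled months ahead.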
Back at our hypothetical company, a customer of our system submits a feature request: they would like to see sales figures for regions within the countries we display in our dashboard. Easy enough, right?
So that we can ship something quickly, we decide to use a convention to represent regions as countries with their internal short code.
This does the trick. All we have to do is understand this convention in one place in the UI and we can display regions under countries.
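As an illustration only, the shortcut might look something like the sketch below. The exact convention is not spelled out here, so the `COUNTRY-REGION` format (for example `US-CA`), the separator, and the function names are all assumptions.

```python
from typing import Optional

# Hypothetical convention: a region is stored as a "country" whose code is
# "<country>-<region short code>", e.g. "US-CA".
SEPARATOR = "-"


def split_code(code: str) -> tuple[str, Optional[str]]:
    """Split a stored code into (country, region); region is None for plain countries."""
    country, _, region = code.partition(SEPARATOR)
    return country, region or None


def group_regions(codes: list[str]) -> dict[str, list[str]]:
    """Group region short codes under their parent country for display."""
    grouped: dict[str, list[str]] = {}
    for code in codes:
        country, region = split_code(code)
        regions = grouped.setdefault(country, [])
        if region is not None:
            regions.append(region)
    return grouped
```

The appeal is that only the UI needs to know about the separator. The cost is that everything else in the system still believes a region is a country.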
After a couple of months, our customers tell us they'd like to see sales commission data broken down by countries and regions across our UI and the reports we put together for them. Suddenly, this simple conceptual change requires us to restructure the domain model in the database, code, and UI, which takes a significant amount of time.
Business requirements change and we often can't predict exactly how they will do so. This is a frequent cause of tech debt: code written yesterday can become a burden today.
Again, there's a right way to manage this technical debt: expectations need to be set clearly with all stakeholders, and we simply need to put in the work to restructure the domain model. But taking it on in the first place was not a mistake. The system we'd designed matched the requirements at the time, which were based on our understanding of the problem at hand. That's the best anyone can do.
At our imaginary company, we've decided to refactor our codebase into a highly-distributed architecture (the reasons for this need not concern us here). We have microservices everywhere, neatly split up, each with a single responsibility, and their own set of dependencies.
Our company has grown even more and we've gone international. Our system now has to accommodate multiple currencies and we've been tasked with shipping this feature. It seems like the job will be isolated to a specific microservice. The feature is easy to build, and we ship it. Victory!
Or so we think. Alarm bells start going off; PagerDuty is desperately trying to get our attention. The system is down and the on-call engineers scramble to fix the issue. They collectively raise their palms to their faces when we tell them about the cause of the outage.
As it turns out, the microservice we modified hadn't been deployed in almost a year, so no one realised that it wouldn't be backwards compatible with a few other microservices which depend on it. We simply added an extra value (to specify the currency) and modified a few parameters in the data returned by the API. Some of the other microservices which interacted with this API didn't handle this exception, so they just keeled over. Welcome to 'dependency hell'.
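One common way to soften this kind of breakage is for consumers to read only the fields they need and to tolerate ones they don't recognise. The sketch below shows that idea for a hypothetical price payload; the field names, the `Price` type, and the default currency are assumptions made for illustration, not the actual API from this story.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical default for payloads produced before the currency field existed.
DEFAULT_CURRENCY = "USD"


@dataclass(frozen=True)
class Price:
    amount: Decimal
    currency: str


def parse_price(payload: dict) -> Price:
    """Read only the fields we need and ignore anything unrecognised."""
    if "amount" not in payload:
        raise ValueError("payload is missing required field 'amount'")
    return Price(
        # Go through str() so floats and strings convert predictably.
        amount=Decimal(str(payload["amount"])),
        # Older producers send no currency, so fall back to the default.
        currency=str(payload.get("currency", DEFAULT_CURRENCY)),
    )
```

A consumer written this way keeps working when a producer adds a field. It does not protect against a producer renaming or removing a field, which still calls for versioning or contract tests between services.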
Kellan Elliott-McCrea expands on this in his piece, 'Towards an Understanding of Technical Debt', and Srinath Perera explains how to avoid dependency hell. Give them a read; they're well worth it.
Let's now suppose we're working on a codebase using the good old MVC approach. We've followed the 'thin controller and fat model' paradigm. Our original team of seven engineers has now grown into ten teams of seven, and it's sometimes hard to know exactly what everyone is working on, so the codebase has started evolving in different directions without us knowing it.
Growth means more entropy in our codebase, which in turn implies more technical debt. You can take a deep dive into this topic in our article on the simple reasons tech debt is inevitable, but in this MVC example, we'll end up with huge models that should be split up, or even with, say, eleven models that should be combined into one.
This is an extremely painful kind of technical debt, as the rest of our system will often heavily rely on the structure of our models. Changing the very foundation of our architecture will have ripple effects across our entire system—and tackling this is rarely just about refactoring models.
Not having enough test coverage is a classic problem. Without automated test coverage all the way up from integration level, complex code is hard to change or refactor, and problems are tougher to identify and fix. The code becomes resistant to change, as Kellan Elliott-McCrea writes in 'Towards an Understanding of Technical Debt'.
What we often fail to consider is that having too many fragile tests that break on every change, and require more maintenance than the application code itself, is also a form of technical debt.
'Conway's Law' drives this type of tech debt, and we discussed it in the second part of our series about the laws of technical debt. Put simply, engineering teams are constrained to designing systems that copy the communication structures of their organisations. This means that, if we change our system's design without also changing our org structure, we’ll be creating technical debt.
As Ankit Sobti explains in his article detailing how Postman worked their way out of dependency debt, splitting teams into Product, Service, and Platform teams was no longer appropriate for them, given their new heavily-distributed system. They had to divide each of these units into sub-units with clear ownership of the microservices within their respective realms of responsibility. Code ownership is the most important cultural characteristic in maintaining a healthy codebase and it's only achievable if our system and org structure are compatible.
When an engineer leaves our company or is reassigned, if our code isn't properly documented, or the handover isn't done properly, all the knowledge they've accumulated about the code leaves with them.
When this happens, the code in question is orphaned, which will make it hard for engineers who adopt the code to understand it. This slows them down and leads ultimately to . . . you guessed it: technical debt.
This kind of tech debt often means having to rewrite entire services, or even delete features that we assumed were bugs when browsing the code.
To illustrate this, imagine working on a pizza delivery product. Users can order pizza online, and a delivery person will cycle across the city to pick it up for them. You come across code that allows an operator to manually override the system and send several delivery people to pick up the same order. Surely that's a bug—why would you ever need to send several people to pick up the same order? So you delete it.
But again, alarm bells go off. This time, it's the company's Head of Operations, whose voice has reached a pitch only dogs can properly hear. It turns out that several delivery people isn't always a bad idea—if a big customer orders fifty pizzas for an office party, one guy on a bicycle won't cut it. You'll need a small army of delivery people to fulfil the order.
Orders from your biggest customers will be lost until you've fixed this, all because a quirk of the code wasn't properly documented and your departing ex-colleague never mentioned it when you took over.
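A docstring and a small test are often enough to keep this kind of knowledge from walking out of the door. The sketch below is hypothetical (the function, its parameters, and the courier names are invented for illustration), but it shows how the reason for the override could live next to the code itself.

```python
def assign_couriers(
    order_id: str, couriers: list[str], manual_override: bool = False
) -> dict[str, list[str]]:
    """Assign couriers to an order.

    By default an order gets exactly one courier. An operator may set
    `manual_override` to send several couriers to the same order. This is
    intentional, not a bug: a large order (for example, fifty pizzas for an
    office party) is more than one person on a bicycle can carry.
    """
    if not couriers:
        raise ValueError("at least one courier is required")
    if len(couriers) > 1 and not manual_override:
        raise ValueError("several couriers require manual_override=True")
    return {order_id: list(couriers)}


def test_operator_can_send_several_couriers_to_one_order():
    # Guards the override so nobody "cleans it up" as dead code.
    assignment = assign_couriers(
        "order-50-pizzas", ["ana", "ben", "cy"], manual_override=True
    )
    assert assignment == {"order-50-pizzas": ["ana", "ben", "cy"]}
```

Anyone who deletes the override now has to delete a test whose name explains exactly what they are about to break.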
You'll notice that none of the examples above is an excuse for bad code, or for being unaware of best practices at the time the code was written, or indeed for any form of incompetence. As we discussed in our broader definition of technical debt article, even the best engineers in the business can't avoid tech debt, but they do know how to handle the fact that 'software exists in a world of uncertainty'.
A mess is not a technical debt.
We do not want to accept a mess as technical debt. Ignoring good design practices and current coding patterns is unacceptable, and we must concede that the code we write today will need to be modified tomorrow. It is important that we make this distinction, because bad tech debt will spread like a virus, and we can't have that. We built Stepsize to help you avoid this. You can start managing technical debt effectively today: try it out for free!
For a meta-framework to help you think about all these different types of tech debt, I invite you to read about Martin Fowler's 'Technical Debt Quadrant'. He makes the distinction between prudent and reckless tech debt, each of which can be split into deliberate and inadvertent tech debt.
What other examples of tech debt have you come across? If you had to separate them into 'buckets', what would you call these buckets? We'd love to hear about your experiences with tech debt, so please hit us up on Twitter @AlexOmeyer or @StepsizeHQ.