Some well-established truths of software engineering are becoming less true as AI tools mature. One of them: technical debt always accumulates until a system that was originally well-designed, modular, and extensible turns into a legacy mess of undocumented, untested special cases, to the point where replacing it becomes more economically sensible than maintaining it.

One might think: this is normal. All things age and degrade over time until they have to be replaced. But this is not true of software. Yes, hardware breaks or becomes obsolete, but the software running on it can be transplanted, without loss, to different, newer hardware. Software is not material; it is pure information, which physics tells us is preserved under all circumstances. (Well, that is simplifying a bit to make a point.)

What breaks software is not physical degradation but complexity. Some of that complexity comes from new features and requirements. A lot of it, however, comes from technical debt: old code that was never restructured to accommodate and reflect new requirements; features implemented without taking sufficient time to understand how they affect the system architecture; tests that were never written and so cannot protect against regressions into old errors, which in turn produce even more code to work around them without breaking everything else. This is undesired, accidental complexity. Where material things degrade physically, technical debt is the name for the degradation, the rot, that affects software, even though the bits and bytes themselves do not decay.

Relentless and Cheap

AI has two properties that matter here: it is relentless and it is cheap, much more so than any human mind.

As a relentless agent, it will comb codebases and log files for obsolete code paths, untested edge cases, and simplification opportunities, explaining obscure properties of system behavior and asking smart questions so that those properties can be made explicit through architectural choices or documentation. As it gradually drives test coverage toward 100%, it builds the foundation it needs to fix those ten thousand warnings your compiler spat out, the ones the human development team only scanned superficially and then discarded because the next feature deadline was approaching fast. It will analyze those priority-4 bug reports and feature requests that never made it into a sprint.

As a cheap agent, it scales. Running a hundred such agents doing all of those things at the same time on all parts of the codebase is viable where hiring another one hundred human developers was simply not economically possible. Does AI replace those one hundred developers? Not at all. They were never going to be hired in the first place. But now the work may actually get done.

Imperfect Tools, Reliable Results

One property that AI doesn’t have, as we have all learned by now, is perfection. Say “hallucination” with me. But that’s not the only cause of errors AI can make. When task complexity exceeds a certain limit that is specific to each model generation, thinking appears to break down and the quality seems to fall off a cliff. Some models have particular flaws, such as being notoriously bad at understanding geometry, which makes them bad at UI design tasks. These are just examples. Suffice it to say, AI tools are not immune to making mistakes.

However, software engineering as a discipline has come up with approaches to handle (human) error that extend to the new AI world. For example, Test-Driven Development is the practice of writing a test before writing or changing the actual code that implements a certain behavior. TDD has remained an ideal that rarely gets reached because tests are just one of those things that often get skipped under pressure, resulting in technical debt.
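The test-first loop described above can be sketched in a few lines. This is a minimal illustration, not anyone's production workflow; `slugify` is a hypothetical helper chosen only to show the order of the steps:

```python
import re

# Step 1: write the test first. It fails until the implementation exists.
def test_slugify():
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  spaced  out  ") == "spaced-out"

# Step 2: write just enough code to make the test pass.
def slugify(text: str) -> str:
    """Lowercase the text, drop punctuation, join words with hyphens."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)

# Step 3: run the test. Green means done; red means iterate on step 2.
test_slugify()
```

The discipline lies entirely in the ordering: the test encodes the desired behavior before any implementation exists, so the implementation is always written against an executable specification.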

AI agents, being relentless and cheap, can work this way. And a good test is binary: it passes completely or fails completely, so its outcome can be assessed by algorithmic, non-AI code. With the right harness, an AI can build tests reliably, and it can modify code reliably without breaking tests. Now this doesn't protect against all problems: some tests must be broken to progress. Tests do not protect against Heisenbugs (bugs that don't appear in the presence of a test), and they don't reliably discover undesired emergent system behaviors such as race conditions or resource contention. And of course, all components passing their tests doesn't mean the system as a whole is correct. But tests go a long, very long way toward reliable engineering. And who knows, once the basic test coverage is high enough, we might proceed to task our AI tools with more sophisticated techniques, such as creating test harnesses and system proofs that discover some of those pesky difficult edge cases.
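That binary property is what makes an algorithmic gate possible: plain, non-AI code can decide whether an AI-proposed change is acceptable. Below is a minimal sketch under stated assumptions; the test callables and the accept/reject decision stand in for whatever real test runner and merge tooling a team actually uses:

```python
# A non-AI gate: a proposed change is accepted only if every test passes.
# Because test outcomes are binary, no judgment call is required here.
from typing import Callable

def gate(tests: list[Callable[[], None]]) -> bool:
    """Return True only if every test passes; any exception counts as a failure."""
    for test in tests:
        try:
            test()
        except Exception:
            return False
    return True

# Example: two trivial tests guarding a candidate change.
def test_addition(): assert 1 + 1 == 2
def test_truth(): assert True

if gate([test_addition, test_truth]):
    print("accept patch")   # all green: the AI's change may be merged
else:
    print("reject patch")   # any red: send the change back for rework
```

The point is that the verifier is deterministic and dumb on purpose: the creative, error-prone work (writing code and tests) goes to the AI, while acceptance stays with code that cannot hallucinate.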

Software Can Outlive Its Creators

So what does this leave us with? Well, we know that software systems can be maintained for very long periods of time. Consider the Voyager probes, launched in 1977, almost 50 years ago. Despite running on literally decaying hardware, their software has remained operational to this day. I am not comparing the genius minds of the Voyager space mission teams to some AI coding tool you just downloaded from the Internet. But the point stands: with sufficient care, a software system can survive a very long time. And even if your funding and staffing are nowhere near NASA's, you can maintain and extend your systems for a very long time with the help of AI tools.

The Migration Gamble

But is it worth it? Why not just build a new system or use a SaaS solution? Of course it depends. Current large-scale legacy systems often date back to the late 1990s, when Java entered the picture for enterprises. Some roots may reach back as far as the 1970s, with IBM System/360 as the original foundation. And your data analysts probably use Python libraries whose algorithms were first implemented in Fortran, then ported to C++, and now to Python. That is code refined over six decades. Whether 30+ years of compound expertise, tailored to your business and your customers, is worth salvaging is not for me to decide.

However, migrating to a new system (and really it will be a family of systems, with a plethora of integration points) always poses a significant business risk. Plenty of those migration projects fail, or end up with yet another system accumulating technical debt next to, not instead of, the legacy. Often, they cost tens or hundreds of millions, take years to deliver, and delay much-needed business innovation. The very promise of cutting ties with an encumbering past and speedboating into a bright, innovative future ends up postponed, tied down by the grind of migration and integration.

Now it can be argued that those tasks can also be accelerated and stabilized with AI tools, and that is true. However, there remains significant business risk. Just think of the change management in the business itself: How will your employees, your processes, your customers adapt? With legacy modernization now becoming a real alternative, why take that particular risk in the first place? I cannot give you the answer in your specific situation. But I can tell you that:

Software systems no longer have to die of accidental complexity due to accumulating technical debt.

What About the Humans?

Does all this mean that the original team maintaining a system will be replaced with AI tools as well? As I’ve explained in my article on plurality and AI, I don’t think so. Constructive disagreement is the force that drives innovation, and AI has a hard time emulating thousands of humans doing just that. My advice: Foster a safe team culture where voicing disagreement is encouraged, especially when it is constructive. And of course, encourage your teams to leverage AI tools where they can. It will be very expensive not to.