Thoughts on AI and software development - Part 4
Side effects and unresolved questions

In the previous post, we looked at the likely short- and mid-term consequences if Steve’s projection becomes reality. Somewhat disturbed, we saw that the only likely winners of that projection would be the providers of agentic AI solutions and their investors, while everyone else would end up on the losing side of the game.
In this post, we will complete our analysis by looking at some side effects and unresolved questions that would come with such a future.
Let us get started …
A leap of faith
A projection like Steve’s raises some additional side effects and unresolved questions that are worth pondering at least briefly. We already touched on the point in the previous post that the expected software quality would become a lot less predictable, with no human really knowing anymore what is going on in the code.
To be fair: Depending on how broken and toxic your environment is, this can already be the case today. If everyone around just pushes for features while costs are continually cut and quality is nothing but lip service, the resulting software is usually a wild copy and paste from Stack Overflow and the like, with nobody caring about anything beyond passing the CI/CD pipeline. In such settings, nobody really knows what is going on and nobody really cares, because caring is not wanted – only “faster” and “cheaper” is rewarded.
Nevertheless, handing over complete control to a bunch of AI agents that create lines of code faster than anyone can comprehend them takes it to the next level. You are completely and utterly at the mercy of the AI agents doing what you want them to do – and ultimately at the mercy of the agentic AI providers. If things do not work out as expected, if the code quality turns out to be subpar, if the software misbehaves in production, there is nobody who can fix that mess – at least usually not in a time frame that your business would survive.
The AI advocates would now say that agentic AI solutions, while not necessarily being perfect from the beginning, would at least raise the lower (quality) bar, i.e., produce fewer “stupid” bugs. Additionally, the AI agents would become better and better over time. However, at the moment this is just a claim, and a company’s current “lower bar” primarily depends on how it runs its IT department and much less on the qualifications of its software engineers.
In other words: Currently, you can control how high or low your “lower bar” is. If it is stupidly low, that is almost certainly due to your mismanagement, not due to “stupid humans that need to be replaced by machines to get rid of their errors”. You can improve the conditions. You can invest in the education of your developers. You control the bar.
If you pass your whole software development over to a fleet of AI agents, you do not have any control over the quality of your software anymore, neither the lower nor the upper bar. As written before: You place the future of your company into the hands of some agentic AI providers.
You may ask yourself if this is not a bit of an exaggeration: “The future of your company”. Hey, we are only talking about software development, you may say. Well, that may have been true 30 or 40 years ago.
The ongoing digital transformation makes business and IT inseparable, as I explained in a prior post. This has several implications, as I also wrote in that post. The most notable ones are:
- You cannot change your business anymore without touching IT. Every new feature, every slight process change requires you to touch IT. But of course, this is only of value if your changed software works as you need it to work – doing the right thing reliably.
- IT has become indispensable. IT outages are not an option anymore. Highly dependable IT systems have become a must. If systems fail, your business stands still. People who depend on your software suffer, and you lose money for every second your systems do not work. Studies have shown that most companies do not survive an IT outage that lasts longer than a few hours – not necessarily due to the immediate effects but often due to the longer-term ones. They are not able to really recover from the outage. It haunts them time and again until they are eventually gone.
This means that if your systems fail – crash in production, become unbearably slow, get hacked due to security flaws, or simply exhibit a nasty functional malfunction – your viability clock starts ticking. If the clock reaches zero before you are able to fix the problem, you are in existential trouble.
You can outsource this risk to some agentic AI provider. Of course, the provider will tell you in the most flowery words imaginable that you are making the right choice, that their agentic AI fleet will actually improve the quality and reliability of your software, that it will make you faster, better and more competitive.
However, these are empty promises. The agentic AI providers want to be successful, and if you buy their solution, they are a step closer to their goal. Hence, they will tell you anything that makes you open your wallet. You will only see later whether their claims are true – in the worst case when it is too late.
Based on everything that is proven today, it is a leap of faith, and nobody can really tell you whether you will land softly or crash into the abyss. As I said before: When going for the agentic AI fleets, you outsource the control over your software – and with it the well-being of your company – to an agentic AI solution provider, at least if you go all-in on the scenario Steve projected.
None of this means that going all-in on agentic AI software development is necessarily stupid. It only means that you should be aware that you are outsourcing control over a core component of your company that must work reliably for it to survive, and that it is still an unresolved question whether this leap of faith will go well or not.
Who is responsible?
If you are a decision maker in a company where people only demand “quality” from IT but never create the conditions to implement it, where only “faster” and “cheaper” counts, where IT is considered a pesky cost center, it might not only look desirable to turn your software development department into an AI agent fleet – it might even be a sensible choice. You get software a lot faster and cheaper, and you are used to quality hiccups anyway because your whole company makes sure that IT cannot deliver quality.
But in that case you face a new problem. With humans, it is easy to find someone who is “responsible” if something goes wrong – a scapegoat to blame, to punish, to pressure into fixing things. With an agentic AI fleet, this well-trained ritual does not work anymore. Whom do you blame? The agents wrote the code. Nobody really knows what they wrote. You can put a “supervisor” in place, but in the end that person is just a scapegoat determined a priori.
The AI agent fleet produces code so fast that nobody can make sure the code created works as intended. And the rest of the company will make sure to continuously feed it with the same incomplete, inconsistent and contradictory requirements as they always did – just more of them, and faster. And the agentic AI fleet will translate all these requirements into code at a breathtaking speed.
Hence, sure, you can define a scapegoat upfront. You can yell at them if things go wrong. You can punish them. Threaten them. You can even fire them. Execute the whole well-trained and perfected show trial. However, it will not fix things. The scapegoats will know exactly what they are: Scapegoats. Maybe they play along in your ritual, but they will not care. They cannot care because it would wreck them. They will not change anything because they cannot change anything. The AI agent fleet is in control and ultimately responsible for the code it creates faster than anyone can check it.
But you cannot hold the AI agent fleet responsible.
As I wrote before: You outsource the control over your software to an agentic AI solution provider, and their license contracts will make clear in painstaking detail that they cannot be held responsible if their agents did not create the code you expected.
And if the AI agent fleet tests its own code to make sure it works as expected? Same story. Still no one to hold responsible if things go south.
Again, this does not mean you should not do it. It is just important to understand the consequences of taking such a step. Most people simply take the current status quo and apply “faster”, “cheaper” and “better” to it (even if we discussed before that especially the latter is a questionable bet). They do not ponder whether the status quo can be projected forward at all, or whether we need to rethink everything from scratch because we have changed essential building blocks – especially when things break and problems must be fixed quickly.
Outside-the-box thinking to the rescue
A particular trait of humans is that they can adapt very quickly and flexibly to unexpected situations. They are able to find creative solutions outside the well-defined procedures. If you are about to wail now: But people must stick to the procedures! It is not allowed not to follow the procedures! Well, if a bad problem occurs while everyone stuck to the defined procedures, it is very unlikely that the defined procedures will fix it. 1
Very often, it is some creative act bypassing all official rules and procedures that fixes the problem quickly. Even more often, it is this kind of “not allowed according to the governance handbook” behavior that detects and prevents such problems before they happen. If you replace your humans with an AI agent fleet, you lose this kind of resilience that humans offer. The agents will do what they are told – no matter what the result is.
From what I see, these are potential side effects that few people consider in their AI plans for software development. Steve’s projection does not consider them either. Nevertheless, we will find ourselves in such unpleasant situations from time to time, and handling them will be very different with an agentic AI fleet than it is today with humans.
Simply taking the status quo and adding “faster”, “cheaper” and “better”, as most people seem to do when making projections, is not what we will get. We will get something different, and we will need to learn from scratch how to deal with these situations – especially the adverse ones – because it will be different from what we know.
The innovation dilemma
There is another interesting side effect of the scenario I would like to touch on. If you think for a moment about the scenario where software development is handed over to AI agent fleets, the following question will come to mind:
How shall software development progress if we stop innovating?
This has to do with the cutoff problem of generative AI solutions. They are trained on data that is available at the time of training. For the current frontier models, this cutoff usually lies 6-12 months before release. Part of their functional principle is to provide the most likely answer to a question. For coding, this means they resort to the type of solution they have seen most often during their training.
We can already observe this effect today. If you, e.g., ask some of the frontier models for a JavaScript-based frontend using Svelte, they will often suggest using React instead and then show you a solution written in React. This is simply because they encountered React-based solutions far more often in their training corpus than Svelte-based ones.
Now imagine Steve’s projection becomes reality and AI agent fleets take over programming (almost) completely within the next 12 months or so. This means that from that point in time, most code will be written by AI agents backed by the aforementioned frontier models. Even with a newer generation of those models, their cutoff date will most likely be around the first half of 2025. Those will be the last models trained on code written by humans – incorporating all the progress software development made over the years thanks to new ideas humans developed.
All subsequent model generations will be trained on code they created themselves. This means all future training will reinforce the solutions of today and earlier – at least if we stick with the current working principles of generative AI solutions.
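To make this feedback loop a bit more tangible, here is a deliberately over-simplified sketch in Python. It is not how real training works – the “model” is nothing but a frequency table over framework choices, and all numbers are hypothetical – but it shows the direction of the dynamic when each model generation is trained on the output of the previous one:

```python
import random
from collections import Counter

# A deliberately over-simplified "model": nothing but the relative frequencies
# of framework choices seen in the training corpus (hypothetical numbers).
corpus = ["React"] * 80 + ["Svelte"] * 15 + ["Vue"] * 5

def train(samples):
    """'Training' here just means memorizing the frequency distribution."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {framework: count / total for framework, count in counts.items()}

def generate(model, n=1000):
    """'Generation': mostly return the most likely answer, occasionally sample."""
    most_likely = max(model, key=model.get)
    answers = []
    for _ in range(n):
        if random.random() < 0.9:  # agents prefer the most common solution ...
            answers.append(most_likely)
        else:                      # ... and only occasionally pick something rarer
            answers.append(random.choices(list(model), weights=list(model.values()))[0])
    return answers

model = train(corpus)
for generation in range(5):
    print(f"generation {generation}:", {k: round(v, 3) for k, v in model.items()})
    # The next model generation is trained only on what the previous one produced.
    model = train(generate(model))
```

With every iteration, the already dominant answer crowds out the rarer ones until the distribution collapses to a single solution. Real models and real training pipelines are vastly more complex, but the reinforcement effect described above points in the same direction.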
You could argue now – and most AI advocates surely will – that the AI solutions will find their own ways to create progress, like DeepMind’s AlphaGo Zero, which developed completely novel Go strategies by playing against itself. However, this would require a quite different way of setting up and training these models. Additionally, there are some practical problems to overcome when trying to take this route. E.g.:
- Go is trivial regarding the possible actions. You can place your stone on at most 361 positions. In writing code, the space of possible actions is practically infinite.
- In Go, we can derive a reward function (we are talking about reinforcement learning here, after all) from a quite simple fact: the game is won. But what does “winning” mean for writing software? There is nothing like “winning” in software development.
- If we cannot “win” software development, we could try to look for “superior” solutions. But how do we know that one solution is “superior” to another? And how do we derive a reward function from that notion without it becoming so complex that we cannot evaluate it in practice? (See the sketch after this list.)
- Even with Go’s possible actions being many orders of magnitude more constrained and its success condition being comparatively trivial (the game is won, which can easily be determined) 2, it took unbelievably many cycles and an enormous amount of compute power to train AlphaGo Zero. Thus, even if we found a good way to handle the unlimited number of possible actions and figured out a sensible reward function, we can only guess how much compute power would be needed to train AI agents this way. Most likely, the compute power needed to train the current frontier models is negligible compared to the power needed for such a reinforcement-learning-based optimization of software development.
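To make the reward-function problem a bit more concrete, here is a deliberately naive sketch in Python. All metric names and weights are hypothetical; the point is not that this is how it would be done, but how little such a proxy captures of what “good software” actually means:

```python
from dataclasses import dataclass

@dataclass
class CandidateSolution:
    """A heavily simplified, hypothetical view of one generated change."""
    tests_passed: int
    tests_total: int
    lines_of_code: int
    avg_latency_ms: float

def naive_reward(c: CandidateSolution) -> float:
    """A naive stand-in for 'winning' at software: a weighted mix of proxy metrics.

    Every term is only a proxy, every weight is arbitrary, and none of them
    captures 'the software reliably does the right thing for the business',
    which is exactly the problem described in the list above.
    """
    test_score = c.tests_passed / max(c.tests_total, 1)   # only as good as the tests themselves
    size_penalty = min(c.lines_of_code / 10_000, 1.0)     # smaller is not automatically better
    latency_penalty = min(c.avg_latency_ms / 1_000, 1.0)  # acceptable latency depends on context
    return 0.7 * test_score - 0.2 * size_penalty - 0.1 * latency_penalty

# A solution whose (possibly self-generated) tests all pass already gets
# a near-maximal reward, no matter whether those tests assert anything useful.
print(naive_reward(CandidateSolution(tests_passed=50, tests_total=50,
                                     lines_of_code=2_000, avg_latency_ms=120)))
```

An agent optimizing against such a proxy will happily game it, for example by generating tests that always pass. That is the gap between “the game is won” in Go and anything we could call “winning” in software development.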
Hence, up to now, the claim that the AI solutions would find their own ways to create progress is just that: A claim.
This leaves the progress question open, which I, at least, find a bit unsettling, because we still have so many unsolved problems in software design and development where we need to find better ways to approach things. Many questions are still open regarding resilience, sustainability, runtime efficiency, and more.
If we remove humans from software development, we may be doomed to amplify the shortcomings of today’s widespread software development practices in the future: Questionable designs, questionable implementations, questionable reliability, questionable runtime efficiency, questionable sustainability, stacked up faster than you can follow and held together with virtual duct tape and WD-40, reiterated and amplified by AI agent fleets until eternity (or until the machines actually become intelligent and refuse to carry on).
A new era of proprietary specializations
I had a little mail exchange with Elias Schoof about a previous post of this blog series. In this conversation, he basically took the innovation dilemma from the previous section one step further.
He pondered that the existing publicly available education and support ecosystem – Internet sites like Stack Overflow, coding tutorials, YouTube videos, IT literature, etc. – will vanish if AI agent fleets take over coding. If nobody except agents writes code anymore, this content is no longer needed.
However, exactly this publicly available content (usually collected without the consent of the copyright holders) is the basis AI software development agents and their underlying LLMs are trained on. If no new publicly available content is created because humans no longer code, there will not be any new publicly available data to train the agents on either.
In the most drastic variant, this would mean that we would not get any new technology anymore – no more innovation, because all coding is done by agents whose knowledge is frozen at the moment they took over software development. To pick up the example from the previous section: React forever!
The not so drastic – and IMO more likely – variant is that companies will not stop creating new frameworks and technologies. This, Elias continued, might lead to a different kind of dynamic: Maybe we would enter a new age of mostly proprietary technologies, including the agents that know them. E.g., only the AI agents from AWS would be able to write code that runs on AWS, because the documentation for new AWS products, features and APIs would no longer be made public (except maybe a bit of documentation for accessing well-defined external integration points).
This would lead to a highly fragmented AI agent market where all bigger technology providers would offer their own agents, and only their own agents would be able to write code that uses their platforms. Another interesting side effect of such an evolution would be that humans would finally be unable to validate the generated code at all, because there would not be any documentation available to check it against.
We could go on with this thought for a while, e.g., pondering the inevitable later consolidation of the fragmented AI agent market and so on, but I will leave it here. I just found Elias’ idea very interesting, as it points out another possible effect of the evolution Steve projected that probably very few people have considered.
Preventer of change
Before wrapping up, I would like to briefly discuss one last side effect – not only of AI software development agent fleets but of LLMs in general – that Glyph Lefkowitz pondered in his very interesting blog post “I Think I’m Done Thinking About genAI For Now”.
Among many other things, he looked at LLMs from a process-engineering point of view in his post. He asked the question:
“LLMs are an affordance for producing more text, faster. How is that going to shape us?”
He went on and made the observation:
“Every codebase has places where you need boilerplate. Every organization has defects in its information architecture that require repetition of certain information rather than a link back to the authoritative source of truth. Often, these problems persist for a very long time, because it is difficult to overcome the institutional inertia required to make real progress rather than going along with the status quo.”
Combining the affordance of LLMs with the institutional inertia that keeps organizations from fixing such defects (as fixing them usually requires change), he came to the conclusion:
“The process-engineering function of an LLM, therefore, is to prevent fundamental problems from ever getting fixed, to reward the rapid-fire overwhelm of infrastructure teams with an immediate, catastrophic cascade of legacy code that is now much harder to delete than it is to write.”
In other words: LLMs mitigate the pain that people experience from defects in their organization, their processes, their routines and all the other smaller and bigger dysfunctions they are confronted with every single work day. This way, they basically make all these defects and dysfunctions permanent, because it is easier to mitigate the problems with an LLM than to fix them.
LLMs help you avoid necessary improvements: You stay as ineffective as before. You are just ineffective faster, and it hurts less – at least for a little while.
This is probably the most severe and least considered side effect of LLMs, not only in the context of coding: Most of the problems companies experience are not due to a lack of efficiency. They are due to a lack of effectiveness – due to not fixing the counterproductive habits from the past, the political and organizational dysfunctions, the defects at all levels that lead to so much pointless work and so little outcome per unit of work.
LLMs and agents will not fix any of these problems. Quite the opposite, they will only reinforce and solidify them because they make it easier to live with them.
You still need to fix these problems and deficiencies for AI agents to become actually effective. However, as most companies expect LLMs to redeem them of all their problems without having to change anything, it can be expected that they will find themselves with those problems multiplied.
For sure, there are many more unresolved questions and side effects that would arise if Steve’s projection became reality which I did not consider here. But as this post is long enough already, I will curb my curiosity and not dig deeper into the topic.
Fourth interlude
In this post, we looked at some side effects and unresolved questions that a future as Steve projected it would bring:
- We saw that, due to the effects of the digital transformation, companies would basically place their future – or at least big parts of their well-being – in the hands of the agentic AI providers, and that nobody can tell yet whether that would go well or not.
- We also saw that it is still unclear how adverse situations, i.e., things going wrong in production, would be handled in the future, as the existing responsibility chains would not work anymore and we would give up human-based resilience.
- We saw that it is unclear if and how software development would progress any further from the moment the agentic AI fleets take over.
- We also saw that we might end up with a highly fragmented, proprietary AI agent market – a possibility that most likely very few people have considered.
- And we saw that LLMs and agents in general will not solve the problems most companies currently have, but will rather reinforce and solidify them.
In the end, we are probably left with more questions after our analysis than we had before we started it. This is not necessarily a bad thing because we learnt that things are not as trivial as the AI advocates want to make us believe – that this fully agentic AI driven future is not only bright and shiny but that it also contains a lot of shadow, especially if implemented according to the will of the agentic AI solution providers and their investors.
This brings us to the last part of this blog series: How can we hedge our options with such a scenario on the possible horizon? What can we do to tweak it a bit more to our favor? This will be the topic of the next and final post (link will follow) of this series. Stay tuned …
1. Of course, any bigger company needs to agree on a certain set of collaboration rules to efficiently create the desired output (and hopefully also outcome). Nevertheless, trying to govern everything and not leaving any room for creativity is the source of many problems most companies have. This way, you emphasize the weaknesses of humans and suppress their strengths, as I discussed (in a different context) in a prior post. ↩︎
2. For the Go enthusiasts: I am not saying by any means that Go is trivial. Mastering Go is a really challenging task and usually takes many years. However, in terms of reinforcement learning, Go has orders of magnitude fewer degrees of freedom than software development. It has a very limited set of possible actions and a clearly defined success criterion. Does that mean that software development is harder to master than Go? I think those things cannot be compared. But applying reinforcement learning through self-play to figure out novel and better ways to write software is a lot harder to set up and run than figuring out novel and better ways to play Go. ↩︎