The Decay Paradox: Why AI Agents Get Worse as We Trust Them More
Agentic AI systems degrade through context rot, compounding errors, and model drift — but human oversight erodes in lockstep. The widening gap between actual reliability and perceived reliability is the defining engineering challenge of autonomous systems.

Every agentic system is decaying from the moment you ship it. Not a bug. Not a failure of engineering. A law of complex adaptive systems: the second law of thermodynamics, applied to software. The interesting part is not the decay itself. Engineers who have run production systems know that everything degrades. The interesting part is that human trust moves in the opposite direction. As your agents get worse, the people overseeing them become more confident that they are working. Two curves diverging. The gap between them is where catastrophic failures live.
The four axes of decay
The decay operates on at least four axes simultaneously, and they compound each other.
Context rot is the most immediate. Chroma Research evaluated 18 state-of-the-art LLMs including GPT-4.1, Claude 4, and Gemini 2.5, and found that model performance degrades non-uniformly as context length increases, even on tasks as simple as text replication. A model with a 1M-token window may begin degrading at 50,000 tokens. The effective context window is far smaller than the advertised limit. Semantically similar distractors make it worse. In multi-agent systems, Agent A's degraded output enters Agent B's context as ground truth. Errors amplify at each hop. Garbage in, gospel out.
Compounding probability follows from Lusser's law: when independent components execute in sequence, overall system success is the product of individual success probabilities. A 20-step process at 95% per-step reliability succeeds only 36% of the time. Even at 99% per step, one in five 20-step attempts fails. Most multi-agent systems don't fail because the models are bad. They fail because we compose them as if probability doesn't compound. A February 2026 arXiv paper proposes separating capability from reliability entirely: a highly capable system can be unreliable, and a less capable system can be highly reliable within its operating envelope. Capability is not reliability. We keep conflating them.
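The arithmetic is trivial to sketch, which is part of why it gets ignored. A minimal calculation (in Python, since the article names no stack) makes the cliff visible:

```python
# Lusser's law: a sequential pipeline succeeds only if every step succeeds,
# so overall reliability is the product of per-step reliabilities.
def pipeline_success(per_step: float, steps: int) -> float:
    return per_step ** steps

print(f"{pipeline_success(0.95, 20):.0%}")  # 36%
print(f"{pipeline_success(0.99, 20):.0%}")  # 82%: even 99% fails roughly 1 in 5
```

Note the asymmetry: pushing per-step reliability from 95% to 99% more than doubles end-to-end success, which is why reliability work at the step level dominates almost any architectural cleverness downstream.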
Tool calling is the mechanism by which agents interact with external systems, and it fails 3–15% of the time in production even in well-engineered systems. Roughly 91% of ML models degrade over time in production. Only 5% of custom enterprise AI tools reach production, per MIT's 2025 analysis. The pattern is consistent: agents demo well, pass pilot gates, earn trust, then quietly degrade.
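None of this argues against defensive plumbing. A hedged sketch of the standard mitigation — retry with jittered backoff plus output validation — where the `tool` and `validate` callables stand in for whatever your stack actually provides:

```python
import random
import time

def call_with_retry(tool, args, validate, max_attempts=3, base_delay=1.0):
    """Retry a flaky tool call with jittered exponential backoff, and treat
    an invalid result the same as a transport failure. `tool` and `validate`
    are placeholders for your stack's interfaces; this is a sketch, not an API."""
    for attempt in range(max_attempts):
        try:
            result = tool(args)
            if validate(result):  # never trust the output shape blindly
                return result
        except Exception:
            pass  # transport error: fall through to the retry
        # jittered exponential backoff before the next attempt
        time.sleep(base_delay * (2 ** attempt) * (0.5 + random.random()))
    raise RuntimeError(f"tool failed after {max_attempts} attempts")
```

The point of validating inside the retry loop is that a malformed-but-delivered response is still a failure; the 3–15% figure includes responses that arrive looking plausible.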
Skills erosion completes the loop. When agents handle functions entirely, the humans who could catch errors lose the ability to catch them. Use it or lose it, applied to cognitive labour.
The oversight paradox
Here is the paradox that makes this worth writing about rather than filing under systems engineering.
Automation bias is a systemic phenomenon documented in Springer's AI and Society journal. Humans over-rely on automated recommendations even when their own judgement is correct. In clinical settings, evaluators abandoned their own correct judgements in favour of erroneous AI advice in 6–11% of cases. People trust the machine over themselves, even when they are right and the machine is wrong.
It gets worse. Taylor & Francis research on trust calibration found that encountering system strengths first led to increased reliance and more errors. Early agent success actively degrades future oversight quality. The agent works well at first, so you stop checking as closely, so you miss it when it stops working well.
CIO.com describes the dynamic precisely: as agents succeed, humans relax. The agent becomes the default decision-maker. This is not automation. This is authority transfer.
Reliability declining. Trust increasing. Nobody is watching the gap, because the people who would watch it have been lulled by the early success.
California Management Review's principal-agent analysis of AI systems warns that without guided autonomy (explicit boundaries that grow deliberately) the delegation boundary expands by default as trust accumulates. IBM Research frames this as agentic drift, noting that degradation is typically noticed by end users before system owners.
Think about that. Your users find the problems before you do. Because you trusted the system and they are the ones getting wrong answers.
The economics reinforce the trap
McKinsey's November 2025 survey shows 62% of organisations experimenting with AI agents. But Gartner predicts over 40% of agentic AI projects will be cancelled by end of 2027 due to escalating costs, unclear value, or inadequate risk controls. S&P Global found that the share of companies abandoning most AI initiatives jumped from 17% to 42% year over year.
The pattern repeats: invest heavily, demo goes well, pilot looks promising, deploy to production, early results strong, monitoring relaxes, decay accumulates silently, failure surfaces in user complaints not dashboards.
What this means for product engineers
Agent maintenance is not an operational afterthought. It is the primary engineering challenge. Building the agent is the easy part. Keeping it reliable is the actual job.
O'Reilly's work on memory engineering for multi-agent systems argues that every technology of memory demands a technology of forgetting. Agents need explicit lifecycle policies for what to retain, what to summarise, what to discard. Context is state, not history. Implement pre-rot thresholds that trigger compaction well before the technical context limit.
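A minimal sketch of such a policy, assuming message-list context and placeholder `count_tokens` / `summarise` helpers (neither is a real API; substitute your stack's tokenizer and summarisation call):

```python
# Compact well before the model's advertised limit, since effective
# capacity is far smaller. Thresholds here are illustrative assumptions.
ADVERTISED_LIMIT = 1_000_000
PRE_ROT_THRESHOLD = 50_000  # where degradation may begin in practice

def maybe_compact(messages, count_tokens, summarise, keep_recent=10):
    """If context exceeds the pre-rot threshold, fold everything but the
    most recent turns into a single synthetic summary message."""
    total = sum(count_tokens(m) for m in messages)
    if total <= PRE_ROT_THRESHOLD or len(messages) <= keep_recent:
        return messages
    head, tail = messages[:-keep_recent], messages[-keep_recent:]
    return [{"role": "system", "content": summarise(head)}] + tail
```

The design choice worth noting: the trigger is a pre-rot threshold, not the hard limit. By the time you hit the advertised window, degradation has already been feeding downstream agents for thousands of tokens.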
The arXiv reliability paper recommends shifting evaluation from "how often does the agent succeed?" to "how predictably, consistently, robustly, and safely does it behave?" Frequency of success is a terrible metric. Consistency of behaviour is what matters.
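One hedged way to operationalise that shift is to score repeatability rather than success. The sketch below treats any callable returning a comparable outcome as the agent under test — an assumption for illustration, not a standard harness:

```python
from collections import Counter

def consistency_score(agent, task, runs=10):
    """Consistency over frequency: how often does the agent produce its own
    modal behaviour? 1.0 means perfectly repeatable. A high success rate
    paired with a low score means the agent is right for unstable reasons."""
    outcomes = [agent(task) for _ in range(runs)]
    modal_count = Counter(outcomes).most_common(1)[0][1]
    return modal_count / runs
```

An agent that succeeds 80% of the time via one stable path is a different engineering object from one that succeeds 80% of the time via eight different paths, even though a success-rate dashboard cannot tell them apart.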
I keep coming back to the idea of an entropy budget. Every agent deployment requires continuous energy investment in monitoring, evaluation, and correction proportional to the agent's autonomy and the irreversibility of its actions. More autonomy, more irreversible actions: more energy required to maintain order. Without that investment, the second law applies. All agent systems tend toward disorder.
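To make the proportionality concrete, here is a deliberately toy model. The functional form and constants are illustrative assumptions, not empirical findings:

```python
def monitoring_budget(autonomy: float, irreversibility: float,
                      base_hours: float = 2.0, scale: float = 10.0) -> float:
    """Toy entropy budget: weekly human-review hours grow with both the
    agent's autonomy (0..1) and the irreversibility of its actions (0..1).
    The coefficients are placeholders chosen for illustration only."""
    return base_hours + scale * autonomy * irreversibility
```

Under this toy model, a read-only research assistant needs little beyond the baseline, while a fully autonomous agent taking irreversible actions needs several times as much; the exact numbers matter less than the fact that the budget is a product, not a sum, of the two risk factors.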
The cost you are not accounting for
We built these systems to reduce cognitive load. To take work off our plates. But the work doesn't disappear. It transforms. Instead of doing the task yourself, you now have to monitor something doing the task for you. And monitoring is harder than doing, because monitoring requires you to maintain competence in a skill you are no longer practising.
The Coasean framing sharpens this. The transaction cost of delegation to agents is not the cost of the agent. It is the cost of maintaining sufficient oversight to catch failures before they compound. Most organisations are not accounting for this cost. They see the agent cost and think that is the price. The real price is the monitoring infrastructure that ensures the agent remains trustworthy. That cost scales with autonomy.
The question then becomes whether we can build systems that monitor their own decay. Meta-agents watching agents. But Lusser's law applies to the monitoring stack too. Who watches the watchers, and what is their reliability score?
Maybe the answer is not more monitoring but less autonomy. Bounded agents with narrow scope and hard limits that force human checkpoints at high-stakes decision points. Guided autonomy, not progressive automation. The boring answer. The one nobody wants to fund because it doesn't make a good demo.
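In code, guided autonomy is as unglamorous as it sounds: an allowlist and a hard stop. The action names and approval hook below are illustrative, not any real framework's API:

```python
# Actions the agent may take without a human in the loop (assumed names).
AUTONOMOUS_ACTIONS = {"read_record", "draft_reply", "search_docs"}

def execute(action, payload, run_action, request_approval):
    """Route allowlisted actions straight through; everything else — high-
    stakes or simply unknown — blocks on an explicit human checkpoint."""
    if action in AUTONOMOUS_ACTIONS:
        return run_action(action, payload)
    if request_approval(action, payload):
        return run_action(action, payload)
    raise PermissionError(f"human checkpoint rejected: {action}")
```

The default matters: anything not explicitly granted requires approval, so the delegation boundary can only grow by deliberate edits to the allowlist, never by accumulated trust.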
The question is not whether your agents will decay, but whether you will notice before your users do.