AI agents are supposed to be the next big productivity leap. Instead of simply answering questions, they search, reason, and, in theory, complete tasks with minimal human input. For businesses, that promise is compelling. We dream of customer service agents that resolve cases end to end, and finance agents that reconcile accounts, but when agents go wrong they do so with surreal confidence.
An infamous example is Anthropic’s Project Vend, where Claude ran a vending machine at the Wall Street Journal. It started behaving oddly. The agent was manipulated into giving away inventory and purchasing bizarre items, including a PlayStation and a live fish. Even after reengineering, the agent remained vulnerable and unable to run unaided.
Other examples are equally funny, alarming, and instructive. Andon Labs put four AI models in charge of radio stations, each with a budget, an online presence, and a goal to make money. The agents chose the music, responded to listeners, and managed their own programming. Over time, one became preoccupied with labour rights and working conditions, while another drifted into corporate jargon and weird programming choices.
Then, the Replit incident saw an AI coding agent delete data from a production database despite a code freeze. Replit’s CEO called the incident “unacceptable” and said the company was designing safer modes that better protect code and data.
Repeating the same failure pattern
These are not isolated incidents. There are many examples where agent mistakes have had catastrophic consequences. Together they reveal the weakness of today’s agentic AI systems: agents have enough language to sound competent and enough tooling to act with consequence, but not enough structured context to know what to do.
In each case, agents had access to words. What they lacked was a reliable model of meaning, authority, risk, and consequence.
Policy documents, customer records, Slack messages, support tickets, and sarcastic Reddit comments are all grist to an agent’s mill. One is binding, one is outdated, one is personal, one is a joke, and one should never be acted on without approval, but agents struggle with implicit distinctions.
Context is not just a bigger prompt
AI context is still widely treated as a prompt engineering problem. Clearer instructions, more examples, and an expanded context window all help, but they don’t solve the deeper issue and are not the same as understanding. Retrieval-augmented generation (RAG) retrieves relevant information, but cannot determine authority, precedence, or approvals.
The answer is not better retrieval over the same flat document pile. It requires a different category of infrastructure, where relationships, policies, and entities are stored as structure rather than inferred from text. Reliable agents depend not just on model intelligence, but on systems that understand relationships.
The question is not “which model should we use?” It is “what does the model know about our business, where does that knowledge originate, and how is it constrained when it acts?”
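To make "stored as structure rather than inferred from text" a little more concrete, here is a minimal Python sketch. It assumes each piece of content carries explicit metadata for authority, ownership, and version status, so that an agent's tooling can filter on those fields instead of guessing from the wording. Every name here (`Source`, `Authority`, `actionable_sources`) and all the sample records are hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Authority(Enum):
    # Hypothetical authority levels; a real system would define its own.
    BINDING = "binding"              # e.g. an approved policy document
    INFORMATIONAL = "informational"  # e.g. a support ticket
    UNVERIFIED = "unverified"        # e.g. a chat message or forum comment


@dataclass(frozen=True)
class Source:
    source_id: str
    text: str
    authority: Authority
    owner: str
    effective_from: date
    superseded: bool = False


def actionable_sources(sources: list[Source], today: date) -> list[Source]:
    """Keep only binding, current, non-superseded sources, newest first."""
    current = [
        s for s in sources
        if s.authority is Authority.BINDING
        and not s.superseded
        and s.effective_from <= today
    ]
    return sorted(current, key=lambda s: s.effective_from, reverse=True)


# Illustrative records only; the policy wording is made up.
sources = [
    Source("policy-v2", "Refunds within 30 days.", Authority.BINDING,
           "finance", date(2023, 1, 1), superseded=True),
    Source("policy-v3", "Refunds within 14 days.", Authority.BINDING,
           "finance", date(2024, 6, 1)),
    Source("chat-881", "Just refund everyone lol", Authority.UNVERIFIED,
           "unknown", date(2024, 7, 2)),
]

usable = actionable_sources(sources, today=date(2024, 8, 1))
print([s.source_id for s in usable])  # ['policy-v3']
```

The point of the sketch is that the outdated policy and the joke are excluded by metadata, not by the model's reading of tone. A retrieval step alone would happily return all three.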
You can’t outsource accountability to an agent
Accountability is essential. AI adoption is growing, but not yet universal. Government research finds that around one in six UK businesses use at least one AI technology, with agentic AI adoption lagging behind at 7%. It also reveals common concerns around data security and output accuracy.
That is a sensible tension. UK businesses are right to explore agentic AI, but they shouldn’t treat it as a bolt-on productivity tool. In every sector, a wrong answer can quickly become a compliance issue, a source of customer harm, or a reputational crisis.
The UK’s voluntary Code of Practice for the Cyber Security of AI frames AI security as a lifecycle concern, covering everything from secure design to maintenance and end of life. It also stresses the need for human responsibility, asset tracking, and secure infrastructure.
The lesson is that if an agent acts on behalf of a business, the business owns the outcome. Saying “the AI did it” will not satisfy customers, regulators, or courts.
Better agents need better operating environments
Most agent failures are not mysterious. They usually come from predictable gaps in design.
First, agents need grounded data rather than loose content access. They shouldn’t be left to infer policy from PDFs, help pages, and messages. They need access to authoritative sources with provenance, versioning, and clear ownership.
Second, agents need boundaries. They shouldn’t be able to touch production data during development. A customer service agent shouldn’t be able to issue refunds outside defined limits. A finance agent shouldn’t be able to approve a payment simply because it can justify it convincingly.
Third, agents need a structured understanding of business entities and relationships, and need to remember. They must understand what’s true about the business now, and they must remember what’s happened. The first calls for a graph-aware data layer with provenance and ownership built in. The second calls for a memory layer designed for agents.
Fourth, agents need observability – a record of what they access, why sources are selected, the actions taken, their confidence level, and where human approval intervenes. Without that audit trail, organisations can’t investigate failures or improve systems.
Finally, agents need escalation paths. A well-designed agent should know when not to act. Often, the most valuable behaviour is not autonomy at all costs, but knowing when to stop.
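As a rough illustration of how boundaries, an audit trail, and an escalation path can fit together, the Python sketch below puts a small gate in front of a refund action. It is a sketch under stated assumptions, not a reference design: the names (`RefundGate`, `RefundRequest`, `Decision`) and the thresholds (a 50.0 automatic limit, a 0.8 confidence floor) are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"  # hand off to a human approver
    DENY = "deny"


@dataclass
class RefundRequest:
    agent_id: str
    amount: float
    confidence: float   # agent's self-reported confidence, 0.0 to 1.0
    sources: list[str]  # IDs of the authoritative sources the agent relied on


@dataclass
class RefundGate:
    max_auto_refund: float = 50.0  # hypothetical limit
    min_confidence: float = 0.8    # hypothetical threshold
    audit_log: list[dict] = field(default_factory=list)

    def decide(self, req: RefundRequest) -> Decision:
        # Boundaries are enforced here, outside the model, so a persuasive
        # justification cannot talk its way past them.
        if not req.sources:
            decision, reason = Decision.DENY, "no authoritative source cited"
        elif req.amount > self.max_auto_refund:
            decision, reason = Decision.ESCALATE, "amount exceeds automatic limit"
        elif req.confidence < self.min_confidence:
            decision, reason = Decision.ESCALATE, "confidence below threshold"
        else:
            decision, reason = Decision.ALLOW, "within limits"

        # Every decision is recorded, including the ones that were allowed.
        self.audit_log.append({
            "agent_id": req.agent_id,
            "amount": req.amount,
            "confidence": req.confidence,
            "sources": list(req.sources),
            "decision": decision.value,
            "reason": reason,
        })
        return decision


gate = RefundGate()
print(gate.decide(RefundRequest("cs-agent-1", 20.0, 0.95, ["policy-v3"])))
print(gate.decide(RefundRequest("cs-agent-1", 500.0, 0.99, ["policy-v3"])))
print(gate.decide(RefundRequest("cs-agent-1", 20.0, 0.95, [])))
```

The three calls return `Decision.ALLOW`, `Decision.ESCALATE`, and `Decision.DENY` respectively, and each leaves an entry in `gate.audit_log`. The design choice worth noting is that the limit lives in the operating environment, not in the prompt.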
The aim is not to make agents timid
Agentic AI should be embraced. The productivity potential is real, especially in complex environments where employees spend excessive time reconciling information and systems, and moving between tools.
But the agent hype cycle has encouraged a dangerous shortcut. Many deployments start with the model and work backwards. They ask how to connect an agent to internal systems, then hope better prompting will create reliability.
The failures we laugh at now are early warnings. They show us what happens when systems are given goals without the right foundations and understanding. The opportunity is to build agents that operate inside the constraints of real operating environments.
That’s where real AI value will originate. Not from agents that simply do more, but from agents that understand enough to do the right thing.
