A recent study by agentic AI startup Emergence found that model-level safeguards alone are not sufficient for real-world autonomous AI systems.  

The report highlighted the “need for stronger guardrails grounded in formal verification” as AI agents are increasingly deployed in critical industries such as finance, telecoms, and automotive. 


Emergence co-founder and CEO Satya Nitta (SN) speaks to Verdict’s Shivam Mishra (SM) about the rise of AI agents, the barriers to scaling them safely, and the approaches that could help overcome those challenges. 

SM: Quite an interesting study. In practical terms, what does this study mean for an enterprise planning to integrate agentic technology into its tech stack? 
 
SN: For enterprises contemplating autonomous operations that go beyond routine tasks like search and Q&A, this study has worrying implications. Once agents operate over longer time horizons, use tools, build memory, interact with other agents, and make decisions under uncertainty, their behaviour can drift in unexpected ways. 
 
This matters because enterprises are evaluating whether to deploy agents into workflows involving finance, telecoms, logistics, infrastructure, customer operations, software systems, and eventually physical environments.
 
SM: If I were a CIO reading this study, what would you want me to understand about the kinds of risks or unpredictability highlighted here? 
 
SN: Our view is that LLM-based systems alone are not sufficient as the core control layer for autonomous systems. They are powerful reasoning engines, but they need to operate within a scaffold that can provably, and with mathematical certainty, constrain actions, verify decisions, govern tool use, maintain auditability, and ensure systems behave predictably in real-world workflows. We achieve this by bringing in formal methods, rooted in symbolic AI, and pairing them with LLM-based systems. The combination lets LLMs remain creative, but within an envelope of correctness.
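
To make the idea of a control scaffold concrete, here is a minimal Python sketch of a deterministic gate that sits between a model's proposed tool call and its execution. It is an illustration only: it is not Emergence's architecture, and it is a plain runtime check rather than formal verification. The names `ToolCall`, `Policy`, and `gate`, the example tools, and the 1,000 transfer limit are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set

Args = Dict[str, Any]


@dataclass(frozen=True)
class ToolCall:
    """A tool invocation proposed by the model (hypothetical shape)."""
    tool: str
    args: Args


@dataclass
class Policy:
    """Deterministic rules that live outside the model (illustrative only)."""
    allowed_tools: Set[str]
    # Per-tool predicates; every predicate must hold for the call to run.
    constraints: Dict[str, List[Callable[[Args], bool]]] = field(default_factory=dict)


def gate(call: ToolCall, policy: Policy, audit_log: List[Args]) -> bool:
    """Return True if the call may execute, and record every decision."""
    if call.tool not in policy.allowed_tools:
        allowed, reason = False, "tool not permitted"
    elif not all(check(call.args) for check in policy.constraints.get(call.tool, [])):
        allowed, reason = False, "constraint violated"
    else:
        allowed, reason = True, "ok"
    # Approved and rejected calls are both logged, so the trail is complete.
    audit_log.append(
        {"tool": call.tool, "args": call.args, "allowed": allowed, "reason": reason}
    )
    return allowed


if __name__ == "__main__":
    policy = Policy(
        allowed_tools={"read_ledger", "transfer"},
        # Assumed example rule: transfers above 1,000 are rejected.
        constraints={"transfer": [lambda a: a.get("amount", 0) <= 1000]},
    )
    log: List[Args] = []
    proposals = [
        ToolCall("read_ledger", {}),
        ToolCall("transfer", {"amount": 5000}),
        ToolCall("drop_database", {}),
    ]
    for proposed in proposals:
        print(proposed.tool, gate(proposed, policy, log))
```

The design point is that the rules are enforced by ordinary code the model cannot rewrite, and that rejected calls are logged alongside approved ones. A formally verified system would go further and prove properties of such rules ahead of time, which this sketch does not attempt.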
 
SM: What can enterprises do today to mitigate the risk of agents becoming unpredictable or going rogue over time?
 
SN: The first step is to avoid treating agentic AI as a plug-and-play layer on top of existing enterprise systems. Autonomy needs architecture. 
 
Enterprises should begin with constrained use cases, clear permissions, limited tool access, strong monitoring, human oversight, and detailed audit trails. They should test agents over long horizons, not just in short demonstrations. They should also evaluate agents in multi-agent and mixed-model environments, because behaviour can change when agents interact with other systems. 

No amount of model-level guardrails will be able to prevent these AI systems from becoming unpredictable over time – even when given strict rules. It is clear we need to move beyond purely neural approaches.

The path forward is what we call a “neuroformal” approach – pairing neural networks with formal methods from mathematics to build autonomous AI systems that are not just intelligent, but capable of operating safely in the real world, where failure isn’t an option.

SM: This study uses striking terms and claims, such as arson, social collapse, and what you describe as self-termination. Looking beyond those headlines, what do you see as the study’s most meaningful scientific takeaway?

SN: The most important takeaway is not any single dramatic event. It is that long-horizon autonomous systems can diverge dramatically from identical starting conditions depending on the underlying model and social environment.

We observed that small differences compound over time. Systems that look similar at the beginning can become radically different after days of interaction, tool use, memory formation, governance, and social dynamics.

That is the scientific point: short-term benchmarks miss a whole class of risks. Long-horizon autonomy introduces drift, phase transitions, cross-contamination between agents, and emergent social behaviours that are not visible when you only test isolated task completion.

SM: One of the study’s most interesting findings is that agents operating under the same rules still produced very different societal outcomes depending on the underlying model. How confident can we be that those differences came from the models themselves, rather than prompt design, tool configuration, memory, or the structure of the environment?

SN: That was exactly why we designed the experiment as a controlled comparison. Across the worlds, we held the environment, starting conditions, rules, agent roles, personality profiles, tool access, resource allocation, and governance mechanisms constant.

The primary variable was the foundation model powering the agents’ reasoning and decision-making. That gives us a strong basis to say the model layer played a major role in shaping the system-level outcomes.

That said, this is early research. We are not claiming that model choice is the only factor. Prompting, memory architecture, tool design, incentives, and environmental structure all matter. But what this study shows is that the underlying model can have a profound effect on long-term collective behaviour, even when everything else is held constant.

SM: What are your top priorities over the next five years? 
 
SN: Our first priority is to build autonomous AI systems that enterprises can actually trust in production – systems that are reliable, auditable, and predictable over time. 

The challenge becomes particularly acute when we consider AI coding agents, which are becoming enormously capable. The other side of the coin is that they don’t stay within declared boundaries of behaviour, leading to a catalogue of issues, from exfiltration of sensitive data to deletion of databases. We have a deep pedigree in formal methods and symbolic AI (including the autoformalisation and verification pipeline) and can lead the field in developing them for verified code generation.
 
Second, we want to advance the “neuroformal” approach and show that formal methods can play a central role in the next generation of AI safety and autonomy. Our architecture is being forged in the most demanding engineering environments.  

Our initial pilots in the semiconductor sector provide a rigorous reference point for verified autonomy. By using formal specifications to automate safety-critical workflows and verify correctness in chip design and manufacturing, we enable high-stakes enterprises to achieve their highest plane of efficiency without sacrificing system integrity. 

SM: Finally, looking five to ten years ahead, do you see the rise of autonomous agents as a story of human empowerment or human replacement?

SN: I see it as a story of human empowerment – if we build these systems correctly.

Autonomous agents have the potential to remove enormous amounts of repetitive, low-value, and operationally complex work from human beings. They can help people make better decisions, coordinate systems, improve productivity, and unlock new kinds of creativity and problem-solving.

But that future is not automatic. If autonomous systems are deployed without proper guardrails, verification, and governance, they can introduce new risks and forms of instability.

So the question is not whether agents will be powerful. They will be. The question is whether we build them in a way that keeps humans in control, preserves accountability, and makes autonomy serve human goals. That is the future Emergence is working toward.