How to ensure the safety of modern AI agents and multi-agent systems
- AI agents – tools that automate complex tasks – are having a resurgence due to the advent of large language models (LLMs).
- The development and deployment of more complex AI systems requires better safety and governance processes.
- The World Economic Forum and Capgemini have published a new white paper, Navigating the AI Frontier: A Primer on the Evolution and Impact of AI Agents, which explores the capabilities and implications of these smart assistants.
The concept of artificial intelligence (AI) agents was born in the 1990s, when they were conceived as intelligent software entities operating autonomously on behalf of users and navigating environments like the internet. At the time, the web was quite simple: text, sparsely interspersed with images, connected by hyperlinks. There were no decent search engines, so having AI agents browse on a user's behalf seemed useful, and the state of AI was such that there was hope for agents only in such simplified environments, as opposed to the complex real world of humans.
With the advent of large language models (LLMs) and their power of natural language understanding and reasoning, there has been a resurgence of interest in AI agents. Companies such as Salesforce and ServiceNow are starting to announce specialized agents that represent their applications. For example, Salesforce is using agents to automate CRM by automating tasks, analysing data and personalizing customer interactions. These modern AI agents use LLMs to understand and respond to natural language, and to decide when and how to call tools, such as web search, code execution, API calls and data retrieval commands.
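To make this concrete, here is a minimal sketch of the tool-calling loop such agents run. The tool names, the stubbed llm_decide function and its return format are hypothetical stand-ins for a real LLM function-calling API, not any particular vendor's interface.

```python
from typing import Callable

# Hypothetical tool registry: each tool is an ordinary function the agent may call.
TOOLS: dict[str, Callable[..., str]] = {
    "web_search": lambda query: f"Top results for '{query}'",
    "run_code": lambda source: "code executed in a sandbox container",
}

def llm_decide(user_message: str) -> dict:
    """Stand-in for an LLM call. A real model would return either a final
    answer or a structured tool request (e.g. via function calling)."""
    if "weather" in user_message.lower():
        return {"tool": "web_search", "args": {"query": "weather today"}}
    return {"answer": "No tool needed; answering directly."}

def agent_step(user_message: str) -> str:
    decision = llm_decide(user_message)
    if "tool" in decision:
        # The LLM chose when and how to call a tool; we execute it here.
        result = TOOLS[decision["tool"]](**decision["args"])
        # A full agent loop would feed the result back to the LLM to continue.
        return result
    return decision["answer"]

print(agent_step("What is the weather like?"))  # -> "Top results for 'weather today'"
```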
Safety and governance processes for AI agents
Given their role, a level of autonomy is expected of an agent as it interacts with the world using its tools. In some cases, like running code in a container, the consequences can be controlled and contained. In others, a level of scrutiny and care needs to be applied to avoid undesirable consequences of an agent’s autonomous behaviour. The fact that the “brain” of an agent is an inherently opaque neural network (i.e. an LLM) makes this more complex. LLM-based agents are also prone to hallucinations and to misunderstanding inherently ambiguous natural-language communications, so what we gain in robustness we can lose in consistency.
Several mitigation strategies can reduce these risks, including implementing rules that override or require human approval for certain agent decisions, assessing the uncertainty of agent behaviour, and pairing agents with safeguard agents that monitor them and prevent potential harm.
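As an illustration, here is a minimal sketch combining the first and third of those strategies: a rule-based safeguard check with a human-approval gate for anything it flags. The tool names and risk rules are hypothetical assumptions for the example.

```python
HIGH_RISK_TOOLS = {"send_payment", "delete_records"}  # hypothetical examples

def safeguard_check(tool_name: str, args: dict) -> bool:
    """Rule-based safeguard. In practice this could itself be a safeguard
    agent: an LLM-based monitor scoring the proposed action for harm."""
    return tool_name not in HIGH_RISK_TOOLS

def execute_with_oversight(tool_name: str, args: dict) -> str:
    if safeguard_check(tool_name, args):
        return f"auto-executed {tool_name}({args})"
    # Escalate: block autonomous execution and ask a human to approve.
    answer = input(f"Agent wants to run {tool_name}({args}). Approve? [y/N] ")
    if answer.strip().lower() == "y":
        return f"executed {tool_name}({args}) with human approval"
    return "action blocked by human overseer"

print(execute_with_oversight("web_search", {"query": "quarterly results"}))
print(execute_with_oversight("send_payment", {"amount": 100}))
```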
Given this required oversight, we need a testing regime for agent-based systems that differs from what we are used to in traditional software. The good news, however, is that we already know how to test such systems: we have been operating human-driven organizations and workflows since the dawn of industrialization.
The current state of the art in generative AI prevents a single agent from effectively following a large body of complex instructions and carrying out many varied and complex tasks. Given context-window restrictions, as well as limitations in the reasoning power of LLMs, it is often more effective to break responsibilities down across multiple AI agents that connect and coordinate to get work done.
Just as we would not assign a single software engineer the task of writing a fully fledged CRM system, asking one AI agent to write the code for a full CRM system is a tall order. A team of agents representing different responsibilities, such as project management, front-end and back-end engineering, and quality assurance, working together is far more likely to get the job done successfully.
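A minimal sketch of that decomposition might look as follows. The roles, prompts and the stubbed work method (standing in for a scoped LLM call) are illustrative assumptions, not a prescribed framework.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    role: str
    system_prompt: str  # the agent's responsibilities, in natural language

    def work(self, task: str) -> str:
        # Stand-in for an LLM call constrained by this agent's system prompt.
        return f"[{self.role}] handled: {task}"

# Hypothetical team mirroring a software organization's division of labour.
team = [
    Agent("project_manager", "Break the feature into tickets and track them."),
    Agent("backend_engineer", "Implement the APIs and data models."),
    Agent("frontend_engineer", "Build the user interface."),
    Agent("qa_engineer", "Write and run tests against the result."),
]

work_product = "Build a contact-management feature"
for agent in team:
    work_product = agent.work(work_product)  # each role transforms the output
print(work_product)
```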
The role of multi-agent AI
The field of multi-agent AI calls for agents that can communicate and coordinate. This can happen within a team of agents, which implies collaboration, or across teams of agents with varying levels of trust and even adversarial relationships.
A multi-agent system representing a piece of software or an organization’s various workflows can have several interesting advantages, including improved productivity, operational resilience, greater robustness and the ability to upgrade individual modules faster. Businesses are rapidly adopting single-agent solutions, and multi-agent systems seem to be an inevitable, and quite disruptive, future state.
LLMs understand natural language, so, out of the box, we have a universal protocol for inter-agent communication. Because natural language can flexibly express intent, and LLM-based agents can map that intent onto the way their tools are called, the software systems of the future will be much less brittle than the ones we have today: the agent responsible for a piece of functionality takes care of mapping intent to specific API calls, so changing the API, or swapping in different software with similar capabilities, becomes much less of a hassle. This will give organizations more flexibility in upgrades and in their use of third-party services.
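The sketch below illustrates the idea under stated assumptions: two hypothetical CRM backends with incompatible APIs, and an agent that owns the mapping from a caller's intent to whichever concrete call the current backend requires.

```python
class CrmBackendA:
    """Hypothetical third-party CRM API."""
    def create_contact(self, name: str, email: str) -> str:
        return f"backend A created contact {name}"

class CrmBackendB:
    """A different hypothetical vendor with a different API shape."""
    def add_person(self, full_name: str, mail_address: str) -> str:
        return f"backend B added person {full_name}"

def contact_agent(intent: dict, backend) -> str:
    """The agent encapsulates the intent-to-API mapping, so swapping
    backends changes only this function, never the callers."""
    name, email = intent["name"], intent["email"]
    if isinstance(backend, CrmBackendA):
        return backend.create_contact(name, email)
    return backend.add_person(name, email)

# Callers express intent in one stable form, whatever the backend.
intent = {"action": "create_contact", "name": "Ada", "email": "ada@example.com"}
print(contact_agent(intent, CrmBackendA()))
print(contact_agent(intent, CrmBackendB()))
```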
If inter-agent communication and coordination are done in a manner that guarantees the encapsulation of responsibilities, future enterprise applications will be much more robust: new capabilities and functionality can be introduced to a pre-existing network of agents with little to no reengineering of the system that is already there.
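One way to picture this encapsulation is intent-based routing, sketched below under assumptions of my own: agents register the intents they handle, so a new agent plugs into the existing network without changes to existing code. The agent class and intent names are hypothetical.

```python
class SummaryAgent:
    """Hypothetical agent exposing one capability behind an intent name."""
    def handle(self, payload: dict) -> str:
        return f"summary of {payload['doc']}"

ROUTES = {}  # intent name -> the agent responsible for it

def register(intent: str, agent) -> None:
    ROUTES[intent] = agent

def dispatch(intent: str, payload: dict) -> str:
    # Callers know only intent names, never agent internals, so adding a
    # capability is just another register() call, not a reengineering job.
    return ROUTES[intent].handle(payload)

register("summarize_document", SummaryAgent())
print(dispatch("summarize_document", {"doc": "Q3 report"}))
```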
To augment an organization using agents, we should first capture the processes, roles, responsible nodes and connections of the various actors in the organization. By actors, I mean the individuals and/or software applications that act as knowledge workers within the organization.
In addition to the set of roles and responsibilities defined in natural language in each agent’s system prompt, agents may be equipped with tools they can call, passing various arguments to those tools. For instance, a product-manager agent may need to process tickets on a virtual kanban board, and an alerts agent may need to call a tool that issues alerts in an alerting system. This means that interoperability of third-party agents will be a necessity, and standards will have to emerge that agents must abide by for organizations to be able to plug them into their agent networks.
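Such a standard might pin down tool declarations along these lines. The sketch below is loosely modelled on the JSON-Schema style used by several LLM function-calling APIs; the move_ticket tool for the kanban example, and the risk_level field, are hypothetical.

```python
# Hypothetical declaration of the kanban tool a product-manager agent might
# carry. An interoperability standard would fix the shape of such records:
# names, argument schemas and, ideally, safety metadata such as risk level.
MOVE_TICKET_TOOL = {
    "name": "move_ticket",
    "description": "Move a ticket between columns on the kanban board.",
    "risk_level": "low",  # assumed field; usable by oversight rules like those above
    "parameters": {
        "type": "object",
        "properties": {
            "ticket_id": {"type": "string"},
            "column": {"type": "string", "enum": ["todo", "doing", "done"]},
        },
        "required": ["ticket_id", "column"],
    },
}
```

Any third-party agent that publishes its tools in an agreed format like this could, in principle, be plugged into an organization's agent network and governed by the same oversight rules as in-house agents.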
Multi-agent systems representing an organization’s various roles and workflows have the potential to break silos and reveal the inefficiencies that accrue over time and tie businesses down. Observing the operations of a multi-agent process can help us remedy such issues.
The fabric of future organizations will consist of interconnected agents. For the foreseeable future, however, and unless a seismic shift occurs in culture and society, humans will remain the sole bearers of responsibility. This means that multi-agent systems should be controlled and designed with humans as the ultimate arbiters and approvers of higher-risk behaviour.