
AI companies offer a vision of AI agents handling ever more complicated tasks for you. VichanChairat/DigitalVision Vectors via Getty Images
Interacting with AI chatbots like ChatGPT can be fun and sometimes useful, but the next level of everyday AI goes beyond answering questions: AI agents carry out tasks for you.
Major technology companies, including OpenAI, Microsoft, Google and Salesforce, have recently released or announced plans to develop and release AI agents. They claim these innovations will bring newfound efficiency to technical and administrative processes underlying systems used in health care, robotics, gaming and other businesses.
Simple AI agents can be taught to reply to standard questions sent over email. More advanced ones can book airline tickets and hotel rooms for transcontinental business trips. Google recently demonstrated to reporters Project Mariner, a browser extension for Chrome that can reason about the text and images on your screen.
In the demonstration, the agent helped plan a meal by adding items to a shopping cart on a grocery chain’s website, even finding substitutes when certain ingredients were not available. A person still needs to be involved to finalize the purchase, but the agent can be instructed to take all of the necessary steps up to that point.
In a sense, you are an agent. You take actions in your world every day in response to things that you see, hear and feel. But what exactly is an AI agent? As a computer scientist, I offer this definition: AI agents are technological tools that can learn a lot about a given environment, and then – with a few simple prompts from a human – work to solve problems or perform specific tasks in that environment.
Rules and goals
A smart thermostat is an example of a very simple agent. Its ability to perceive its environment is limited to a thermometer that tells it the temperature. When the temperature in a room dips below a certain level, the smart thermostat responds by turning up the heat.
A familiar predecessor to today’s AI agents is the Roomba. The robot vacuum cleaner learns the shape of a carpeted living room, for instance, and how much dirt is on the carpet. Then it takes action based on that information. After a few minutes, the carpet is clean.
The smart thermostat is an example of what AI researchers call a simple reflex agent. It makes decisions, but those decisions are simple and based only on what the agent perceives in that moment. The robot vacuum is a goal-based agent with a singular goal: clean all of the floor that it can access. The decisions it makes – when to turn, when to raise or lower brushes, when to return to its charging base – are all in service of that goal.
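For readers who think in code, the difference between the two kinds of agent can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how any real thermostat or robot vacuum is programmed: the function names, the 20-degree threshold and the grid of "dirty cells" are all invented for the example.

```python
def thermostat_agent(temperature_c: float, threshold_c: float = 20.0) -> str:
    """Simple reflex agent: reacts only to the current reading."""
    return "heat_on" if temperature_c < threshold_c else "heat_off"


def vacuum_agent(dirty_cells: set, position: tuple) -> tuple:
    """Goal-based agent: every action serves the goal 'no dirty cells left'.

    The floor is modeled as grid cells (a hypothetical simplification).
    Returns an (action, target) pair.
    """
    if not dirty_cells:
        # Goal reached: nothing left to clean.
        return ("return_to_base", None)
    if position in dirty_cells:
        return ("clean", position)
    # Otherwise head for the nearest dirty cell (Manhattan distance).
    target = min(
        dirty_cells,
        key=lambda cell: abs(cell[0] - position[0]) + abs(cell[1] - position[1]),
    )
    return ("move", target)
```

The thermostat function has no memory and no goal; it maps one reading to one action. The vacuum function keeps asking what still stands between it and a clean floor, which is what makes it goal-based.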
A goal-based agent is successful merely by achieving its goal through whatever means are required. Goals can be achieved in a variety of ways, however, some of which could be more or less desirable than others.
Many of today’s AI agents are utility based, meaning they give more consideration to how to achieve their goals. They weigh the risks and benefits of each possible approach before deciding how to proceed. They are also capable of considering goals that conflict with each other and deciding which one is more important to achieve. They go beyond goal-based agents by selecting actions that consider their users’ unique preferences.
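A utility-based choice can be sketched the same way. The example below assumes a hypothetical flight-booking agent that scores each option against a user's preferences and picks the highest score. The flights, the weights and the scoring formula are all made up for illustration; real agents use far richer models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flight:
    name: str
    price_usd: float
    duration_hours: float
    layovers: int


def utility(flight: Flight, weights: dict) -> float:
    """Higher is better: each cost is penalized by how much the user cares about it."""
    cost = (
        weights["price"] * flight.price_usd
        + weights["time"] * flight.duration_hours
        + weights["layover"] * flight.layovers
    )
    return -cost


def choose_flight(flights: list, weights: dict) -> Flight:
    """Pick the option with the highest utility for this particular user."""
    return max(flights, key=lambda f: utility(f, weights))


# Hypothetical options and preference profiles.
flights = [
    Flight("cheap_but_slow", price_usd=400, duration_hours=14, layovers=2),
    Flight("fast_but_pricey", price_usd=900, duration_hours=8, layovers=0),
]
budget_traveler = {"price": 1.0, "time": 10.0, "layover": 20.0}
hurried_traveler = {"price": 0.1, "time": 50.0, "layover": 100.0}
```

Both flights satisfy the goal of getting to the destination. The weights are what let the agent trade price against time, so the same two options can produce different choices for different users.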
Making decisions, taking action
When technology companies refer to AI agents, they aren’t talking about chatbots or large language models like ChatGPT. Though chatbots that provide basic customer service on a website technically are AI agents, their perceptions and actions are limited. Chatbot agents can perceive the words that a user types, but the only action they can take is to reply with text that hopefully offers the user a correct or informative response.
The AI agents that AI companies refer to are significant advances over large language models like ChatGPT because they possess the ability to take actions on behalf of the people and companies who use them.
OpenAI says agents will soon become tools that people or businesses will leave running independently for days or weeks at a time, with no need to check on their progress or results. Researchers at OpenAI and Google DeepMind say agents are another step on the path to artificial general intelligence or “strong” AI – that is, AI that exceeds human capabilities in a wide variety of domains and tasks.
The AI systems that people use today are considered narrow AI or “weak” AI. A system might be skilled in one domain – chess, perhaps – but if thrown into a game of checkers, the same AI would have no idea how to function because its skills wouldn’t translate. An artificial general intelligence system would be better able to transfer its skills from one domain to another, even if it had never seen the new domain before.
Worth the risks?
Are AI agents poised to revolutionize the way humans work? This will depend on whether technology companies can prove that agents are equipped not only to perform the tasks assigned to them, but also to work through new challenges and unexpected obstacles when they arise.
Uptake of AI agents will also depend on people’s willingness to give them access to potentially sensitive data: Depending on what your agent is meant to do, it might need access to your internet browser, your email, your calendar and other apps or systems that are relevant for a given assignment. As these tools become more common, people will need to consider how much of their data they want to share with them.
A breach of an AI agent’s system could cause private information about your life and finances to fall into the wrong hands. Are you OK taking these risks if it means that agents can save you some work?
What happens when an AI agent makes a poor choice, or a choice that its user would disagree with? Currently, developers of AI agents are keeping humans in the loop, making sure people have an opportunity to check an agent’s work before any final decisions are made. In the Project Mariner example, Google won’t let the agent carry out the final purchase or accept the site’s terms of service agreement. By keeping you in the loop, the systems give you the opportunity to back out of any choices made by the agent that you don’t approve of.
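In code, keeping a human in the loop amounts to an approval gate in front of the final step. The sketch below is a generic illustration, not Google's implementation: the function name, the step descriptions and the `approve` callback are all assumptions made for the example.

```python
from typing import Callable, List


def run_with_approval(
    planned_steps: List[str],
    final_step: str,
    approve: Callable[[str], bool],
) -> List[str]:
    """Carry out the preparatory steps, then ask a person before the final one."""
    log = [f"agent did: {step}" for step in planned_steps]
    if approve(final_step):
        log.append(f"user approved: {final_step}")
    else:
        # The person backs out; the final step is never taken.
        log.append(f"user declined: {final_step}")
    return log


# Hypothetical grocery run in which the person declines the purchase.
steps = ["add pasta to cart", "substitute basil for missing parsley"]
result = run_with_approval(steps, "place order", approve=lambda step: False)
```

The agent performs every step it is allowed to, but the one action with real consequences happens only if the person says yes.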
Like any other AI system, an AI agent is subject to biases. These biases can come from the data that the agent is initially trained on, the algorithm itself, or how the output of the agent is used. Keeping humans in the loop is one method to reduce bias by ensuring that decisions are reviewed by people before being carried out.
The answers to these questions will likely determine how popular AI agents become, and will depend on how much AI companies can improve their agents once people begin to use them.
Brian O’Neill, Associate Professor of Computer Science, Quinnipiac University


