Agent#

“An autonomous agent is a system situated within and a part of an environment that senses that environment and acts on it, over time, in pursuit of its own agenda and so as to effect what it senses in the future.”

—Franklin and Graesser (1997)

Alongside the well-known RAGs, agents [1] are another popular family of LLM applications. What makes agents stand out is their ability to reason, plan, and act via accessible tools. When it comes to implementation, AdalFlow has simplified it down to a generator that can use tools, taking multiple steps (sequential or parallel) to complete a user query.

Design#

We will first introduce ReAct [2], a general paradigm for building agents with a sequential of interleaving thought, action, and observation steps.

  • Thought: The reasoning behind taking an action.

  • Action: The action to take from a predefined set of actions. In particular, these are the tools/functional tools we have introduced in tools.

  • Observation: The simplest scenario is the execution result of the action in string format. To be more robust, this can be defined in any way that provides the right amount of execution information for the LLM to plan the next step.

Prompt and Data Models#

DEFAULT_REACT_AGENT_SYSTEM_PROMPT is the default prompt for React agent’s LLM planner. We can categorize the prompt template into four parts:

  1. Task description

This part is the overall role setup and task description for the agent.

task_desc = r"""You are a helpful assistant.
Answer the user's query using the tools provided below with minimal steps and maximum accuracy.

Each step you will read the previous Thought, Action, and Observation(execution result of the action) and then provide the next Thought and Action."""
  1. Tools, output format, and example

This part of the template is exactly the same as how we were calling functions in the tools. The output_format_str is generated by FunctionExpression via JsonOutputParser. It includes the actual output format and examples of a list of FunctionExpression instances. We use thought and action fields of the FunctionExpression as the agent’s response.

tools = r"""{% if tools %}
<TOOLS>
{% for tool in tools %}
{{ loop.index }}.
{{tool}}
------------------------
{% endfor %}
</TOOLS>
{% endif %}
{{output_format_str}}"""
  1. Task specification to teach the planner how to “think”.

We provide more detailed instruction to ensure the agent will always end with ‘finish’ action to complete the task. Additionally, we teach it how to handle simple queries and complex queries.

  • For simple queries, we instruct the agent to finish with as few steps as possible.

  • For complex queries, we teach the agent a ‘divide-and-conquer’ strategy to solve the query step by step.

task_spec = r"""<TASK_SPEC>
- For simple queries: Directly call the ``finish`` action and provide the answer.
- For complex queries:
   - Step 1: Read the user query and potentially divide it into subqueries. And get started with the first subquery.
   - Call one available tool at a time to solve each subquery/subquestion. \
   - At step 'finish', join all subqueries answers and finish the task.
Remember:
- Action must call one of the above tools with name. It can not be empty.
- You will always end with 'finish' action to finish the task. The answer can be the final answer or failure message.
</TASK_SPEC>"""

We put all these three parts together to be within the <SYS></SYS> tag.

  1. Agent step history.

We use StepOutput to record the agent’s step history, including:

  • action: This will be the FunctionExpression instance predicted by the agent.

  • observation: The execution result of the action.

In particular, we format the steps history after the user query as follows:

step_history = r"""User query:
{{ input_str }}
{# Step History #}
{% if step_history %}
<STEPS>
{% for history in step_history %}
Step {{ loop.index }}.
"Thought": "{{history.action.thought}}",
"Action": "{{history.action.action}}",
"Observation": "{{history.observation}}"
------------------------
{% endfor %}
</STEPS>
{% endif %}
You:"""

Tools#

In addition to the tools provided by users, by default, we add a new tool named finish to allow the agent to stop and return the final answer.

def finish(answer: str) -> str:
   """Finish the task with answer."""
   return answer

Simply returning a string might not fit all scenarios, and we might consider allowing users to define their own finish function in the future for more complex cases.

Additionally, since the provided tools cannot always solve user queries, we allow users to configure if an LLM model should be used to solve a subquery via the add_llm_as_fallback parameter. This LLM will use the same model client and model arguments as the agent’s planner. Here is our code to specify the fallback LLM tool:

_additional_llm_tool = (
   Generator(model_client=model_client, model_kwargs=model_kwargs)
   if self.add_llm_as_fallback
   else None
)

def llm_tool(input: str) -> str:
   """I answer any input query with llm's world knowledge. Use me as a fallback tool or when the query is simple."""
   # use the generator to answer the query
   try:
         output: GeneratorOutput = _additional_llm_tool(
            prompt_kwargs={"input_str": input}
         )
         response = output.data if output else None
         return response
   except Exception as e:
         log.error(f"Error using the generator: {e}")
         print(f"Error using the generator: {e}")

   return None

React Agent#

We define the class ReActAgent to put everything together. It will orchestrate two components:

  • planner: A Generator that works with a JsonOutputParser to parse the output format and examples of the function calls using FunctionExpression.

  • ToolManager: Manages a given list of tools, the finish function, and the LLM tool. It is responsible for parsing and executing the functions.

Additionally, it manages step_history as a list of StepOutput instances for the agent’s internal state.

Name

Description

__init__(self, tools: List[Union[Callable, AsyncCallable, FunctionTool]] = [], max_steps: int = 10, add_llm_as_fallback: bool = True, examples: List[FunctionExpression] = [], *, model_client: ModelClient, model_kwargs: Dict = {}, template: Optional[str] = None)

Initialize the ReActAgent with the specified tools, maximum steps, fallback option, examples, model client, model arguments, and template if you want to customize the prompt.

call(self, input: str, prompt_kwargs: Optional[Dict] = {}, model_kwargs: Optional[Dict] = {}) -> Any

Prompt the agent with an input query and process the steps to generate a response.

Agent In Action#

We will set up two sets of models, llama3-70b-8192` by Groq and gpt-3.5-turbo` by OpenAI, to test two queries. For comparison, we will compare these with a vanilla LLM response without using the agent. Here are the code snippets:

from adalflow.components.agent import ReActAgent
from adalflow.core import Generator, ModelClientType, ModelClient
from adalflow.utils import setup_env

setup_env()


# Define tools
def multiply(a: int, b: int) -> int:
   """
   Multiply two numbers.
   """
   return a * b

def add(a: int, b: int) -> int:
   """
   Add two numbers.
   """
   return a + b

def divide(a: float, b: float) -> float:
   """
   Divide two numbers.
   """
   return float(a) / b

llama3_model_kwargs = {
   "model": "llama3-70b-8192",  # llama3 70b works better than 8b here.
   "temperature": 0.0,
}
gpt_model_kwargs = {
   "model": "gpt-3.5-turbo",
   "temperature": 0.0,
}


def test_react_agent(model_client: ModelClient, model_kwargs: dict):
   tools = [multiply, add, divide]
   queries = [
      "What is the capital of France? and what is 465 times 321 then add 95297 and then divide by 13.2?",
      "Give me 5 words rhyming with cool, and make a 4-sentence poem using them",
   ]
   # define a generator without tools for comparison

   generator = Generator(
      model_client=model_client,
      model_kwargs=model_kwargs,
   )

   react = ReActAgent(
      max_steps=6,
      add_llm_as_fallback=True,
      tools=tools,
      model_client=model_client,
      model_kwargs=model_kwargs,
   )
   # print(react)

   for query in queries:
      print(f"Query: {query}")
      agent_response = react.call(query)
      llm_response = generator.call(prompt_kwargs={"input_str": query})
      print(f"Agent response: {agent_response}")
      print(f"LLM response: {llm_response}")
      print("")

The structure of React, including the initialization arguments and two major components: tool_manager and planner, is shown below.

         

ReActAgent(
   max_steps=6, add_llm_as_fallback=True,
   (tool_manager): ToolManager(Tools: [FunctionTool(fn: , async: False, definition: FunctionDefinition(func_name='multiply', func_desc='multiply(a: int, b: int) -> int\n\n    Multiply two numbers.\n    ', func_parameters={'type': 'object', 'properties': {'a': {'type': 'int'}, 'b': {'type': 'int'}}, 'required': ['a', 'b']})), FunctionTool(fn: , async: False, definition: FunctionDefinition(func_name='add', func_desc='add(a: int, b: int) -> int\n\n    Add two numbers.\n    ', func_parameters={'type': 'object', 'properties': {'a': {'type': 'int'}, 'b': {'type': 'int'}}, 'required': ['a', 'b']})), FunctionTool(fn: , async: False, definition: FunctionDefinition(func_name='divide', func_desc='divide(a: float, b: float) -> float\n\n    Divide two numbers.\n    ', func_parameters={'type': 'object', 'properties': {'a': {'type': 'float'}, 'b': {'type': 'float'}}, 'required': ['a', 'b']})), FunctionTool(fn: .llm_tool at 0x11384b740>, async: False, definition: FunctionDefinition(func_name='llm_tool', func_desc="llm_tool(input: str) -> str\nI answer any input query with llm's world knowledge. Use me as a fallback tool or when the query is simple.", func_parameters={'type': 'object', 'properties': {'input': {'type': 'str'}}, 'required': ['input']})), FunctionTool(fn: .finish at 0x11382fa60>, async: False, definition: FunctionDefinition(func_name='finish', func_desc='finish(answer: str) -> str\nFinish the task with answer.', func_parameters={'type': 'object', 'properties': {'answer': {'type': 'str'}}, 'required': ['answer']}))], Additional Context: {})
   (planner): Generator(
      model_kwargs={'model': 'llama3-70b-8192', 'temperature': 0.0},
      (prompt): Prompt(
         template: 
         {# role/task description #}
         You are a helpful assistant.
         Answer the user's query using the tools provided below with minimal steps and maximum accuracy.
         {# REACT instructions #}
         Each step you will read the previous Thought, Action, and Observation(execution result of the action) and then provide the next Thought and Action.
         {# Tools #}
         {% if tools %}
         
         You available tools are:
         {# tools #}
         {% for tool in tools %}
         {{ loop.index }}.
         {{tool}}
         ------------------------
         {% endfor %}
         
         {% endif %}
         {# output format and examples #}
         
         {{output_format_str}}
         
         
         {# Task specification to teach the agent how to think using 'divide and conquer' strategy #}
         - For simple queries: Directly call the ``finish`` action and provide the answer.
         - For complex queries:
            - Step 1: Read the user query and potentially divide it into subqueries. And get started with the first subquery.
            - Call one available tool at a time to solve each subquery/subquestion. \
            - At step 'finish', join all subqueries answers and finish the task.
         Remember:
         - Action must call one of the above tools with name. It can not be empty.
         - You will always end with 'finish' action to finish the task. The answer can be the final answer or failure message.
         
         
         -----------------
         User query:
         {{ input_str }}
         {# Step History #}
         {% if step_history %}
         
         {% for history in step_history %}
         Step {{ loop.index }}.
         "Thought": "{{history.action.thought}}",
         "Action": "{{history.action.action}}",
         "Observation": "{{history.observation}}"
         ------------------------
         {% endfor %}
         
         {% endif %}
         You:, prompt_kwargs: {'tools': ['func_name: multiply\nfunc_desc: "multiply(a: int, b: int) -> int\\n\\n    Multiply two numbers.\\n    "\nfunc_parameters:\n  type: object\n  properties:\n    a:\n      type: int\n    b:\n      type: int\n  required:\n  - a\n  - b\n', 'func_name: add\nfunc_desc: "add(a: int, b: int) -> int\\n\\n    Add two numbers.\\n    "\nfunc_parameters:\n  type: object\n  properties:\n    a:\n      type: int\n    b:\n      type: int\n  required:\n  - a\n  - b\n', 'func_name: divide\nfunc_desc: "divide(a: float, b: float) -> float\\n\\n    Divide two numbers.\\n    "\nfunc_parameters:\n  type: object\n  properties:\n    a:\n      type: float\n    b:\n      type: float\n  required:\n  - a\n  - b\n', "func_name: llm_tool\nfunc_desc: 'llm_tool(input: str) -> str\n\n  I answer any input query with llm''s world knowledge. Use me as a fallback tool\n  or when the query is simple.'\nfunc_parameters:\n  type: object\n  properties:\n    input:\n      type: str\n  required:\n  - input\n", "func_name: finish\nfunc_desc: 'finish(answer: str) -> str\n\n  Finish the task with answer.'\nfunc_parameters:\n  type: object\n  properties:\n    answer:\n      type: str\n  required:\n  - answer\n"], 'output_format_str': 'Your output should be formatted as a standard JSON instance with the following schema:\n```\n{\n    "thought": "Why the function is called (Optional[str]) (optional)",\n    "action": "FuncName() Valid function call expression. Example: \\"FuncName(a=1, b=2)\\" Follow the data type specified in the function parameters.e.g. for Type object with x,y properties, use \\"ObjectType(x=1, y=2) (str) (required)"\n}\n```\nExamples:\n```\n{\n    "thought": "I have finished the task.",\n    "action": "finish(answer=\\"final answer: \'answer\'\\")"\n}\n________\n```\n-Make sure to always enclose the JSON output in triple backticks (```). Please do not add anything other than valid JSON output!\n-Use double quotes for the keys and string values.\n-DO NOT mistaken the "properties" and "type" in the schema as the actual fields in the JSON output.\n-Follow the JSON formatting conventions.'}, prompt_variables: ['input_str', 'tools', 'step_history', 'output_format_str']
      )
      (model_client): GroqAPIClient()
      (output_processors): JsonOutputParser(
         data_class=FunctionExpression, examples=[FunctionExpression(thought='I have finished the task.', action='finish(answer="final answer: \'answer\'")')], exclude_fields=None, return_data_class=True
         (output_format_prompt): Prompt(
         template: Your output should be formatted as a standard JSON instance with the following schema:
         ```
         {{schema}}
         ```
         {% if example %}
         Examples:
         ```
         {{example}}
         ```
         {% endif %}
         -Make sure to always enclose the JSON output in triple backticks (```). Please do not add anything other than valid JSON output!
         -Use double quotes for the keys and string values.
         -DO NOT mistaken the "properties" and "type" in the schema as the actual fields in the JSON output.
         -Follow the JSON formatting conventions., prompt_variables: ['example', 'schema']
         )
         (output_processors): JsonParser()
      )
   )
)
         
     

Now, let’s run the test function to see the agent in action.

test_react_agent(ModelClientType.GROQ(), llama3_model_kwargs)
test_react_agent(ModelClientType.OPENAI(), gpt_model_kwargs)

Our agent will show the core steps for developers via colored printout, including input_query, steps, and the final answer. The printout of the first query with llama3 is shown below (without the color here):

2024-07-10 16:48:47 - [react.py:287:call] - input_query: What is the capital of France? and what is 465 times 321 then add 95297 and then divide by 13.2

2024-07-10 16:48:48 - [react.py:266:_run_one_step] - Step 1:
StepOutput(step=1, action=FunctionExpression(thought="Let's break down the query into subqueries and start with the first one.", action='llm_tool(input="What is the capital of France?")'), function=Function(thought=None, name='llm_tool', args=[], kwargs={'input': 'What is the capital of France?'}), observation='The capital of France is Paris!')
_______

2024-07-10 16:48:49 - [react.py:266:_run_one_step] - Step 2:
StepOutput(step=2, action=FunctionExpression(thought="Now, let's move on to the second subquery.", action='multiply(a=465, b=321)'), function=Function(thought=None, name='multiply', args=[], kwargs={'a': 465, 'b': 321}), observation=149265)
_______

2024-07-10 16:48:49 - [react.py:266:_run_one_step] - Step 3:
StepOutput(step=3, action=FunctionExpression(thought="Now, let's add 95297 to the result.", action='add(a=149265, b=95297)'), function=Function(thought=None, name='add', args=[], kwargs={'a': 149265, 'b': 95297}), observation=244562)
_______

2024-07-10 16:48:50 - [react.py:266:_run_one_step] - Step 4:
StepOutput(step=4, action=FunctionExpression(thought="Now, let's divide the result by 13.2.", action='divide(a=244562, b=13.2)'), function=Function(thought=None, name='divide', args=[], kwargs={'a': 244562, 'b': 13.2}), observation=18527.424242424244)
_______

2024-07-10 16:48:50 - [react.py:266:_run_one_step] - Step 5:
StepOutput(step=5, action=FunctionExpression(thought="Now, let's combine the answers of both subqueries.", action='finish(answer="The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.")'), function=Function(thought=None, name='finish', args=[], kwargs={'answer': 'The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.'}), observation='The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.')
_______
2024-07-10 16:48:50 - [react.py:301:call] - answer:
The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.

For the second query, the printout:

2024-07-10 16:48:51 - [react.py:287:call] - input_query: Give me 5 words rhyming with cool, and make a 4-sentence poem using them
2024-07-10 16:48:52 - [react.py:266:_run_one_step] - Step 1:
StepOutput(step=1, action=FunctionExpression(thought="I need to find 5 words that rhyme with 'cool'.", action='llm_tool(input="What are 5 words that rhyme with \'cool\'?")'), function=Function(thought=None, name='llm_tool', args=[], kwargs={'input': "What are 5 words that rhyme with 'cool'?"}), observation='Here are 5 words that rhyme with "cool":\n\n1. Rule\n2. Tool\n3. Fool\n4. Pool\n5. School')
_______

2024-07-10 16:49:00 - [react.py:266:_run_one_step] - Step 2:
StepOutput(step=2, action=FunctionExpression(thought='Now that I have the rhyming words, I need to create a 4-sentence poem using them.', action='llm_tool(input="Create a 4-sentence poem using the words \'rule\', \'tool\', \'fool\', \'pool\', and \'school\'.")'), function=Function(thought=None, name='llm_tool', args=[], kwargs={'input': "Create a 4-sentence poem using the words 'rule', 'tool', 'fool', 'pool', and 'school'."}), observation="Here is a 4-sentence poem using the words 'rule', 'tool', 'fool', 'pool', and 'school':\n\nIn the classroom, we learn to rule,\nWith a pencil as our trusty tool.\nBut if we're not careful, we can be a fool,\nAnd end up swimming in the school pool.")
_______

2024-07-10 16:49:12 - [react.py:266:_run_one_step] - Step 3:
StepOutput(step=3, action=FunctionExpression(thought='I have the poem, now I need to finish the task.', action='finish(answer="Here are 5 words that rhyme with \'cool\': rule, tool, fool, pool, school. Here is a 4-sentence poem using the words: In the classroom, we learn to rule, With a pencil as our trusty tool. But if we\'re not careful, we can be a fool, And end up swimming in the school pool.")'), function=Function(thought=None, name='finish', args=[], kwargs={'answer': "Here are 5 words that rhyme with 'cool': rule, tool, fool, pool, school. Here is a 4-sentence poem using the words: In the classroom, we learn to rule, With a pencil as our trusty tool. But if we're not careful, we can be a fool, And end up swimming in the school pool."}), observation="Here are 5 words that rhyme with 'cool': rule, tool, fool, pool, school. Here is a 4-sentence poem using the words: In the classroom, we learn to rule, With a pencil as our trusty tool. But if we're not careful, we can be a fool, And end up swimming in the school pool.")
_______

2024-07-10 16:49:12 - [react.py:301:call] - answer:
Here are 5 words that rhyme with 'cool': rule, tool, fool, pool, school. Here is a 4-sentence poem using the words: In the classroom, we learn to rule, With a pencil as our trusty tool. But if we're not careful, we can be a fool, And end up swimming in the school pool.

The comparison between the agent and the vanilla LLM response is shown below:

Answer with agent: The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.
Answer without agent: GeneratorOutput(data="I'd be happy to help you with that!\n\nThe capital of France is Paris.\n\nNow, let's tackle the math problem:\n\n1. 465 × 321 = 149,485\n2. Add 95,297 to that result: 149,485 + 95,297 = 244,782\n3. Divide the result by 13.2: 244,782 ÷ 13.2 = 18,544.09\n\nSo, the answer is 18,544.09!", error=None, usage=None, raw_response="I'd be happy to help you with that!\n\nThe capital of France is Paris.\n\nNow, let's tackle the math problem:\n\n1. 465 × 321 = 149,485\n2. Add 95,297 to that result: 149,485 + 95,297 = 244,782\n3. Divide the result by 13.2: 244,782 ÷ 13.2 = 18,544.09\n\nSo, the answer is 18,544.09!", metadata=None)

For the second query, the comparison is shown below:

Answer with agent: Here are 5 words that rhyme with 'cool': rule, tool, fool, pool, school. Here is a 4-sentence poem using the words: In the classroom, we learn to rule, With a pencil as our trusty tool. But if we're not careful, we can be a fool, And end up swimming in the school pool.
Answer without agent: GeneratorOutput(data='Here are 5 words that rhyme with "cool":\n\n1. rule\n2. tool\n3. fool\n4. pool\n5. school\n\nAnd here\'s a 4-sentence poem using these words:\n\nIn the summer heat, I like to be cool,\nFollowing the rule, I take a dip in the pool.\nI\'m not a fool, I know just what to do,\nI grab my tool and head back to school.', error=None, usage=None, raw_response='Here are 5 words that rhyme with "cool":\n\n1. rule\n2. tool\n3. fool\n4. pool\n5. school\n\nAnd here\'s a 4-sentence poem using these words:\n\nIn the summer heat, I like to be cool,\nFollowing the rule, I take a dip in the pool.\nI\'m not a fool, I know just what to do,\nI grab my tool and head back to school.', metadata=None)

The ReAct agent is particularly helpful for answering queries that require capabilities like computation or more complicated reasoning and planning. However, using it on general queries might be an overkill, as it might take more steps than necessary to answer the query.

Customization#

Template

The first thing you want to customize is the template itself. You can do this by passing your own template to the agent’s constructor. We suggest you to modify our default template: DEFAULT_REACT_AGENT_SYSTEM_PROMPT.

Examples for Better Output Format

Secondly, the examples in the constructor allow you to provide more examples to enforce the correct output format. For instance, if we want it to learn how to correctly call multiply, we can pass in a list of FunctionExpression instances with the correct format. Classmethod from_function can be used to create a FunctionExpression instance from a function and its arguments.

from adalflow.core.types import FunctionExpression

# generate an example of calling multiply with key-word arguments
example_using_multiply = FunctionExpression.from_function(
     func=multiply,
     thought="Now, let's multiply two numbers.",
     a=3,
     b=4,
 )
examples = [example_using_multiply]

# pass it to the agent

We can visualize how this is passed to the planner prompt via:

react.planner.print_prompt()

The above example will be formated as:

<OUTPUT_FORMAT>
Your output should be formatted as a standard JSON instance with the following schema:
```
{
   "thought": "Why the function is called (Optional[str]) (optional)",
   "action": "FuncName(<kwargs>) Valid function call expression. Example: \"FuncName(a=1, b=2)\" Follow the data type specified in the function parameters.e.g. for Type object with x,y properties, use \"ObjectType(x=1, y=2) (str) (required)"
}
```
Examples:
```
{
   "thought": "Now, let's multiply two numbers.",
   "action": "multiply(a=3, b=4)"
}
________
{
   "thought": "I have finished the task.",
   "action": "finish(answer=\"final answer: 'answer'\")"
}
________
```
-Make sure to always enclose the JSON output in triple backticks (```). Please do not add anything other than valid JSON output!
-Use double quotes for the keys and string values.
-DO NOT mistaken the "properties" and "type" in the schema as the actual fields in the JSON output.
-Follow the JSON formatting conventions.
</OUTPUT_FORMAT>

Subclass ReActAgent

If you want to customize the agent further, you can subclass the ReActAgent and override the methods you want to change.

References