Building Autonomous Multi-Tool Agents with Gemini 2.0 and LangGraph
A practical tutorial with full code examples for building and running multi-tool agents
LLMs are remarkable: they can memorize vast amounts of information, answer general knowledge questions, write code, generate stories, and even fix your grammar. However, they are not without limitations. They hallucinate, have a knowledge cut-off that may range from a few months to several years, and are confined to generating text, unable to interact with the real world. This restricts their utility for tasks requiring real-time data, source citations, or functionality beyond text generation. This is the main issue that agents and tools attempt to solve: they bridge this gap by augmenting LLMs with additional capabilities, allowing them to access up-to-date information, interact with APIs, search the web, and even influence the physical world, like adjusting a smart home’s temperature.
In this tutorial, we are going to build a simple LLM agent that is equipped with four tools that it can use to answer a user’s question. This Agent will have the following specifications:
- Can answer general knowledge questions with up-to-date, verifiable information.
- Can use four types of tools: DuckDuckGo Search, Fetch Webpage content, Wikipedia Search, Fetch Wikipedia Page Content.
- Allows the LLM to reason about the user’s input, the message history, and the previous tool calls and their results in order to decide whether or not to use a tool next and, if so, with which arguments.
- Allows the LLM to call multiple tools at once in a single time step.
We will be using LangGraph for our agent implementation and Gemini 2.0 as our LLM. However, you will be able to switch to most other LLM providers with minimal code changes. Full code here: https://github.com/CVxTz/document_ai_agents/blob/master/document_ai_agents/document_multi_tool_agent.py
Tools
First, let’s build one of the tools, starting with web search:
from duckduckgo_search import DDGS
from pydantic import BaseModel


class PageSummary(BaseModel):
    page_title: str
    page_summary: str
    page_url: str


class SearchResponse(BaseModel):
    page_summaries: list[PageSummary]


def search_duck_duck_go(search_query: str) -> SearchResponse:
    """
    Searches through duckduckgo pages.
    :param search_query: Query to send to DuckDuckGo search.
        Search for one item at a time even if it means calling the tool multiple times.
    :return:
    """
    max_results = 10
    with DDGS() as dd:
        results_generator = dd.text(
            search_query,
            max_results=max_results,
            backend="api",
        )
        return SearchResponse(
            page_summaries=[
                PageSummary(
                    page_title=x["title"], page_summary=x["body"], page_url=x["href"]
                )
                for x in results_generator
            ]
        )
Our “Tool” is a simple Python function that uses the duckduckgo_search library to get search results for a search query. It is the LLM that decides which query to use given the user’s message. The output of this function looks like this:
{'page_summaries': [{'page_summary': 'Stevia is a plant-based sweetener that '
'is 200 to 400 times sweeter than sugar '
'and has no calories or carbohydrates. '
'Learn about its health benefits, side '
'effects, and how to use it in cooking '
'and baking.',
'page_title': 'Stevia: Health Benefits and Risks - WebMD',
'page_url': 'https://www.webmd.com/food-recipes/what-is-stevia'},
{'page_summary': 'Stevia is a herb that can be used as a '
'zero-calorie, zero-carb sugar '
'substitute. Learn about its history, '
'safety, potential health benefits and '
'drawbacks from Cleveland Clinic experts.',
'page_title': 'Stevia: What Is It and Is It Healthy? - '
'Cleveland Clinic Health Essentials',
'page_url': 'https://health.clevelandclinic.org/what-is-stevia'},
{'page_summary': 'Stevia is a sugar substitute extracted '
'from the leaves of Stevia rebaudiana, a '
'plant native to Paraguay and Brazil. '
...
Let’s also take a look at the get_wikipedia_page function:
import wikipedia  # the `wikipedia` PyPI package


def get_wikipedia_page(page_title: str, max_text_size: int = 16_000):
    """
    Gets full content of a wikipedia page
    :param page_title: Make sure this page exists by calling the tool "search_wikipedia" first.
    :param max_text_size: defaults to 16000
    :return:
    """
    page = wikipedia.page(title=page_title, auto_suggest=False)
    # strip_tags and FullPage are defined in the linked repository
    full_content = strip_tags(page.html())
    full_page = FullPage(
        page_title=page.title,
        page_url=page.url,
        content=full_content[:max_text_size],
    )
    return full_page
This function gets the full HTML of the page and returns it to the caller stripped of HTML tags. The output looks like this:
{'content': 'Sweetener and sugar substitute\n'
'This article is about the sweetener. For other uses, see Stevia '
'(disambiguation).\n'
'\n'
'Stevia (/ˈstiːviə, ˈstɛviə/)[1][2] is a sweet sugar substitute '
'that is about 50 to 300\xa0times sweeter than sugar.[3] It is '
'extracted from the leaves of Stevia rebaudiana, a plant native to
...
...
'^ Samuel, Priscilla; Ayoob, Keith T; Magnuson, Bernadene A; '
'Wölwer-Rieck, Ursula; Jeppesen, Per Bendix; Rogers, Peter J; '
'Rowland, Ian; Mathews, Rebecca (1 July 2018). "Stevia Leaf to '
'Stevia Sweetener: Exploring Its Science, Benefits, and Future '
'Potential". The Journal of Nutrition. 148 (7): 1186S – 1205S. '
'doi:10.1093/jn/nxy102. '
'hdl:1983/618532e2-6caa-4fbe-bd6e-6eb0cb608981. ISSN\xa00022-3166. '
'PMID\xa029982648.\n'
'\n'
'^ a b c d "Has stevia been approved by FDA to be used as a '
'sweetener?". US Foo',
'page_title': 'Stevia',
'page_url': 'https://en.wikipedia.org/wiki/Stevia'}
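The strip_tags helper used above comes from the linked repository. A minimal stdlib version (an illustrative sketch, not the repo's exact implementation) could look like this:

```python
from html.parser import HTMLParser
from io import StringIO


class _TagStripper(HTMLParser):
    """Collects only the text content of an HTML document, dropping all tags."""

    def __init__(self):
        super().__init__()
        self.text = StringIO()

    def handle_data(self, data: str) -> None:
        self.text.write(data)


def strip_tags(html: str) -> str:
    """Return the text content of `html` with all tags removed."""
    stripper = _TagStripper()
    stripper.feed(html)
    return stripper.text.getvalue()


print(strip_tags("<p>Hello <b>world</b></p>"))  # Hello world
```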
Overall, we define four such functions:
- search_wikipedia(search_query: str)
- get_wikipedia_page(page_title: str, max_text_size: int = 16_000)
- search_duck_duck_go(search_query: str)
- get_page_content(page_title: str, page_url: str)
All these functions are passed to the Gemini Client as follows:
model = genai.GenerativeModel(
    "gemini-2.0-flash-exp",
    tools=[
        get_wikipedia_page,
        search_wikipedia,
        search_duck_duck_go,
        get_page_content,
    ],
)
The client will infer the call arguments and their types from the function definition. It will also include the function’s doc-string in the generated schema, so we need to explain in that doc-string how the tool works for best results.
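To illustrate roughly what that inference looks like, here is a sketch using the standard `inspect` module. This is not the SDK's actual schema format, just a demonstration of how a signature and doc-string can be turned into a tool description:

```python
import inspect


def build_tool_schema(func) -> dict:
    """Derive a simple tool schema from a function's signature and doc-string."""
    signature = inspect.signature(func)
    parameters = {}
    for name, param in signature.parameters.items():
        parameters[name] = {
            # Use the annotation's name when present, "any" otherwise.
            "type": getattr(param.annotation, "__name__", "any"),
            # Parameters without a default value are required.
            "required": param.default is inspect.Parameter.empty,
        }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": parameters,
    }


def get_wikipedia_page(page_title: str, max_text_size: int = 16_000):
    """Gets full content of a wikipedia page"""


schema = build_tool_schema(get_wikipedia_page)
# schema["parameters"]["page_title"]["required"] is True
# schema["parameters"]["max_text_size"]["required"] is False
```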
The LLM
We will use the Gemini client to interact with the LLM, specifically Gemini 2.0. To get started, you’ll need an API key, which can be obtained from Google AI Studio. Using this client, we will prompt the LLM to either generate responses or create function calls (or both). These function calls are then executed against the tools defined above.
When called with a prompt, the client may respond with regular text or with something that includes a function call, for example:
{'function_call':
{'name': 'search_wikipedia', 'args': {'search_query': 'Trey Parker'}}
}
This type of response has the name of the tool and its arguments. This is what allows the LLM to choose which outside resources it wants to access and how to do it.
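Dispatching such a response to the right tool is then just a dictionary lookup. Here is a minimal sketch, using a hypothetical stub in place of the real search_wikipedia tool:

```python
def search_wikipedia(search_query: str) -> str:
    # Stub standing in for the real tool, for illustration only.
    return f"results for: {search_query}"


# Map tool names to the callables, exactly as the agent does later.
tool_mapping = {tool.__name__: tool for tool in [search_wikipedia]}

# A function_call payload as returned by the LLM.
function_call = {"name": "search_wikipedia", "args": {"search_query": "Trey Parker"}}

# Look up the tool by name and unpack the LLM-chosen arguments.
result = tool_mapping[function_call["name"]](**function_call["args"])
print(result)  # results for: Trey Parker
```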
The Agent
Here is the full implementation of our agent (a mere 70 lines of code):
class ToolCallAgent:
    def __init__(self, tools: list[Callable], model_name="gemini-2.0-flash-exp"):
        self.model_name = model_name
        self.model = genai.GenerativeModel(
            self.model_name,
            tools=tools,
            system_instruction="You are a helpful agent that has access to different tools. Use them to answer the "
            "user's query if needed. Only use information from external sources that you can cite. "
            "You can use multiple tools before giving the final answer. "
            "If the tool response does not give an adequate response you can use the tools again with different inputs. "
            "Only respond when you can cite the source from one of your tools. "
            "Only answer I don't know after you have exhausted all ways to use the tools to search for that information.",
        )
        self.tools = tools
        self.tool_mapping = {tool.__name__: tool for tool in self.tools}
        self.graph = None
        self.build_agent()

    def call_llm(self, state: AgentState):
        response = self.model.generate_content(
            state.messages,
            request_options=RequestOptions(
                retry=retry.Retry(initial=10, multiplier=2, maximum=60, timeout=300)
            ),
        )
        return {
            "messages": [
                type(response.candidates[0].content).to_dict(
                    response.candidates[0].content
                )
            ]
        }

    def use_tool(self, state: AgentState):
        assert any("function_call" in part for part in state.messages[-1]["parts"])
        tool_result_parts = []
        for part in state.messages[-1]["parts"]:
            if "function_call" in part:
                name = part["function_call"]["name"]
                func = self.tool_mapping[name]
                result = func(**part["function_call"]["args"])
                tool_result_parts.append(
                    {
                        "function_response": {
                            "name": name,
                            "response": result.model_dump(mode="json"),
                        }
                    }
                )
        return {"messages": [{"role": "tool", "parts": tool_result_parts}]}

    @staticmethod
    def should_we_stop(state: AgentState) -> str:
        logger.debug(
            f"Entering should_we_stop function. Current message: {state.messages[-1]}"
        )
        if any("function_call" in part for part in state.messages[-1]["parts"]):
            logger.debug(f"Calling tools: {state.messages[-1]['parts']}")
            return "use_tool"
        else:
            logger.debug("Ending agent invocation")
            return END

    def build_agent(self):
        builder = StateGraph(AgentState)
        builder.add_node("call_llm", self.call_llm)
        builder.add_node("use_tool", self.use_tool)
        builder.add_edge(START, "call_llm")
        builder.add_conditional_edges("call_llm", self.should_we_stop)
        builder.add_edge("use_tool", "call_llm")
        self.graph = builder.compile()
- First, we define the LLM client with the set of tools and a simple prompt.
- Second, we define a first node, “call_llm”, that calls the LLM API using the message history.
- Third, we define a tool calling node that executes all function calls requested by the LLM.
- Finally, we define the agent’s graph:
The agent will continue to run until the LLM’s response no longer includes any tool calls. This creates a cycle in the graph and is what allows the agent to refine its search queries and tool use until it finds what the user is asking about.
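Stripped of the LangGraph specifics, this control flow amounts to a simple loop. Here is a minimal sketch using a fake model and a hypothetical `echo` tool (for illustration only, not the real Gemini client):

```python
def fake_model(messages: list[dict]) -> dict:
    """Stand-in for the LLM: requests one tool call, then gives a final answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"role": "model", "function_call": {"name": "echo", "args": {"text": "hi"}}}
    return {"role": "model", "text": "final answer"}


def echo(text: str) -> str:
    """Hypothetical tool that just upper-cases its input."""
    return text.upper()


tools = {"echo": echo}


def run_agent(user_message: str) -> str:
    messages = [{"role": "user", "text": user_message}]
    while True:
        response = fake_model(messages)  # call_llm node
        messages.append(response)
        call = response.get("function_call")
        if call is None:
            return response["text"]  # no tool call requested: stop (END)
        result = tools[call["name"]](**call["args"])  # use_tool node
        messages.append({"role": "tool", "text": result})  # loop back to call_llm


print(run_agent("hello"))  # final answer
```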
Example runs:
Let’s take this agent for a test run. We will start with a simple query:
Is puffer fish poisonous? If so, explain why and list some other poisonous (not venomous) fish. Don’t cite Wikipedia only.
Here is how the exchange went (I trimmed most of the text for brevity):
***User:*** Is pufferfish poisonous? If so, explain why and list some other poisonous (not venomous) fish. Don’t cite Wikipedia only.
***Model:*** function_call: name: search_duck_duck_go, args: {search_query: “are puffer fish poisonous”}
***Tool:*** function_response: name: search_duck_duck_go, response: page_summaries:
- page_title: How The Poisonous Pufferfish Became A Delicacy (And The Deadly Mishaps … page_summary: Pufferfish, or fugu, is a popular and expe…. page_url: https://www.tastingtable.com/1616127/poisonous-pufferfish-explained/
- page_title: Are Puffer Fish Poisonous To Touch? Here’s What You Need To Know page_summary: Puffer fish contain a deadly toxin called tetrodotoxin that can cause paralysis and death if ingested or touched. … page_url: https://safeharborfishing.com/are-puffer-fish-poisonous-to-touch-heres-what-you-need-to-know/
- ….
***Model:*** Yes, pufferfish are poisonous. They contain a deadly neurotoxin called tetrodotoxin, which can cause paralysis and death if ingested or touched. This toxin is not destroyed by cooking or freezing. The pufferfish’s diet, which includes invertebrates and algae like mussels and shellfish, contributes to the presence of tetrodotoxin.
Here are some other poisonous (not venomous) fish:
- Northern Stargazer: These fish have poisonous spines behind their eyes.
- Toadfish: Certain species of toadfish have poisonous spines.
- Rabbitfish: These fish have spines containing toxins.
…
This exchange involved one round of tool calls, consisting of a web search using DuckDuckGo, and it was enough for the LLM to get all the information it needed.
Now let’s try a more involved scenario where the agent needs to make many tool calls:
What is the number and season of the south park episode where they get time traveling immigrants? Who was the director of that episode? Where and when was he born ? Give me his wikipedia page link.
Without the tools, the model answers with:
The South Park episode you’re referring to is **Season 18, Episode 7, titled “Grounded Vindaloop”**. It does not feature time-traveling immigrants, but rather focuses…
This answer is wrong.
Now, with the tools:
***User:*** What is the number and season of the South Park episode where they get time-traveling immigrants? Who was the director of that episode? Where and when was he born? Give me his Wikipedia page link.
***Model:*** function_call: name: search_wikipedia, args: {search_query: “south park time traveling immigrants episode”}
***Tool:*** function_response: name: search_wikipedia, response: page_summaries:
- page_title: Goobacks page_summary: “Goobacks” is the seventh episode of the eighth season of the animated television series South Park, and the 118th overall episode of the series. In production order, … page_url: https://en.wikipedia.org/wiki/Goobacks
- page_title: List of time travel works of fiction page_summary: Time travel is a common plot element in fiction. Works where it plays a prominent role are listed below. For stories of time travel in antiquity, see the history of the time travel concept. page_url: https://en.wikipedia.org/wiki/List_of_time_travel_works_of_fiction
- …
***Model:*** The South Park episode you’re referring to is called “Goobacks.” It is the seventh episode of the eighth season.
***Model:*** function_call: name: get_wikipedia_page, args: {page_title: “Goobacks”}
***Tool:*** function_response: name: get_wikipedia_page, response: page_title: Goobacks, page_url: https://en.wikipedia.org/wiki/Goobacks, content: 7th episode of the 8th season of South Park. “Goobacks” is the seventh episode of the eighth season of the animated television series South Park, and the 118th overall episode of the series. In production order, it is the season’s sixth episode. It originally aired on Comedy Central in the United States on April 28, 2004. The episode was directed by Trey Parker.
***Model:*** The episode “Goobacks” was directed by Trey Parker.
***Model:*** function_call: name: search_wikipedia, args: {search_query: “Trey Parker”}
***Tool:*** function_response: name: search_wikipedia, response: page_summaries:
- page_title: Trey Parker page_summary: Randolph Severn “Trey” Parker III (born October 19, 1969) is an American actor, animator, writer, producer, director, and musician… page_url: https://en.wikipedia.org/wiki/Trey_Parker
- …
***Model:*** Trey Parker was born on October 19, 1969.
***Model:*** function_call: name: get_wikipedia_page, args: {page_title: “Trey Parker”}
***Tool:*** function_response: name: get_wikipedia_page, response: page_title: Trey Parker, page_url: https://en.wikipedia.org/wiki/Trey_Parker, content: Randolph Severn “Trey” Parker III (born October 19, 1969) is an American actor, animator, writer, producer, director, and musician. He was born in Conifer, Colorado, U.S.
***Model:*** Trey Parker was born in Conifer, Colorado, U.S. on October 19, 1969.
Here is his Wikipedia page link: https://en.wikipedia.org/wiki/Trey_Parker
The agent was able to iteratively make tool calls (four in total) to answer our convoluted question, where the definitive answer was not in the first result but required multiple reasoning steps and search queries.
Conclusion
We built an autonomous LLM agent capable of searching the web, browsing Wikipedia, and reasoning through the information it retrieves using Gemini 2.0 as its core LLM. This agent not only fetches relevant data but also refines its search queries based on initial results to locate the exact information requested by the user.
This implementation serves as a good foundation for creating your own autonomous agents. By defining custom tools as Python functions and integrating them into the agent, you can easily adjust it to your specific needs. The simplicity of the implementation, thanks to the LangGraph framework, makes customization straightforward.
Now, you’re equipped to start building powerful, competent agents applied to your own use cases and applications.