How to Develop a Free AI Agent with Automatic Internet Search

Using Groq’s free API with LangGraph to develop an AI agent that answers users’ questions and automatically invokes an internet search when the question requires it


The agentic AI approach in generative AI is making strides and finding several potential applications. AI agents are like digital assistants that can perform multiple tasks on behalf of the user. They can understand the user’s question and overall goal, reason about and choose the best action, and communicate with each other to complete a given task. For instance, one agent can extract key information from your CV, another can use this information to perform an exhaustive internet search for jobs matching your skills and expertise, and a third can apply for these jobs on your behalf.

A number of agentic frameworks can be leveraged to develop agentic systems. Notable ones include LangGraph, AutoGen, and CrewAI. Experimenting with these frameworks requires API access to a proprietary Large Language Model (LLM) such as GPT-4o, Claude-3.5, or Gemini-2.0, or to open-source models such as Llama-3.3, Mistral, or Phi running locally. While open-source models do not require purchasing an API key, a powerful machine with a general-purpose Graphics Processing Unit (GPU) is required to get responses from these models in a reasonable time.

In an earlier article, I explained how Grok’s free API can be used to start coding with its powerful LLMs.

In the current article, I will explain:

  • How Groq’s free API key can be used to run open-source models on Groq Cloud.
  • How a simple AI agent with a simple GUI can be developed using LangChain and LangGraph that answers the user’s questions from its knowledge base and automatically invokes an AI-based web search for questions requiring recent information.

This AI agent can be extended for more complex tasks and use cases. The code of this application is present on my GitHub repository.

Grok vs. Groq

Note that Grok and Groq are two distinct AI products. Groq is a company that has developed custom AI chips called Language Processing Units (LPUs), designed to run large language models (LLMs) at lightning-fast speeds. Grok, on the other hand, is an LLM series and chatbot developed by Elon Musk’s xAI; it is not custom hardware but a model served on conventional AI infrastructure.

Currently, Groq Cloud offers free API access to a number of open-source models; the available options appear in the model selector of the Streamlit app later in this article.

Getting Started with Groq Cloud’s Free API

To experiment with these models, you can get a free API key by creating a free account at Groq Cloud. Here is how you can access one of Groq Cloud’s open-source models, llama-3.3-70b-versatile, using Groq’s API and the ChatGroq class.

from langchain_groq.chat_models import ChatGroq
import os

os.environ["GROQ_API_KEY"] = "GROQ_API_KEY" #replace it with your API key
## Initialize the Groq Llama Instant model
groq_llm = ChatGroq(model="llama-3.3–70b-versatile", temperature=0.7)
## Ask a simple question
question = "What is the capital of Finland?"
## Construct a simple prompt
prompt = (
    f"You are a helpful assistant.\n"
    f"Answer the following question clearly and concisely:\n\n"
    f"Question: {question}\n"
    f"Answer:"
)
## Get the response from the model
response = groq_llm.invoke(prompt)
## Print the response
print(response.content.strip())
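If the API key is valid, this snippet should print a short answer along the lines of “The capital of Finland is Helsinki.”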

AI Agents and LangGraph

LangGraph is a framework for designing AI agent workflows: it defines the sequence of actions or decisions in the form of a graph. For instance, in our example use case, an agent decides whether to answer the question from the LLM’s internal knowledge or whether the question requires a web search for recent information.

An AI agent is the combination of the workflow and the decision-making logic that intelligently answers questions or performs other complex tasks by breaking them down into simpler sub-tasks.

LangGraph implements an agent in the form of a graph that has the following main components:

  • Nodes: Each node in LangGraph represents a step in the workflow (e.g., deciding if a query needs an internet search or generating a response using a model).
  • Edges: Edges connect nodes, and define the flow of decisions and actions.
  • State: It keeps track of information as it moves through the graph, so that the agent uses the correct data for each step.
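To make these components concrete, here is a minimal, self-contained toy graph (my own illustration, not part of the chatbot’s code): a single node transforms a one-field state, and the graph then ends.

from typing import TypedDict
from langgraph.graph import StateGraph, END

class ToyState(TypedDict):
    text: str

## Node: a step in the workflow that reads and updates the state
def shout(state: ToyState) -> ToyState:
    return {"text": state["text"].upper()}

toy = StateGraph(ToyState)    # the state definition ties the graph together
toy.add_node("shout", shout)  # register the node
toy.set_entry_point("shout")  # edge from the start to our node
toy.add_edge("shout", END)    # edge from our node to the end
print(toy.compile().invoke({"text": "hello"}))  # {'text': 'HELLO'}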

Developing an Assistant AI Agent with LangGraph

In our use case, an AI agent is composed of:

1. Decision logic (a router function) to analyze the user’s question and decide whether the query should go to web search for recent information, or directly to the LLM for answer generation.

2. Actions (graph nodes) that the agent can take in a state, e.g., using web search to fetch the latest information, or directly using LLM to generate an answer.

3. Execution workflow (a state graph) that determines the flow of decisions and actions, including the start and end points, and transitions from one state to another.

4. Tools (LLM and Tavily) used by the agent to handle response generation.

In this example, I am using the Tavily search engine, which is specifically designed for AI agents and LLMs. Tavily offers a free API key, which can be obtained by creating an account on their website.
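As a quick sanity check (my own snippet, assuming the tavily-python package is installed and the TAVILY_API_KEY environment variable from the next step is set), you can query Tavily directly:

from tavily import TavilyClient

tavily_client = TavilyClient()  # picks up TAVILY_API_KEY from the environment
## Fetch a compact context string for a time-sensitive query
context = tavily_client.get_search_context(
    query="What is the latest stable Python release?",
    search_depth="advanced",
    max_tokens=512
)
print(context[:500])  # preview the beginning of the returned context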

Let’s first add Tavily’s and Groq’s API keys to the environment variables and define our GraphState class, which specifies the structure of the state that passes between nodes in our state graph and tracks the flow of information through the LangGraph workflow.

os.environ["TAVILY_API_KEY"]="TAVILY_API_KEY"
os.environ["GROQ_API_KEY"]="GROQ_API_KEY"

class GraphState(TypedDict):
    question: str # user question
    generation: str  # answer
    websearch_content: str  # we store Tavily search results here, if any
    web_flag: str #To know whether a websearch was used to answer the question

Subsequently, I define a router function that routes the question to one of two tools: i) web search, or ii) the LLM. The chain below passes the names and descriptions of the two tools into a ChatPromptTemplate with a system prompt. If web search is selected, the web_flag state variable is set to “True” so that subsequent nodes in the workflow can adjust their course of action accordingly.

import re
import streamlit as st
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

def route_question(state: GraphState) -> str:
    question = state["question"]
    tool_selection = {
        "websearch": (
            "Questions requiring recent statistics, real-time information, recent news, or current updates."
        ),
        "generate": (
            "Questions that require access to a large language model's general knowledge, but not requiring recent statistics, real-time information, recent news, or current updates."
        )
    }

    SYS_PROMPT = """Act as a router to select specific tools or functions based on user's question, using the following rules:
                    - Analyze the given question and use the given tool selection dictionary to output the name of the relevant tool based on its description and relevancy with the question.
                    - The dictionary has tool names as keys and their descriptions as values.
                    - Output only and only tool name, i.e., the exact key and nothing else with no explanations at all.
                """

    # Define the ChatPromptTemplate
    prompt = ChatPromptTemplate.from_messages(
        [
            ("system", SYS_PROMPT),
            ("human", """Here is the question:
                        {question}
                        Here is the tool selection dictionary:
                        {tool_selection}
                        Output the required tool.
                    """),
        ]
    )

    # Pass the inputs to the prompt
    inputs = {
        "question": question,
        "tool_selection": tool_selection
    }

    # Invoke the chain
    tool = (prompt | st.session_state.llm | StrOutputParser()).invoke(inputs)
    tool = re.sub(r"[\\'\"`]", "", tool.strip())  # strip quotes, backticks, and backslashes from the model output
    if tool == "websearch":
        state["web_flag"] = "True"
    print(f"Invoking {tool} tool through {st.session_state.llm.model_name}")
    return tool
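For example, a question like “Who won yesterday’s football match?” should route to websearch, while “What is the capital of Finland?” should route to generate, since the latter is answerable from the model’s general knowledge.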

The two state functions are developed as follows: websearch performs an internet search with the Tavily search engine, and generate creates the final response with the LLM. The websearch function sets web_flag to “True”, which the generate function reads to shape its response. The LLM cites sources only if the context contains URLs, meaning that the context was provided by the web search.

from tavily import TavilyClient

#############################################################################
## Websearch function to fetch context from Tavily, store in state["websearch_content"]
#############################################################################
def websearch(state: GraphState) -> GraphState:
    """
    Uses Tavily to search the web for the question, then appends results into `websearch_content`.
    """
    question = state["question"]
    
    if "tavily_client" not in st.session_state:
        st.session_state.tavily_client = TavilyClient()

    try:
        print("Performing Tavily web search...")
        search_result = st.session_state.tavily_client.get_search_context(
            query=question,
            search_depth="advanced",
            max_tokens=2048
        )
        # The tavily_client may return a string or dict
        if isinstance(search_result, dict) and "documents" in search_result:
            # Merge all doc contents
            docs = [doc.get("content", "") for doc in search_result["documents"]]
            state["websearch_content"] = "\n\n".join(docs)
            state["web_flag"] = "True"
        else:
            # If it's just a string or something else
            state["websearch_content"] = str(search_result)
            state["web_flag"] = "True"

    except Exception as e:
        print(f"Error during Tavily web search: {e}")
        state["websearch_content"] = f"Error from Tavily: {e}"

    return state

#############################################################################
## Generation function that calls Groq LLM, optionally includes websearch content
#############################################################################
def generate(state: GraphState) -> GraphState:
    question = state["question"]
    context = state.get("websearch_content", "")
    web_flag = state.get("web_flag", "False")
    if "llm" not in st.session_state:
        raise RuntimeError("LLM not initialized. Please call initialize_app first.")

    prompt = (
        "You are a helpful assistant specialized in providing helpful information.\n\n"
        "If context is available, answer the question based on the context.\n"
        "If there is no context with the question, answer the question from your own knowledge.\n"
        "If web_flag is 'True', cite all the URLs in the context in this format: Sources: \n URL1\n URL2, ...\n\n"
        f"Question: {question}\n\n"
        f"Context: {context}\n\n"
        f"web_flag: {web_flag}\n"
        "Answer:"
    )
    try:
        response = st.session_state.llm.invoke(prompt)
        state["generation"] = response
    except Exception as e:
        state["generation"] = f"Error generating answer: {str(e)}"

    return state

Now, let’s build the LangGraph pipeline. We create a StateGraph object from our GraphState data structure and add two nodes to it: websearch and generate. The router sends the question to either the websearch or the generate function. From the websearch state, the workflow transitions to generate and then to END. The conditional entry point in the workflow defines this routing logic.

from langgraph.graph import StateGraph, END

#############################################################################
## Build the LangGraph pipeline
#############################################################################
workflow = StateGraph(GraphState)
## Add nodes
workflow.add_node("websearch", websearch)
workflow.add_node("generate", generate)
## We'll route from "route_question" to either "websearch" or "generate"
## Then from "websearch" -> "generate" -> END
## From "generate" -> END directly if no search is needed.
workflow.set_conditional_entry_point(
    route_question,  # The router function
    {
        "websearch": "websearch",
        "generate": "generate"
    }
)
workflow.add_edge("websearch", "generate")
workflow.add_edge("generate", END)

Now we will finally develop a Streamlit user interface to ask questions and get responses.

import streamlit as st
from agent import initialize_app
import sys
import io

## Configure the Streamlit page layout
st.set_page_config(
    page_title="LangGraph Chatbot",
    layout="wide",
    initial_sidebar_state="expanded",
    page_icon="🤖"
)

## Initialize session state for messages
if "messages" not in st.session_state:
    st.session_state.messages = []

## Sidebar layout
with st.sidebar:
    st.title("🤖 LangGraph Chatbot")

    # Initialize session state for the model if it doesn't exist
    if "selected_model" not in st.session_state:
        st.session_state.selected_model = "llama-3.1-8b-instant"

    model_list = [
        "llama-3.1-8b-instant",
        "llama-3.3-70b-versatile",
        "llama3-70b-8192",
        "llama3-8b-8192",
        "mixtral-8x7b-32768",
        "gemma2-9b-it"
    ]

    st.session_state.selected_model = st.selectbox(
        "🤖 Select Model",
        model_list,
        key="model_selector",
        index=model_list.index(st.session_state.selected_model)
    )

    reset_button = st.button("🔄 Reset Conversation", key="reset_button")
    if reset_button:
        st.session_state.messages = []

## Initialize the LangGraph application with the selected model
app = initialize_app(model_name=st.session_state.selected_model)

## Title and description
st.title("📘 LangGraph Chat Interface")
st.markdown(
    """
    <div style="text-align: left; font-size: 18px; margin-top: 20px; line-height: 1.6;">
        🤖 <b>Welcome to the LangGraph Chatbot!</b><br>
        I can assist you by answering your questions using AI-powered workflows.
        <p style="margin-top: 10px;"><b>Start by typing your question below, and I'll provide an intelligent response!</b></p>
    </div>
    """,
    unsafe_allow_html=True
)

## Display conversation history
for message in st.session_state.messages:
    if message["role"] == "user":
        with st.chat_message("user"):
            st.markdown(f"**You:** {message['content']}")
    elif message["role"] == "assistant":
        with st.chat_message("assistant"):
            st.markdown(f"**Assistant:** {message['content']}")

## Input box for new messages
if user_input := st.chat_input("Type your question here (Max. 150 char):"):
    if len(user_input) > 150:
        st.error("Your question exceeds 150 characters. Please shorten it.")
    else:
        # Add user's message to session state and display it
        st.session_state.messages.append({"role": "user", "content": user_input})
        with st.chat_message("user"):
            st.markdown(f"**You:** {user_input}")

        # Capture print statements from the agent module
        output_buffer = io.StringIO()
        sys.stdout = output_buffer  # Redirect stdout to the buffer

        try:
            with st.chat_message("assistant"):
                response_placeholder = st.empty()
                debug_placeholder = st.empty()
                streamed_response = ""

                # Show spinner while streaming the response
                with st.spinner("Thinking..."):
                    inputs = {"question": user_input}
                    for i, output in enumerate(app.stream(inputs)):
                        # Capture intermediate print messages
                        debug_logs = output_buffer.getvalue()
                        debug_placeholder.text_area(
                            "Debug Logs",
                            debug_logs,
                            height=100,
                            key=f"debug_logs_{i}"
                        )

                        if "generate" in output and "generation" in output["generate"]:
                            chunk = output["generate"]["generation"]

                            # Safely extract the text content
                            if hasattr(chunk, "content"):  # If chunk is an AIMessage
                                chunk_text = chunk.content
                            else:  # Otherwise, convert to string
                                chunk_text = str(chunk)

                            # Append the text to the streamed response
                            streamed_response += chunk_text

                            # Update the placeholder with the streamed response so far
                            response_placeholder.markdown(f"**Assistant:** {streamed_response}")

                # Store the final response in session state
                st.session_state.messages.append({"role": "assistant", "content": streamed_response or "No response generated."})

        except Exception as e:
            # Handle errors and display in the conversation history
            error_message = f"An error occurred: {e}"
            st.session_state.messages.append({"role": "assistant", "content": error_message})
            with st.chat_message("assistant"):
                st.error(error_message)
        finally:
            # Restore stdout to its original state
            sys.stdout = sys.__stdout__
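Assuming the agent code lives in agent.py and this UI in app.py (file names I am inferring from the import above), the app can be launched with streamlit run app.py.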

I provided an option to select, at runtime, from several models available on Groq Cloud.

In the app, the agent picks the correct action/tool for both types of queries: answers to general-knowledge questions come from the LLM’s internal knowledge, whereas questions requiring recent information are answered through a web search.

This agent can be extended in several ways, e.g., adding functions to search only trusted sources, or integrating Retrieval-Augmented Generation (RAG) to answer queries from specific documents, among others.

That’s all folks! If you liked the article, please clap (multiple times 👏), write a comment, and follow me on Medium and LinkedIn.
