OpenAI’s Swarm (Part 2): A straightforward, local-first approach with Ollama and Pydantic
- 03 Jan, 2025
A short code reference to build upon.
TLDR:
Combining the Ollama and Swarm frameworks presents a local-first approach to building intelligent AI agents.
Ollama can run large language models locally, ensuring privacy and control, while Swarm provides a structured environment for designing and managing AI agents.
Our first-principles programming approach emphasizes simplicity and efficiency: we avoid heavier frameworks whose extra abstractions inflate token usage and delay the time to first token.
Today, we’ll dive into a practical implementation that not only shows how to create Pydantic-backed agents but also demonstrates the power of agentic function calling and structured programming.
NOTE: The code is available in a gist here.
Understanding the Stack
Ollama Integration
The implementation leverages Ollama, an open-source framework for running large language models locally. What makes this setup particularly interesting is how it’s integrated using the OpenAI-compatible API interface:
from openai import OpenAI

# Ollama ignores the API key, but the OpenAI client requires a value.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
This configuration lets developers use familiar OpenAI-style interactions while running models locally through Ollama. Our example uses Qwen 2.5 Coder (32B parameters), a very capable coding model.
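Before building agents on top of it, it is worth confirming that the local endpoint answers a plain chat-completions request. A minimal sanity check, assuming you have pulled a model tagged qwen2.5-coder:32b (substitute whatever tag you actually pulled):
# Sanity check against the local Ollama server, reusing the client defined above.
# The model tag is an assumption; match it to the model you pulled via `ollama pull`.
completion = client.chat.completions.create(
    model="qwen2.5-coder:32b",
    messages=[{"role": "user", "content": "Reply with a one-sentence greeting."}],
)
print(completion.choices[0].message.content)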
The Swarm Framework
Swarm provides the foundational structure for creating and managing AI agents; a minimal example follows the list below. It’s designed to facilitate:
- Structured agent definitions
- Function calling capabilities
- Message handling and response processing
- Context management
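A minimal sketch of these pieces, adapted from Swarm’s README pattern and pointed at the local client defined earlier (the agent name, instructions, and tool function here are illustrative, not part of the gist):
from swarm import Swarm, Agent
from datetime import datetime

def get_time_of_day() -> str:
    # An illustrative tool the agent can call.
    return datetime.now().strftime("%H:%M")

greeter = Agent(
    name="Greeter",
    instructions="You are a concise, friendly assistant.",
    functions=[get_time_of_day],
)

swarm_client = Swarm(client=client)  # reuse the Ollama-backed OpenAI client
response = swarm_client.run(
    agent=greeter,
    model_override="qwen2.5-coder:32b",  # assumed model tag
    messages=[{"role": "user", "content": "Greet me and tell me the time."}],
)
print(response.messages[-1]["content"])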
Deep Dive: Building an Information Extraction Agent
The implementation showcases a practical use case: an agent designed to extract structured information about people from unstructured text.
For example, with the following text:
"Pat Lesieur is a 65-year-old software developer skilled
in AI Agents and RAG workflows."
And the following Pydantic class:
from typing import List, Optional

from pydantic import BaseModel

# Define our Pydantic class to go with the structured output model
class PersonInfo(BaseModel):
    name: str
    age: int
    skills: List[str]
    bio: Optional[str] = None
The Agent has the following prompt:
instructions="""You are a precise information
extraction agent that converts unstructured
text about people into a specific JSON format.
IMPORTANT: When calling process_extracted_data, you MUST format the data exactly as follows:
{
    "name": "string",
    "age": number,
    "skills": ["skill1", "skill2"],  # MUST be a JSON array/list of strings
    "bio": "string"
}
The skills parameter MUST ALWAYS be a JSON array/list of strings, NOT a comma-separated string.
CORRECT format for skills:
"skills": ["AI Agents", "RAG workflows"]
INCORRECT format for skills:
"skills": "AI Agents, RAG workflows"
Example input: "John Smith is a 35-year-old software developer skilled in Python and Cloud Architecture."
You should call process_extracted_data with:
{
    "name": "John Smith",
    "age": 35,
    "skills": ["Python", "Cloud Architecture"],
    "bio": "Software developer"
}"""
To yield:
=== process_extracted_data called ===
Received data:
name: Pat Lesieur
age: 65
skills: ['AI Agents', 'RAG workflows']
bio: Software developer
Successfully created PersonInfo: name='Pat Lesieur' age=65 skills=['AI Agents', 'RAG workflows'] bio='Software developer'
=== process_extracted_data finished ===
=== Complete Response Details ===
Message type: assistant
Content:
Tool calls: [
{
"id": "call_62rrvh2u",
"function": {
"arguments": "{\"age\":65,\"bio\":\"Software developer\",\"name\":\"Pat Lesieur\",\"skills\":[\"AI Agents\",\"RAG workflows\"]}",
"name": "process_extracted_data"
},
"type": "function",
"index": 0
}
]
Message type: tool
Content: name='Pat Lesieur' age=65 skills=['AI Agents', 'RAG workflows'] bio='Software developer'
Agent Architecture
The core of the implementation revolves around a PersonInfo model defined using Pydantic:
class PersonInfo(BaseModel):
    name: str
    age: int
    skills: List[str]
    bio: Optional[str] = None
This structured approach ensures type safety and data validation, making the agent’s outputs reliable and consistent.
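Because PersonInfo is a plain Pydantic model, that validation is easy to exercise on its own. A quick sketch using the class above (the sample data mirrors the earlier example; the failure case is exactly the formatting mistake the prompt warns against):
from pydantic import ValidationError

# Well-formed data validates cleanly and comes back as a typed object.
person = PersonInfo(
    name="Pat Lesieur",
    age=65,
    skills=["AI Agents", "RAG workflows"],
    bio="Software developer",
)
print(person)  # name='Pat Lesieur' age=65 skills=['AI Agents', 'RAG workflows'] bio='Software developer'

# Passing skills as a comma-separated string is rejected instead of
# silently producing bad data.
try:
    PersonInfo(name="Pat Lesieur", age=65, skills="AI Agents, RAG workflows")
except ValidationError as exc:
    print(exc)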
Agent Configuration
The agent is configured with specific instructions and capabilities:
def create_person_info_agent() -> Agent:
    return Agent(
        name="PersonInfoAgent",
        instructions="""...""",
        functions=[process_extracted_data]
    )
Key features include:
- Clear instruction setting
- Function registration for data processing
- Structured output formatting
Robust Data Processing
The implementation includes sophisticated data cleaning and processing:
import re

def clean_json_string(data_str: str) -> str:
    # Handles markdown code blocks and formatting
    if "```" in data_str:
        match = re.search(r'```(?:json)?\n(.*?)\n```', data_str, re.DOTALL)
        if match:
            data_str = match.group(1)
    return data_str.strip()
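The process_extracted_data handler itself isn’t reproduced above. Since Swarm parses the tool-call arguments and passes them to the registered function as keyword arguments, a minimal sketch of such a handler might look like the following (the defensive skills handling and exact print statements are assumptions; the implementation in the gist may differ):
import json

def process_extracted_data(name, age, skills, bio=None) -> str:
    """Validate the extracted fields against PersonInfo and return its repr."""
    print("=== process_extracted_data called ===")
    print(f"Received data:\n  name: {name}\n  age: {age}\n  skills: {skills}\n  bio: {bio}")

    # Defensive handling: if the model sent skills as a string (possibly wrapped
    # in a markdown code block), strip the fences and try to recover a list.
    if isinstance(skills, str):
        cleaned = clean_json_string(skills)
        try:
            skills = json.loads(cleaned)
        except json.JSONDecodeError:
            skills = [s.strip() for s in cleaned.split(",") if s.strip()]

    person = PersonInfo(name=name, age=int(age), skills=skills, bio=bio)
    print(f"Successfully created PersonInfo: {person}")
    print("=== process_extracted_data finished ===")
    return str(person)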
Running the Agent
The system brings everything together using the Swarm client:
swarm_client = Swarm(client=client)

response = swarm_client.run(
    agent=agent,
    model_override=model,
    messages=[{
        "role": "user",
        "content": input_text
    }],
    execute_tools=True
)
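run() returns a Response object whose messages list contains the assistant turn (with its tool calls) and the tool result shown in the log earlier. A small sketch of reading those back out (the filtering logic below is an assumption, not taken from the gist):
# Walk the returned messages, mirroring the "Complete Response Details" log above.
for message in response.messages:
    print(f"Message type: {message['role']}")
    if message["role"] == "tool":
        # The tool message content is the string returned by process_extracted_data,
        # i.e. the repr of the validated PersonInfo instance.
        print(f"Extracted person: {message['content']}")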
Key Benefits
- Local Model Execution: By using Ollama, you maintain control over your data and can run models locally.
- Structured Outputs: The Pydantic integration ensures type-safe and validated outputs.
- Flexible Architecture: The system can be adapted to different use cases simply by modifying the agent’s instructions and data models.
- Developer-Friendly: The OpenAI-compatible interface makes adoption easy for developers who already know OpenAI’s API.
Practical Applications
This implementation is particularly useful for:
- Information extraction from unstructured text
- Automated data processing pipelines
- Building conversational AI agents
- Creating structured data from natural language inputs
Conclusion
Combining Ollama and Swarm demonstrates a powerful approach to building AI agents. By leveraging local model execution through Ollama and the structured agent framework provided by Swarm, developers can create sophisticated AI applications that maintain data privacy while delivering reliable, structured outputs.
The implementation shows how modern AI development can be powerful and practical, combining the best of local model execution with structured programming practices.