AI Agents are autonomous systems that can understand user requests, break them down into steps, and execute actions to accomplish tasks. They combine language models with tools and external functions to interact with their environment. This module covers how to build effective agents using the smolagents library, which provides a lightweight framework for creating capable AI agents.

Module Overview

Building effective agents requires understanding three key components. First, retrieval capabilities allow agents to access and use relevant information from various sources. Second, function calling enables agents to take concrete actions in their environment. Finally, domain-specific knowledge and tooling equip agents for specialized tasks like code manipulation.

Contents

1️⃣ Retrieval Agents

Retrieval agents combine models with knowledge bases. These agents can search and synthesize information from multiple sources, leveraging vector stores for efficient retrieval and implementing RAG (Retrieval Augmented Generation) patterns. They are great at combining web search with custom knowledge bases while maintaining conversation context through memory systems. The module covers implementation strategies including fallback mechanisms for robust information retrieval.

2️⃣ Code Agents

Code agents are specialized autonomous systems designed for software development tasks. These agents excel at analyzing and generating code, performing automated refactoring, and integrating with development tools. The module covers best practices for building code-focused agents that can understand programming languages, work with build systems, and interact with version control while maintaining high code quality standards.

3️⃣ Custom Functions

Custom function agents extend basic AI capabilities through specialized function calls. This module explores how to design modular and extensible function interfaces that integrate directly with your application’s logic. You’ll learn to implement proper validation and error handling while creating reliable function-driven workflows. The focus is on building simple systems where agents can predictably interact with external tools and services.

Exercise Notebooks

Title: Building a Research Agent
Description: Create an agent that can perform research tasks using retrieval and custom functions
Exercises:
  • 🐢 Build a simple RAG agent
  • 🐕 Add custom search functions
  • 🦁 Create a full research assistant
Links: Notebook, Open In Colab

Retrieval Agents

Agentic RAG (Retrieval Augmented Generation) combines the power of autonomous agents with knowledge retrieval capabilities. While traditional RAG systems simply use an LLM to answer queries based on retrieved information, agentic RAG takes this further by allowing the system to intelligently control its own retrieval and response process.

Traditional RAG has two key limitations: it performs only a single retrieval step, and it relies on direct semantic similarity with the user query, which can miss relevant information. Agentic RAG addresses both by letting the agent formulate its own search queries, critique the results, and perform multiple retrieval steps as needed.

Basic Retrieval with DuckDuckGo

Let’s start by building a simple agent that can search the web using DuckDuckGo. This agent will be able to answer questions by retrieving relevant information and synthesizing responses.

from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# Initialize the search tool
search_tool = DuckDuckGoSearchTool()

# Initialize the model
model = HfApiModel()

agent = CodeAgent(
    model=model,
    tools=[search_tool]
)

# Example usage
response = agent.run(
    "What are the latest developments in fusion energy?"
)
print(response)

The agent will:

  1. Analyze the query to determine what information is needed
  2. Use DuckDuckGo to search for relevant content
  3. Synthesize the retrieved information into a coherent response
  4. Store the interaction in its memory for future reference

Custom Knowledge Base Tool

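For domain-specific applications, we often want to combine web search with our own knowledge base. Let’s create a custom tool that can query an index of technical documentation. The example uses a BM25 retriever, which ranks documents by keyword overlap with the query; a vector store could be substituted where embedding-based semantic search is needed.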

from langchain_community.retrievers import BM25Retriever
from smolagents import Tool

class RetrieverTool(Tool):
    name = "retriever"
    description = "Uses semantic search to retrieve the parts of transformers documentation that could be most relevant to answer your query."
    inputs = {
        "query": {
            "type": "string",
            "description": "The query to perform. This should be semantically close to your target documents. Use the affirmative form rather than a question.",
        }
    }
    output_type = "string"

    def __init__(self, docs, **kwargs):
        super().__init__(**kwargs)
        self.retriever = BM25Retriever.from_documents(
            docs, k=10
        )

    def forward(self, query: str) -> str:
        assert isinstance(query, str), "Your search query must be a string"

        docs = self.retriever.invoke(
            query,
        )
        return "\nRetrieved documents:\n" + "".join(
            [
                f"\n\n===== Document {str(i)} =====\n" + doc.page_content
                for i, doc in enumerate(docs)
            ]
        )

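# docs_processed is the list of chunked documents built by the
# knowledge-base preparation snippet later in this section.
retriever_tool = RetrieverTool(docs_processed)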

An agent equipped with both the web search tool and this retriever tool can:

  1. First check the documentation for relevant information
  2. Fall back to web search if needed
  3. Combine information from both sources
  4. Maintain conversation context through memory
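
The documentation-first, web-second behaviour listed above can be sketched without any agent framework. This is a minimal illustration rather than smolagents code: `search_docs` and `search_web` are hypothetical stand-ins for the retriever tool and the web search tool, and the two small corpora are invented for the example.

```python
from typing import List

# Hypothetical in-memory corpora standing in for a documentation index and the web.
DOCS = ["Pipelines wrap a model and a tokenizer", "Trainer handles the training loop"]
WEB = ["Fusion reactors reached net energy gain", "Trainer tips from a community blog"]


def keyword_search(corpus: List[str], query: str) -> List[str]:
    """Return corpus entries that share at least one lowercase word with the query."""
    words = set(query.lower().split())
    return [text for text in corpus if words & set(text.lower().split())]


def search_docs(query: str) -> List[str]:
    """Stand-in for the documentation retriever tool."""
    return keyword_search(DOCS, query)


def search_web(query: str) -> List[str]:
    """Stand-in for the web search tool."""
    return keyword_search(WEB, query)


def retrieve_with_fallback(query: str, min_hits: int = 1) -> dict:
    """Check the documentation first; add web results only if the docs come up short."""
    hits = [("docs", text) for text in search_docs(query)]
    if len(hits) < min_hits:
        hits += [("web", text) for text in search_web(query)]
    return {"query": query, "hits": hits}


result = retrieve_with_fallback("fusion energy")
```

Each hit is tagged with its source, so a later synthesis step can tell documentation apart from web results. Raising `min_hits` makes the function combine both sources more often.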

Enhanced Retrieval Capabilities

When building agentic RAG systems, the agent can employ sophisticated strategies like:

  1. Query Reformulation - Instead of using the raw user query, the agent can craft optimized search terms that better match the target documents
  2. Multi-Step Retrieval - The agent can perform multiple searches, using initial results to inform subsequent queries
  3. Source Integration - Information can be combined from multiple sources like web search and local documentation
  4. Result Validation - Retrieved content can be analyzed for relevance and accuracy before being included in responses

Effective agentic RAG systems require careful consideration of several key aspects. The agent should select between available tools based on the query type and context. Memory systems help maintain conversation history and avoid repetitive retrievals. Having fallback strategies ensures the system can still provide value even when primary retrieval methods fail. Additionally, implementing validation steps helps ensure the accuracy and relevance of retrieved information.
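
Query reformulation, multi-step retrieval, and result validation can be combined in one small loop. The sketch below is an assumption-laden illustration, not how smolagents works internally: in a real agent the language model would play the roles of `reformulate` and `is_relevant`, whereas here they are hypothetical toy functions over an invented lookup table.

```python
from typing import Callable, List


def multi_step_retrieve(
    question: str,
    search: Callable[[str], List[str]],
    reformulate: Callable[[str, List[str]], str],
    is_relevant: Callable[[str, str], bool],
    min_results: int = 1,
    max_steps: int = 3,
) -> List[str]:
    """Search repeatedly, keep validated results, and rewrite the query between steps."""
    query, kept = question, []
    for _ in range(max_steps):
        results = search(query)
        # Result validation: keep only new documents judged relevant to the question.
        kept += [doc for doc in results if doc not in kept and is_relevant(question, doc)]
        if len(kept) >= min_results:
            break
        # Query reformulation: the previous results are passed in to inform the next query.
        query = reformulate(query, results)
    return kept


# Toy stand-ins (hypothetical): a lookup table instead of a real search backend.
CORPUS = {
    "fine-tune model": ["Trainer fine-tunes a pretrained model", "Unrelated release notes"],
}
STOP_WORDS = {"how", "do", "i", "a", "the"}


def toy_search(query: str) -> List[str]:
    return CORPUS.get(query, [])


def toy_reformulate(query: str, results: List[str]) -> str:
    # Drop filler words to turn a question into a keyword query.
    return " ".join(word for word in query.split() if word not in STOP_WORDS)


def toy_is_relevant(question: str, doc: str) -> bool:
    return any(word in doc.lower() for word in question.lower().split() if len(word) > 3)


answer = multi_step_retrieve(
    "how do i fine-tune a model", toy_search, toy_reformulate, toy_is_relevant
)
```

The `max_steps` cap is the fallback in this sketch: if no relevant document is found, the loop stops and returns an empty list instead of searching forever.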

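The following snippet prepares the knowledge base used by the RetrieverTool defined earlier: it loads a documentation dataset, keeps only the transformers pages, and splits them into overlapping chunks stored in docs_processed.

import datasets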
from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.retrievers import BM25Retriever

knowledge_base = datasets.load_dataset("m-ric/huggingface_doc", split="train")
knowledge_base = knowledge_base.filter(lambda row: row["source"].startswith("huggingface/transformers"))

source_docs = [
    Document(page_content=doc["text"], metadata={"source": doc["source"].split("/")[1]})
    for doc in knowledge_base
]

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50,
    add_start_index=True,
    strip_whitespace=True,
    separators=["\n\n", "\n", ".", " ", ""],
)
docs_processed = text_splitter.split_documents(source_docs)

Code Agents

Code agents are specialized autonomous systems that handle coding tasks like analysis, generation, refactoring, and testing. These agents leverage domain knowledge about programming languages, build systems, and version control to enhance software development workflows.

Why Code Agents?

Code agents accelerate development by automating repetitive tasks while maintaining code quality. They excel at generating boilerplate code, performing systematic refactoring, and identifying potential issues through static analysis. The agents combine retrieval capabilities to access external documentation and repositories with function calling to execute concrete actions like creating files or running tests.

Building Blocks of a Code Agent

Code agents are built on specialized language models fine-tuned for code understanding. These models are augmented with development tools like linters, formatters, and compilers to interact with real-world environments. Through retrieval techniques, agents maintain contextual awareness by accessing documentation and code histories to align with organizational patterns and standards. Action-oriented functions enable agents to perform concrete tasks such as committing changes or initiating merge requests.
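
A development tool such as a linter can be exposed to an agent as an ordinary function. The sketch below is a hypothetical helper built only on Python's standard `ast` module; it stands in for a much richer static-analysis tool and reports functions that lack a docstring.

```python
import ast
from typing import List


def find_undocumented_functions(source: str) -> List[str]:
    """Return the names of functions in `source` that have no docstring."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and ast.get_docstring(node) is None
    ]


sample = '''
def documented():
    """Has a docstring."""
    return 1

def undocumented():
    return 2
'''

report = find_undocumented_functions(sample)
```

Because the helper takes a string and returns a list of names, it could be wrapped as an agent tool in the same way as the custom function shown later in this section.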

In the following example, we create a code agent that can search the web using DuckDuckGo much like the retrieval agent we built earlier.

from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())

agent.run("How many seconds would it take for a leopard at full speed to run through Pont des Arts?")

In the following example, we create a code agent that can get the travel time between two locations. Here, we use the @tool decorator to define a custom function that can be used as a tool.

from typing import Optional

from smolagents import CodeAgent, HfApiModel, tool

@tool
def get_travel_duration(start_location: str, destination_location: str, departure_time: Optional[int] = None) -> str:
    """Gets the travel time by public transit between two places.
    
    Args:
        start_location: the place from which you start your ride
        destination_location: the place of arrival
        departure_time: the departure time, provide only a `datetime.datetime` if you want to specify this
    """
    import googlemaps # All imports are placed within the function, to allow for sharing to Hub.
    import os

    gmaps = googlemaps.Client(os.getenv("GMAPS_API_KEY"))

    if departure_time is None:
        from datetime import datetime
        departure_time = datetime(2025, 1, 6, 11, 0)

    directions_result = gmaps.directions(
        start_location,
        destination_location,
        mode="transit",
        departure_time=departure_time
    )
    return directions_result[0]["legs"][0]["duration"]["text"]

agent = CodeAgent(tools=[get_travel_duration], model=HfApiModel(), additional_authorized_imports=["datetime"])

agent.run("Can you give me a nice one-day trip around Paris with a few locations and the times? Could be in the city or outside, but should fit in one day. I'm travelling only via public transportation.")

These examples are just the beginning of what you can do with code agents. You can learn more about how to build code agents in the smolagents documentation.

smolagents provides a lightweight framework for building code agents, with a core implementation of approximately 1,000 lines of code. The framework specializes in agents that write and execute Python code snippets, offering sandboxed execution for security. It supports both open-source and proprietary language models, making it adaptable to various development environments.
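
The idea behind an import allowlist, such as the `additional_authorized_imports` argument used earlier, can be illustrated with a static check. This is a simplified sketch under stated assumptions and not the smolagents implementation: the function name `unauthorized_imports` is hypothetical, and inspecting imports alone does not amount to a sandbox.

```python
import ast
from typing import Iterable, List


def unauthorized_imports(code: str, allowed: Iterable[str]) -> List[str]:
    """List top-level module names imported by `code` that are not in `allowed`."""
    allowed_set = set(allowed)
    found: List[str] = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have no module name and are reported as "".
            modules = [node.module or ""]
        else:
            continue
        for module in modules:
            root = module.split(".")[0]
            if root not in allowed_set and root not in found:
                found.append(root)
    return found


snippet = "import os\nfrom datetime import datetime\nimport math\n"
blocked = unauthorized_imports(snippet, allowed=["datetime", "math"])
```

A caller would refuse to execute generated code whenever the returned list is non-empty.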

Custom Function Agents

Custom Function Agents are AI agents that leverage specialized function calls (or “tools”) to perform tasks. Unlike general-purpose agents, Custom Function Agents focus on powering advanced workflows by integrating directly with your application’s logic. For example, you can expose database queries, system commands, or any custom utility as isolated functions for the agent to invoke.

Why Custom Function Agents?

  • Modular and Extensible: Instead of building one monolithic agent, you can design individual functions that represent discrete capabilities, making your architecture more extensible.
  • Fine-Grained Control: Developers can carefully control the agent’s actions by specifying exactly which functions are available and what parameters they accept.
  • Improved Reliability: By structuring each function with clear schemas and validations, you reduce errors and unexpected behaviors.

Basic Workflow

  1. Identify Functions
    Determine which tasks can be transformed into custom functions (e.g., file I/O, database queries, streaming data processing).

  2. Define the Interface
    Use a function signature or schema that precisely outlines each function’s inputs, outputs, and expected behavior. This enforces strong contracts between your agent and its environment.

  3. Register with the Agent
    Your agent needs to “learn” which functions are available. Typically, you pass metadata describing each function’s interface to the language model or agent framework.

  4. Invoke and Validate
    Once the agent selects a function to call, run the function with the provided arguments and validate the results. If valid, feed the results back to the agent for context to drive subsequent decisions.
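
Steps 2 and 4 can be sketched with the standard library alone. The helpers `describe` and `call_validated` below are hypothetical names, not part of any agent framework, and the sketch assumes every parameter is annotated with a plain class such as `str` or `int` and has no default value.

```python
import inspect
from typing import Any, Callable, Dict


def describe(function: Callable) -> Dict[str, Any]:
    """Build a small metadata record (name, description, parameter types) from a function."""
    params = inspect.signature(function).parameters
    return {
        "name": function.__name__,
        "description": inspect.getdoc(function) or "",
        "parameters": {name: param.annotation for name, param in params.items()},
    }


def call_validated(function: Callable, arguments: Dict[str, Any]) -> Any:
    """Check argument names and types against the signature, then call the function."""
    expected = describe(function)["parameters"]
    if set(arguments) != set(expected):
        raise ValueError(f"expected arguments {sorted(expected)}, got {sorted(arguments)}")
    for name, value in arguments.items():
        if not isinstance(value, expected[name]):
            raise TypeError(f"{name} must be {expected[name].__name__}")
    return function(**arguments)


def word_count(text: str) -> int:
    """Count whitespace-separated words in a text."""
    return len(text.split())


schema = describe(word_count)
count = call_validated(word_count, {"text": "agents call functions"})
```

The metadata record is what would be passed to the model so that it knows the function exists, while `call_validated` rejects malformed calls before they reach application code.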

Example

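Below is a simplified, self-contained example of custom function calls. The FunctionAgent class and the in-memory database are minimal stand-ins written for illustration; they are not part of smolagents or any other framework. The objective is to perform a user-defined search and retrieve relevant content:

import logging

class DatabaseError(Exception):
    """Raised when the database cannot serve a query."""

class InMemoryDatabase:
    """Stand-in for a real database client."""
    def __init__(self, articles):
        self.articles = articles

    def search(self, query):
        if not query.strip():
            raise DatabaseError("empty query")
        return [a for a in self.articles if query.lower() in a.lower()]

database = InMemoryDatabase(["Recent advances in AI agents", "Gardening basics"])

# Define a custom function with clear input/output types
def search_database(query: str) -> list:
    """
    Search the database for articles matching the query.

    Args:
        query (str): The search query string

    Returns:
        list: List of matching article results
    """
    try:
        results = database.search(query)
        return results
    except DatabaseError as e:
        logging.error(f"Database search failed: {e}")
        return []

class FunctionAgent:
    """Illustrative registry that maps function names to callables."""
    def __init__(self):
        self.functions = {}

    def register_function(self, name, function, description):
        self.functions[name] = {"function": function, "description": description}

    def invoke(self, name, *args):
        return self.functions[name]["function"](*args)

agent = FunctionAgent()

# Register the function with the agent
agent.register_function(
    name="search_database",
    function=search_database,
    description="Searches database for articles matching a query",
)

# Example usage
def process_search():
    query = "AI"
    results = agent.invoke("search_database", query)

    if results:
        for article in results:
            print(article)
    else:
        logging.info("No results found for query")

process_search()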

Further Reading

  • smolagents Blog - Learn about the latest advancements in AI agents and how they can be applied to custom function agents.
  • Building Good Agents - A comprehensive guide on best practices for developing reliable and effective custom function agents.