AI agents are autonomous systems that can understand user requests, break them down into steps, and execute actions to accomplish tasks. They combine language models with tools and external functions to interact with their environment. This module covers how to build effective agents using the smolagents library, which provides a lightweight framework for creating capable AI agents.
Module Overview
Building effective agents requires understanding three key components. First, retrieval capabilities allow agents to access and use relevant information from various sources. Second, function calling enables agents to take concrete actions in their environment. Finally, domain-specific knowledge and tooling equip agents for specialized tasks like code manipulation.
Contents
1️⃣ Retrieval Agents
Retrieval agents combine models with knowledge bases. These agents can search and synthesize information from multiple sources, leveraging vector stores for efficient retrieval and implementing RAG (Retrieval Augmented Generation) patterns. They are great at combining web search with custom knowledge bases while maintaining conversation context through memory systems. The module covers implementation strategies including fallback mechanisms for robust information retrieval.
2️⃣ Code Agents
Code agents are specialized autonomous systems designed for software development tasks. These agents excel at analyzing and generating code, performing automated refactoring, and integrating with development tools. The module covers best practices for building code-focused agents that can understand programming languages, work with build systems, and interact with version control while maintaining high code quality standards.
3️⃣ Custom Functions
Custom function agents extend basic AI capabilities through specialized function calls. This module explores how to design modular and extensible function interfaces that integrate directly with your application’s logic. You’ll learn to implement proper validation and error handling while creating reliable function-driven workflows. The focus is on building simple systems where agents can predictably interact with external tools and services.
Exercise Notebooks
Title | Description | Exercise | Link | Colab |
---|---|---|---|---|
Building a Research Agent | Create an agent that can perform research tasks using retrieval and custom functions | 🐢 Build a simple RAG agent 🐕 Add custom search functions 🦁 Create a full research assistant | Notebook |
Resources
- smolagents Documentation - Official docs for the smolagents library
- Building Effective Agents - Anthropic's article on agent architectures and design patterns
- Agent Guidelines - Best practices for building reliable agents
- LangChain Agents - Additional examples of agent implementations
- Function Calling Guide - Understanding function calling in LLMs
- RAG Best Practices - Guide to implementing effective RAG
Retrieval Agents
Agentic RAG (Retrieval Augmented Generation) combines the power of autonomous agents with knowledge retrieval capabilities. While traditional RAG systems simply use an LLM to answer queries based on retrieved information, agentic RAG takes this further by allowing the system to intelligently control its own retrieval and response process.
Traditional RAG has two key limitations: it performs only a single retrieval step, and it relies on direct semantic similarity with the user query, which can miss relevant information. Agentic RAG addresses both by letting the agent formulate its own search queries, critique the results, and perform as many retrieval steps as it needs.
Basic Retrieval with DuckDuckGo
Let’s start by building a simple agent that can search the web using DuckDuckGo. This agent will be able to answer questions by retrieving relevant information and synthesizing responses.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel
# Initialize the search tool
search_tool = DuckDuckGoSearchTool()
# Initialize the model
model = HfApiModel()
agent = CodeAgent(
    model=model,
    tools=[search_tool],
)
# Example usage
response = agent.run(
    "What are the latest developments in fusion energy?"
)
print(response)
The agent will:
- Analyze the query to determine what information is needed
- Use DuckDuckGo to search for relevant content
- Synthesize the retrieved information into a coherent response
- Store the interaction in its memory for future reference
Custom Knowledge Base Tool
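For domain-specific applications, we often want to combine web search with our own knowledge base. Let’s create a custom tool that can query an index of technical documentation. The example below uses a BM25 retriever, which ranks documents by keyword overlap; a vector store could be substituted when embedding-based semantic search is needed.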
from langchain_community.retrievers import BM25Retriever
from smolagents import Tool
class RetrieverTool(Tool):
    name = "retriever"
    description = "Uses semantic search to retrieve the parts of transformers documentation that could be most relevant to answer your query."
    inputs = {
        "query": {
            "type": "string",
            "description": "The query to perform. This should be semantically close to your target documents. Use the affirmative form rather than a question.",
        }
    }
    output_type = "string"
    def __init__(self, docs, **kwargs):
        super().__init__(**kwargs)
        # Index the documents with BM25 and return the 10 best matches per query
        self.retriever = BM25Retriever.from_documents(
            docs, k=10
        )
    def forward(self, query: str) -> str:
        assert isinstance(query, str), "Your search query must be a string"
        docs = self.retriever.invoke(
            query,
        )
        # Concatenate the retrieved chunks into one labeled string for the agent
        return "\nRetrieved documents:\n" + "".join(
            [
                f"\n\n===== Document {str(i)} =====\n" + doc.page_content
                for i, doc in enumerate(docs)
            ]
        )
# `docs_processed` is the list of chunked documents built by the data preparation code later in this section
retriever_tool = RetrieverTool(docs_processed)
An agent that is given both this retriever tool and the web search tool can:
- First check the documentation for relevant information
- Fall back to web search if needed
- Combine information from both sources
- Maintain conversation context through memory
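The fallback behavior in the list above can be sketched without any agent framework. This is a minimal illustration rather than smolagents code: `search_docs` and `search_web` are hypothetical stand-ins for the retriever tool and a web search tool, and the `min_hits` threshold is an assumed heuristic for deciding when the documentation lookup was good enough.

```python
from typing import Callable, List

Retriever = Callable[[str], List[str]]


def retrieve_with_fallback(
    query: str, primary: Retriever, fallback: Retriever, min_hits: int = 1
) -> List[str]:
    """Query the primary source first; consult the fallback only if it returns too little."""
    try:
        hits = primary(query)
    except Exception:
        # A failing primary source should not stop retrieval altogether
        hits = []
    if len(hits) >= min_hits:
        return hits
    # Combine both sources, keeping order and dropping duplicates
    combined = list(hits)
    for item in fallback(query):
        if item not in combined:
            combined.append(item)
    return combined


# Hypothetical stand-ins for a documentation retriever and a web search tool
def search_docs(query: str) -> List[str]:
    corpus = {"tokenizer": ["docs: Tokenizers split text into tokens."]}
    return [p for key, passages in corpus.items() if key in query.lower() for p in passages]


def search_web(query: str) -> List[str]:
    return [f"web: top result for '{query}'"]


print(retrieve_with_fallback("How does a tokenizer work?", search_docs, search_web))
print(retrieve_with_fallback("fusion energy", search_docs, search_web))
```

In an agent, the model itself usually decides when to switch tools; a hard-coded rule like this one is mainly useful as a safety net around that decision.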
Enhanced Retrieval Capabilities
When building agentic RAG systems, the agent can employ sophisticated strategies like:
- Query Reformulation - Instead of using the raw user query, the agent can craft optimized search terms that better match the target documents
- Multi-Step Retrieval - The agent can perform multiple searches, using initial results to inform subsequent queries
- Source Integration - Information can be combined from multiple sources like web search and local documentation
- Result Validation - Retrieved content can be analyzed for relevance and accuracy before being included in responses
Effective agentic RAG systems require careful consideration of several key aspects. The agent should select between available tools based on the query type and context. Memory systems help maintain conversation history and avoid repetitive retrievals. Having fallback strategies ensures the system can still provide value even when primary retrieval methods fail. Additionally, implementing validation steps helps ensure the accuracy and relevance of retrieved information.
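Query reformulation, multi-step retrieval, and result validation can be illustrated with a small framework-free sketch. Everything here is an assumption made for illustration: `toy_search` stands in for a real retriever, the list of queries stands in for reformulations an LLM would generate, and the keyword-overlap score is a deliberately crude relevance check.

```python
from typing import Callable, List

CORPUS = [
    "the trainer class handles the training loop",
    "pipelines wrap models for inference",
    "bananas are rich in potassium",
]


def toy_search(query: str) -> List[str]:
    """Hypothetical retriever: return documents sharing at least one word with the query."""
    words = set(query.lower().split())
    return [doc for doc in CORPUS if words & set(doc.split())]


def keyword_overlap(query: str, passage: str) -> float:
    """Fraction of query words found in the passage (a crude relevance score)."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    return len(query_words & passage_words) / len(query_words) if query_words else 0.0


def multi_step_retrieve(
    queries: List[str],
    search: Callable[[str], List[str]],
    min_score: float = 0.5,
    max_results: int = 3,
) -> List[str]:
    """Try each reformulated query in turn, keeping validated, de-duplicated passages."""
    kept: List[str] = []
    for query in queries:
        for passage in search(query):
            # Validation step: discard weak matches and duplicates
            if passage not in kept and keyword_overlap(query, passage) >= min_score:
                kept.append(passage)
        if len(kept) >= max_results:
            break
    return kept[:max_results]


# The raw question matches poorly; the reformulated query finds the right passage
print(multi_step_retrieve(["how is the model trained", "trainer training loop"], toy_search))
```

The raw user question shares only the word "the" with the relevant passage, so validation rejects it; the reformulated query, phrased closer to the target text, passes.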
# Data preparation: build `docs_processed`, the chunked documents passed to RetrieverTool
import datasets
from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.retrievers import BM25Retriever
# Load the Hugging Face documentation dataset and keep only the transformers docs
knowledge_base = datasets.load_dataset("m-ric/huggingface_doc", split="train")
knowledge_base = knowledge_base.filter(lambda row: row["source"].startswith("huggingface/transformers"))
source_docs = [
    Document(page_content=doc["text"], metadata={"source": doc["source"].split("/")[1]})
    for doc in knowledge_base
]
# Split each document into chunks of up to 500 characters with a 50-character overlap
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50,
    add_start_index=True,
    strip_whitespace=True,
    separators=["\n\n", "\n", ".", " ", ""],
)
docs_processed = text_splitter.split_documents(source_docs)
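To make the `chunk_size` and `chunk_overlap` settings concrete, here is a simplified sliding-window splitter. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter also tries to break on the listed separators); the `split_with_overlap` helper is a hypothetical function that only shows how consecutive chunks share characters.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into fixed-size windows where each window repeats the tail of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

With the values used above (500 and 50), the window advances 450 characters at a time, so a sentence that straddles a chunk boundary is more likely to appear whole in at least one chunk.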
Code Agents
Code agents are specialized autonomous systems that handle coding tasks like analysis, generation, refactoring, and testing. These agents leverage domain knowledge about programming languages, build systems, and version control to enhance software development workflows.
Why Code Agents?
Code agents accelerate development by automating repetitive tasks while maintaining code quality. They excel at generating boilerplate code, performing systematic refactoring, and identifying potential issues through static analysis. The agents combine retrieval capabilities to access external documentation and repositories with function calling to execute concrete actions like creating files or running tests.
Building Blocks of a Code Agent
Code agents are built on specialized language models fine-tuned for code understanding. These models are augmented with development tools like linters, formatters, and compilers to interact with real-world environments. Through retrieval techniques, agents maintain contextual awareness by accessing documentation and code histories to align with organizational patterns and standards. Action-oriented functions enable agents to perform concrete tasks such as committing changes or initiating merge requests.
In the following example, we create a code agent that can search the web using DuckDuckGo much like the retrieval agent we built earlier.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
agent.run("How many seconds would it take for a leopard at full speed to run through Pont des Arts?")
In the following example, we create a code agent that can get the travel time between two locations. Here, we use the @tool decorator to define a custom function that can be used as a tool.
from typing import Optional
from smolagents import CodeAgent, HfApiModel, tool
# Requires the `googlemaps` package and a GMAPS_API_KEY environment variable
@tool
def get_travel_duration(start_location: str, destination_location: str, departure_time: Optional[int] = None) -> str:
    """Gets the travel time by public transit between two places.

    Args:
        start_location: the place from which you start your ride
        destination_location: the place of arrival
        departure_time: the departure time, provide only a `datetime.datetime` if you want to specify this
    """
    import googlemaps  # All imports are placed within the function, to allow for sharing to Hub.
    import os
    gmaps = googlemaps.Client(os.getenv("GMAPS_API_KEY"))
    if departure_time is None:
        from datetime import datetime
        # Fixed default departure time, used when the agent does not supply one
        departure_time = datetime(2025, 1, 6, 11, 0)
    directions_result = gmaps.directions(
        start_location,
        destination_location,
        mode="transit",
        departure_time=departure_time
    )
    # Return the human-readable duration of the first leg of the first route
    return directions_result[0]["legs"][0]["duration"]["text"]
agent = CodeAgent(tools=[get_travel_duration], model=HfApiModel(), additional_authorized_imports=["datetime"])
agent.run("Can you give me a nice one-day trip around Paris with a few locations and the times? Could be in the city or outside, but should fit in one day. I'm travelling only via public transportation.")
These examples are just the beginning of what you can do with code agents. You can learn more about how to build code agents in the smolagents documentation.
smolagents provides a lightweight framework for building code agents, with a core implementation of approximately 1,000 lines of code. The framework specializes in agents that write and execute Python code snippets, offering sandboxed execution for security. It supports both open-source and proprietary language models, making it adaptable to various development environments.
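Because a code agent runs Python written by a model, an executor typically restricts which modules that code may import; an allowlist is one common approach. The sketch below illustrates the idea with the standard `ast` module. It is not smolagents' implementation: the `find_unauthorized_imports` helper and the example allowlist are assumptions made for illustration, and a static check like this is only one layer of a sandbox, not a complete one.

```python
import ast
from typing import Iterable, List


def find_unauthorized_imports(code: str, allowed: Iterable[str]) -> List[str]:
    """Return the modules imported by `code` whose top-level package is not allowlisted."""
    allowed_roots = set(allowed)
    violations: List[str] = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have no module name; they are reported as ""
            modules = [node.module or ""]
        else:
            continue
        for module in modules:
            root = module.split(".")[0]
            if root not in allowed_roots and module not in violations:
                violations.append(module)
    return violations


snippet = "import math\nimport os\nfrom datetime import datetime\n"
print(find_unauthorized_imports(snippet, allowed=["math", "datetime"]))
# ['os']
```

An executor built on this idea would refuse to run the snippet until `os` is either removed or explicitly authorized.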
Further Reading
- smolagents Blog - Introduction to smolagents and code interactions
- smolagents: Building Good Agents - Best practices for reliable agents
- Building Effective Agents - Anthropic - Agent design principles
Custom Function Agents
Custom Function Agents are AI agents that leverage specialized function calls (or “tools”) to perform tasks. Unlike general-purpose agents, Custom Function Agents focus on powering advanced workflows by integrating directly with your application’s logic. For example, you can expose database queries, system commands, or any custom utility as isolated functions for the agent to invoke.
Why Custom Function Agents?
- Modular and Extensible: Instead of building one monolithic agent, you can design individual functions that represent discrete capabilities, making your architecture more extensible.
- Fine-Grained Control: Developers can carefully control the agent’s actions by specifying exactly which functions are available and what parameters they accept.
- Improved Reliability: By structuring each function with clear schemas and validations, you reduce errors and unexpected behaviors.
Basic Workflow
- Identify Functions: Determine which tasks can be transformed into custom functions (e.g., file I/O, database queries, streaming data processing).
- Define the Interface: Use a function signature or schema that precisely outlines each function’s inputs, outputs, and expected behavior. This enforces strong contracts between your agent and its environment.
- Register with the Agent: Your agent needs to “learn” which functions are available. Typically, you pass metadata describing each function’s interface to the language model or agent framework.
- Invoke and Validate: Once the agent selects a function to call, run the function with the provided arguments and validate the results. If valid, feed the results back to the agent for context to drive subsequent decisions.
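The four steps above can be sketched with a small registry built on the standard `inspect` module. This is an illustrative design under stated assumptions, not the API of smolagents or any other framework: `FunctionRegistry`, its methods, and the `word_count` function are all hypothetical names.

```python
import inspect
from typing import Any, Callable, Dict


class FunctionRegistry:
    """Hypothetical registry: stores functions, describes them, and validates calls."""

    def __init__(self) -> None:
        self._functions: Dict[str, Callable[..., Any]] = {}

    def register(self, func: Callable[..., Any]) -> Callable[..., Any]:
        """Register a function under its own name (usable as a decorator)."""
        self._functions[func.__name__] = func
        return func

    def describe(self) -> Dict[str, Dict[str, str]]:
        """Metadata to hand to the model: parameter names and annotated types."""
        return {
            name: {
                p.name: getattr(p.annotation, "__name__", str(p.annotation))
                for p in inspect.signature(func).parameters.values()
            }
            for name, func in self._functions.items()
        }

    def invoke(self, name: str, **kwargs: Any) -> Any:
        """Validate arguments against the signature, then call the function."""
        if name not in self._functions:
            raise KeyError(f"Unknown function: {name}")
        func = self._functions[name]
        signature = inspect.signature(func)
        bound = signature.bind(**kwargs)  # TypeError on missing or unexpected arguments
        for arg_name, value in bound.arguments.items():
            expected = signature.parameters[arg_name].annotation
            # Only check plain classes such as str or int; skip unannotated parameters
            if (
                expected is not inspect.Parameter.empty
                and isinstance(expected, type)
                and not isinstance(value, expected)
            ):
                raise TypeError(f"{arg_name} must be {expected.__name__}")
        return func(*bound.args, **bound.kwargs)


registry = FunctionRegistry()


@registry.register
def word_count(text: str) -> int:
    """Count the whitespace-separated words in a text."""
    return len(text.split())


print(registry.describe())
print(registry.invoke("word_count", text="agents call tools"))
```

Rejecting a bad call before the function runs (for example `text=42`) gives the agent a clear error message to react to, instead of a failure deep inside application code.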
Example
Below is a simplified example demonstrating how custom function calls might look in pseudocode. The objective is to perform a user-defined search and retrieve relevant content:
import logging
# NOTE: `database`, `DatabaseError`, and `agent` are placeholders for your own
# data layer and agent framework; this is pseudocode, not a specific library API.
# Define a custom function with clear input/output types
def search_database(query: str) -> list:
    """
    Search the database for articles matching the query.
    Args:
        query (str): The search query string
    Returns:
        list: List of matching article results
    """
    try:
        results = database.search(query)
        return results
    except DatabaseError as e:
        # Log the failure and return an empty list so the agent can continue
        logging.error(f"Database search failed: {e}")
        return []
# Register the function with the agent
agent.register_function(
    name="search_database",
    function=search_database,
    description="Searches database for articles matching a query"
)
# Example usage
def process_search():
    query = "Find recent articles on AI"
    results = agent.invoke("search_database", query)
    if results:
        agent.process_results(results)
    else:
        logging.info("No results found for query")
Further Reading
- smolagents Blog - Learn about the latest advancements in AI agents and how they can be applied to custom function agents.
- Building Good Agents - A comprehensive guide on best practices for developing reliable and effective custom function agents.