
Pydantic AI

Overview

Pydantic AI is a Python framework developed by the creators of Pydantic for building production-grade AI agents with emphasis on type safety, data validation, and structured outputs. Released in late 2024, the framework applies FastAPI’s developer experience philosophy to AI agent development, providing type-safe interactions with large language models through Pydantic’s validation system.

The framework addresses unpredictable AI outputs and runtime errors through static type checking, automatic schema validation, and structured response enforcement. Pydantic AI is model-agnostic, supporting 10+ LLM providers, and focuses on single-agent workflows rather than multi-agent orchestration.

Key technical components covered:

  • Agent architecture and dependency injection
  • Structured outputs and result types
  • Tools and function calling
  • Streaming responses with validation
  • Model provider abstraction
  • Result validators and custom validation
  • Testing with dependency mocking
  • Version history and API stability

Agent Architecture and Dependency Injection

Pydantic AI implements agents as type-safe systems interacting with LLMs through structured interfaces:

Agent initialization requires model specification and optional configurations:

from pydantic_ai import Agent
from pydantic import BaseModel
 
class CityInfo(BaseModel):
    name: str
    population: int
    country: str
 
agent = Agent(
    'openai:gpt-4',
    output_type=CityInfo,  # Enforce structured output (named result_type in early releases)
    system_prompt='Provide accurate city information',
)
 
result = agent.run_sync('Tell me about Paris')
print(result.output.population)  # Type-safe int access; the value depends on the model's answer

Dependency injection system provides modular, testable architecture for external resources:

Defining dependencies using dataclasses:

from dataclasses import dataclass
import httpx
 
@dataclass
class MyDeps:
    api_key: str
    http_client: httpx.AsyncClient
    database_url: str

Agent configuration with dependency types:

agent = Agent(
    'openai:gpt-4o',
    deps_type=MyDeps,  # Specify dependency structure
)

Accessing dependencies through RunContext:

from pydantic_ai import RunContext
 
@agent.system_prompt
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str:
    response = await ctx.deps.http_client.get(
        'https://api.example.com/config',
        headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
    )
    response.raise_for_status()
    return f'System configuration: {response.text}'

Running agent with dependencies:

async def main():
    async with httpx.AsyncClient() as client:
        deps = MyDeps(
            api_key='sk_test_key',
            http_client=client,
            database_url='postgresql://localhost/db'
        )
        result = await agent.run('Query here', deps=deps)
        print(result.output)

Dependency injection benefits:

  • Modularity: Decouples agent logic from external services
  • Testability: Easy substitution with mocks during testing
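  • Type safety: deps_type lets static type checkers verify every ctx.deps access; the dependency object itself is not validated at runtime
  • Context management: Resources such as HTTP clients are cleaned up by the caller's own context managers (for example async with httpx.AsyncClient())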
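
The testability point can be illustrated without any framework code. The following is a minimal standard-library sketch, not Pydantic AI's implementation: HttpClient, FakeClient, Deps, and build_system_prompt are hypothetical names chosen for this example. It shows logic that only sees an injected interface, so a test double can replace the real HTTP client.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


class HttpClient(Protocol):
    """Hypothetical minimal interface that the agent-side logic depends on."""

    async def get_text(self, url: str) -> str: ...


@dataclass
class Deps:
    api_key: str
    http_client: HttpClient


class FakeClient:
    """Test double: records requested URLs and returns a canned body."""

    def __init__(self, body: str) -> None:
        self.body = body
        self.requested: list[str] = []

    async def get_text(self, url: str) -> str:
        self.requested.append(url)
        return self.body


async def build_system_prompt(deps: Deps) -> str:
    # The logic touches only the injected interface, never a concrete HTTP library.
    config = await deps.http_client.get_text('https://api.example.com/config')
    return f'System configuration: {config}'


fake = FakeClient('mode=test')
prompt = asyncio.run(build_system_prompt(Deps(api_key='test_key', http_client=fake)))
print(prompt)
```

Because the fake satisfies the same interface, the same build_system_prompt function runs unchanged in tests and in production.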

RunContext object provides access to:

  • deps: Injected dependencies
  • usage: Token usage statistics
  • messages: Conversation history
  • model: Current model instance
  • retry: Number of retries so far for the current tool or validator

Structured Outputs and Result Types

Pydantic AI enforces structured outputs through output types (the output_type argument, named result_type in early releases), ensuring predictable, validated responses:

Simple structured output:

from pydantic import BaseModel
 
class CityLocation(BaseModel):
    city: str
    country: str
 
agent = Agent('google-gla:gemini-1.5-flash', output_type=CityLocation)
result = agent.run_sync('Where were the olympics held in 2012?')
print(result.output)  # CityLocation(city='London', country='United Kingdom')

Multiple output types using unions:

from pydantic_ai import Agent
 
agent = Agent(
    'openai:gpt-4o-mini',
    output_type=list[str] | list[int],  # Union type
    system_prompt='Extract either colors or sizes from shapes.',
)
 
result = agent.run_sync('red square, blue circle, green triangle')
print(result.output)  # ['red', 'blue', 'green']
 
result = agent.run_sync('square size 10, circle size 20, triangle size 30')
print(result.output)  # [10, 20, 30]

Complex nested structures:

from pydantic import BaseModel, Field
from pydantic_ai import Agent

class Address(BaseModel):
    street: str
    city: str
    postal_code: str

class Person(BaseModel):
    name: str = Field(description="Full name")
    age: int = Field(ge=0, le=150)
    email: str = Field(pattern=r'^[\w\.-]+@[\w\.-]+\.\w+$')
    addresses: list[Address]  # Built-in generic; no typing.List import needed

agent = Agent('openai:gpt-4', output_type=Person)
result = agent.run_sync('''
    Extract person info: John Doe, 35 years old, john@example.com,
    lives at 123 Main St, NYC 10001 and 456 Oak Ave, LA 90001
''')
print(result.output.addresses[0].city)  # 'NYC'

Validation enforcement: Pydantic automatically validates:

  • Type correctness (str, int, float, bool)
  • Field constraints (min/max values, regex patterns)
  • Required vs optional fields
  • Nested model structures
  • Custom validators

Schema generation: The agent generates a JSON schema from the Pydantic model and passes it to the LLM, by default as the parameter schema of an output tool call. Models with native structured output support can instead be asked to return JSON matching the schema directly. In both cases Pydantic validates the response before it is returned.
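
The schema that is derived from a model can be inspected with Pydantic's own model_json_schema() method. This sketch shows only the Pydantic side; how the schema is packaged for a particular provider is not shown here.

```python
import json

from pydantic import BaseModel


class CityLocation(BaseModel):
    city: str
    country: str


schema = CityLocation.model_json_schema()
print(json.dumps(schema, indent=2))
```

The resulting dictionary lists both fields under "properties" with type "string" and marks both as required.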

Tools and Function Calling

Tools enable agents to perform actions beyond text generation:

Basic tool definition:

from pydantic_ai import Agent, RunContext
 
agent = Agent('openai:gpt-4')
 
@agent.tool_plain  # tool_plain is for tools that do not take a RunContext argument
def get_weather(location: str) -> str:
    """Get current weather for a location."""
    # Implementation here
    return f"Weather in {location}: 72°F, sunny"
 
result = agent.run_sync('What is the weather in San Francisco?')

Tools with dependencies:

from dataclasses import dataclass
import httpx
 
@dataclass
class WeatherDeps:
    api_key: str
    http_client: httpx.AsyncClient
 
weather_agent = Agent('openai:gpt-4', deps_type=WeatherDeps)
 
@weather_agent.tool
async def get_weather(ctx: RunContext[WeatherDeps], location: str) -> dict:
    """Fetch real-time weather data."""
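    response = await ctx.deps.http_client.get(
        'https://api.weather.com/forecast',
        params={'location': location},  # Let httpx URL-encode the value
        headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
    )
    response.raise_for_status()  # Surface HTTP errors instead of parsing an error body
    return response.json()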

Tool with structured return type:

from pydantic import BaseModel
 
class WeatherData(BaseModel):
    temperature: float
    conditions: str
    humidity: int
    wind_speed: float
 
@agent.tool_plain
def get_detailed_weather(location: str) -> WeatherData:
    """Get detailed weather information."""
    return WeatherData(
        temperature=72.5,
        conditions='Partly cloudy',
        humidity=65,
        wind_speed=8.3
    )

Multiple tools:

@agent.tool_plain
def search_database(query: str) -> list[dict]:
    """Search internal database."""
    return [{"id": 1, "title": "Result"}]
 
@agent.tool_plain
def send_email(to: str, subject: str, body: str) -> bool:
    """Send email notification."""
    # Email sending logic
    return True
 
# Agent automatically selects appropriate tool based on task
result = agent.run_sync('Search for users and email the results')

Tool calling flow:

  1. Agent analyzes user input and available tools
  2. Agent generates tool call with parameters
  3. Framework validates parameters against tool signature
  4. Tool function executes with validated parameters
  5. Tool result returned to agent
  6. Agent incorporates result into response generation
  7. Process repeats if additional tool calls needed
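
The numbered flow above can be sketched as a plain loop. This is a simplified standard-library illustration under stated assumptions, not Pydantic AI's implementation: scripted_model is a hypothetical stand-in for the LLM, and validate_args performs only a basic signature and type check.

```python
import inspect
import json
from typing import Any, Callable


def get_weather(location: str) -> str:
    """Get current weather for a location."""
    return f'Weather in {location}: 72°F, sunny'


TOOLS: dict[str, Callable[..., Any]] = {'get_weather': get_weather}


def scripted_model(messages: list[dict[str, Any]]) -> dict[str, Any]:
    """Stand-in for an LLM: requests one tool call, then answers with its result."""
    tool_results = [m for m in messages if m['role'] == 'tool']
    if not tool_results:
        return {'tool': 'get_weather', 'args': json.dumps({'location': 'San Francisco'})}
    return {'text': f"Here is what I found: {tool_results[-1]['content']}"}


def validate_args(func: Callable[..., Any], raw_args: str) -> dict[str, Any]:
    """Check JSON arguments against the tool's signature (step 3)."""
    args = json.loads(raw_args)
    signature = inspect.signature(func)
    signature.bind(**args)  # Raises TypeError on missing or unexpected parameters
    for name, value in args.items():
        expected = signature.parameters[name].annotation
        if isinstance(expected, type) and not isinstance(value, expected):
            raise TypeError(f'{name} must be {expected.__name__}')
    return args


def run_agent(prompt: str, max_steps: int = 5) -> str:
    messages: list[dict[str, Any]] = [{'role': 'user', 'content': prompt}]
    for _ in range(max_steps):
        reply = scripted_model(messages)            # Steps 1-2: model picks a tool
        if 'text' in reply:
            return reply['text']                    # Step 6: final response
        func = TOOLS[reply['tool']]
        args = validate_args(func, reply['args'])   # Step 3: validate parameters
        result = func(**args)                       # Step 4: execute the tool
        messages.append({'role': 'tool', 'content': str(result)})  # Step 5
    raise RuntimeError('tool loop did not finish')  # Guard against endless loops (step 7)


answer = run_agent('What is the weather in San Francisco?')
print(answer)
```

The max_steps guard is a design choice of this sketch; it bounds step 7 so that a model which keeps requesting tools cannot loop forever.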

Function calling support varies by model:

  • OpenAI (GPT-4, GPT-3.5): Native function calling
  • Anthropic (Claude 3+): Native tool use
  • Google (Gemini Pro): Function declarations
  • Others: Prompt-based tool invocation

Streaming Responses with Validation

Pydantic AI validates streaming responses in real-time using partial validation:

Basic streaming:

from pydantic import BaseModel
 
class UserProfile(BaseModel):
    name: str
    age: int
    bio: str | None = None
 
agent = Agent('openai:gpt-4o', output_type=UserProfile)
 
async def stream_example():
    user_input = 'My name is Ben, I am 30 years old, love hiking.'
    async with agent.run_stream(user_input) as result:
        # stream_output() replaces the older stream() name
        async for profile in result.stream_output(debounce_by=0.1):
            print(profile)  # Prints progressively complete UserProfile

Streaming with validation:

from pydantic import ValidationError
 
async def validated_stream():
    async with agent.run_stream('user input') as result:
        async for message, is_last in result.stream_responses():
            try:
                profile = await result.validate_response_output(
                    message,
                    allow_partial=not is_last  # Partial validation until final
                )
                print(f"Valid partial: {profile}")
            except ValidationError as e:
                print(f"Validation error: {e}")

Text streaming methods:

stream_text(delta=True): Yields each new chunk

async with agent.run_stream('Tell me a story') as result:
    async for chunk in result.stream_text(delta=True):
        print(chunk, end='', flush=True)  # Prints only the new text, e.g. "Once", " upon", " a", ...

stream_text(delta=False): Yields accumulated text

async with agent.run_stream('Tell me a story') as result:
    async for text_so_far in result.stream_text(delta=False):
        print(f"\rCurrent: {text_so_far}", end='')  # Prints e.g. "Once", "Once upon", "Once upon a", ...
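
The relationship between the two modes can be shown with plain async generators. In this standard-library sketch, fake_deltas is a hypothetical stand-in for a model's text stream, and accumulate turns a delta stream into the accumulated form.

```python
import asyncio
from collections.abc import AsyncIterator


async def fake_deltas() -> AsyncIterator[str]:
    """Hypothetical stand-in for a model's text stream."""
    for chunk in ['Once ', 'upon ', 'a time']:
        yield chunk


async def accumulate(deltas: AsyncIterator[str]) -> AsyncIterator[str]:
    """Turn a delta stream into a stream of the full text so far."""
    text = ''
    async for chunk in deltas:
        text += chunk
        yield text


async def collect() -> list[str]:
    return [text async for text in accumulate(fake_deltas())]


snapshots = asyncio.run(collect())
print(snapshots)  # ['Once ', 'Once upon ', 'Once upon a time']
```

Delta mode suits appending to a terminal or socket, while accumulated mode suits re-rendering a whole UI element on each update.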

Partial validation mechanism:

  • Pydantic 2.10+ supports validating incomplete data structures (introduced as an experimental feature)
  • Missing required fields allowed during streaming
  • Final validation ensures complete, valid output
  • Enables early error detection during streaming

Streaming tool calls: Agent can invoke tools mid-stream, pause output, execute tool, resume streaming with tool results incorporated.

Performance optimization: Debouncing reduces update frequency preventing excessive renders:

async for item in result.stream_output(debounce_by=0.1):  # Update max every 100ms
    display(item)
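
To make the effect of rate-limiting updates concrete, here is a simplified standard-library sketch. It is a throttle that keeps the latest item, not the framework's own debounce implementation, and throttle_latest and numbers are hypothetical names.

```python
import asyncio
import time
from collections.abc import AsyncIterator
from typing import TypeVar

T = TypeVar('T')


async def throttle_latest(source: AsyncIterator[T], interval: float) -> AsyncIterator[T]:
    """Yield at most one item per `interval` seconds, always ending with the final item."""
    last_emit = float('-inf')
    latest: T | None = None
    has_pending = False
    async for item in source:
        now = time.monotonic()
        if now - last_emit >= interval:
            last_emit = now
            has_pending = False
            yield item
        else:
            # Too soon since the last update: remember only the newest item.
            latest = item
            has_pending = True
    if has_pending:
        yield latest  # Flush so the consumer always sees the final state


async def numbers() -> AsyncIterator[int]:
    for n in range(5):
        yield n


async def collect(interval: float) -> list[int]:
    return [n async for n in throttle_latest(numbers(), interval)]


print(asyncio.run(collect(10.0)))
print(asyncio.run(collect(0.0)))
```

With a long interval only the first and last items reach the consumer; with an interval of zero every item passes through.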

Model Provider Abstraction

Pydantic AI supports 10+ model providers through unified interface:

OpenAI configuration:

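from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel  # Named OpenAIModel in earlier releases
from pydantic_ai.providers.openai import OpenAIProvider
 
agent = Agent(
    # The API key goes through the provider (or is read from OPENAI_API_KEY)
    OpenAIChatModel('gpt-4o', provider=OpenAIProvider(api_key='sk-...')),
    # Or shorthand:
    # 'openai:gpt-4o'
)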

Anthropic configuration:

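from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider
 
agent = Agent(
    # The API key goes through the provider (or is read from ANTHROPIC_API_KEY)
    AnthropicModel('claude-3-5-sonnet-latest', provider=AnthropicProvider(api_key='sk-ant-...')),
    # Or shorthand:
    # 'anthropic:claude-3-5-sonnet-latest'
)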

Google Gemini configuration:

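from pydantic_ai.models.google import GoogleModel  # Supersedes the older GeminiModel
from pydantic_ai.providers.google import GoogleProvider
 
agent = Agent(
    GoogleModel('gemini-2.0-flash-exp', provider=GoogleProvider(api_key='...')),
    # Or shorthand:
    # 'google-gla:gemini-2.0-flash-exp'
)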

Supported providers:

  • OpenAI (GPT-4, GPT-3.5, GPT-4o)
  • Anthropic (Claude 3 Opus, Sonnet, Haiku)
  • Google (Gemini Pro, Flash, Ultra)
  • DeepSeek
  • Grok (xAI)
  • Cohere
  • Mistral
  • Perplexity
  • Groq
  • Together AI

Model settings control generation behavior:

from pydantic_ai import Agent
from pydantic_ai.settings import ModelSettings
 
agent = Agent(
    'openai:gpt-4',
    model_settings=ModelSettings(
        temperature=0.7,        # Randomness (0-2)
        max_tokens=1000,        # Generation limit
        top_p=0.9,             # Nucleus sampling
        presence_penalty=0.1,   # Reduce repetition
        frequency_penalty=0.1,  # Penalize frequent tokens
    )
)

Provider switching requires only model name change:

# Development with fast model
dev_agent = Agent('openai:gpt-3.5-turbo')
 
# Production with capable model
prod_agent = Agent('anthropic:claude-3-5-sonnet-latest')
 
# Both agents use identical code, tools, and logic

Custom model implementation: Developers can subclass the Model abstract base class (pydantic_ai.models.Model) for providers that are not built in, enabling Pydantic AI usage with any LLM.

Result Validators and Custom Validation

Output validators (registered with @agent.output_validator, called result validators in early releases) provide custom validation beyond Pydantic’s built-in mechanisms:

Output validator for complex validation:

from pydantic import BaseModel
from pydantic_ai import Agent, RunContext, ModelRetry

class QueryError(Exception):
    """Placeholder for the database driver's error type (assumption)."""

class DatabaseConn:
    """Placeholder connection; a real one would run the statement (assumption)."""
    async def execute(self, sql: str) -> None: ...

class Success(BaseModel):
    sql_query: str

class InvalidRequest(BaseModel):
    error_message: str

Response = Success | InvalidRequest

agent = Agent(
    'google-gla:gemini-1.5-flash',
    deps_type=DatabaseConn,  # Gives ctx.deps its type inside the validator
    output_type=Response,
    system_prompt='Generate PostgreSQL SQL queries.',
)

@agent.output_validator
async def validate_output(ctx: RunContext[DatabaseConn], output: Response) -> Response:
    if isinstance(output, InvalidRequest):
        return output  # Accept error responses as-is

    try:
        # Validate SQL by explaining it
        await ctx.deps.execute(f'EXPLAIN {output.sql_query}')
    except QueryError as e:
        # Retry with feedback
        raise ModelRetry(f'Invalid query: {e}') from e

    return output

ModelRetry mechanism: Raising a ModelRetry exception requests a new generation with feedback. The model receives the error message and attempts an improved response. A configurable retry limit prevents infinite loops.
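
The retry-with-feedback idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the framework's internals: RetryWithFeedback stands in for ModelRetry, and scripted_generate is a hypothetical model that corrects its query once it has seen feedback.

```python
class RetryWithFeedback(Exception):
    """Hypothetical stand-in for ModelRetry: carries feedback for the next attempt."""


def scripted_generate(feedback: list[str]) -> str:
    """Stand-in for the model: fixes its query once it has received feedback."""
    return 'SELECT * FROM users' if feedback else 'SELEC * FROM users'


def validate(query: str) -> str:
    if not query.upper().startswith('SELECT '):
        raise RetryWithFeedback(f'Invalid query: {query!r} is not a SELECT statement')
    return query


def run_with_retries(max_retries: int = 2) -> tuple[str, int]:
    feedback: list[str] = []
    for attempt in range(max_retries + 1):
        candidate = scripted_generate(feedback)
        try:
            return validate(candidate), attempt
        except RetryWithFeedback as exc:
            feedback.append(str(exc))  # Sent back to the model on the next attempt
    raise RuntimeError(f'Exceeded maximum retries ({max_retries})')


query, retries_used = run_with_retries()
print(query, retries_used)
```

The first attempt fails validation, the feedback is recorded, and the second attempt succeeds, so one retry is used. With max_retries=0 the loop gives up immediately instead of looping forever.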

Field validators within Pydantic models:

from pydantic import BaseModel, field_validator
 
class UserInput(BaseModel):
    username: str
    age: int
 
    @field_validator('username')
    @classmethod
    def username_must_contain_space(cls, v: str) -> str:
        if ' ' not in v:
            raise ValueError('must contain a space')
        return v.title()
    
    @field_validator('age')
    @classmethod
    def age_reasonable(cls, v: int) -> int:
        if not 0 <= v <= 150:
            raise ValueError('age must be 0-150')
        return v

Model validators for cross-field validation:

from datetime import datetime
from pydantic import BaseModel, model_validator

class DateRange(BaseModel):
    start_date: str
    end_date: str

    @model_validator(mode='after')
    def check_dates(self) -> 'DateRange':
        # Runs after field validation, so both fields are available
        start = datetime.fromisoformat(self.start_date)
        end = datetime.fromisoformat(self.end_date)
        if start > end:
            raise ValueError('start_date must not be after end_date')
        return self

Annotated validators for reusable validation:

from typing import Annotated
from pydantic import AfterValidator, BaseModel
 
def is_even(value: int) -> int:
    if value % 2 != 0:
        raise ValueError(f'{value} is not even')
    return value
 
EvenNumber = Annotated[int, AfterValidator(is_even)]
 
class Model(BaseModel):
    number: EvenNumber  # Automatically validates as even

Testing with Dependency Mocking

Pydantic AI’s dependency injection enables comprehensive testing:

Overriding dependencies for tests:

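from unittest.mock import AsyncMock, MagicMock
import pytest
from pydantic_ai.models.test import TestModel

@pytest.mark.asyncio
async def test_agent_with_mock():
    # httpx responses expose a synchronous .json(), so the response is a MagicMock
    mock_response = MagicMock()
    mock_response.json.return_value = {
        'temperature': 75,
        'conditions': 'sunny'
    }
    mock_client = AsyncMock()
    mock_client.get.return_value = mock_response

    # Uses weather_agent and WeatherDeps from the tools section
    test_deps = WeatherDeps(api_key='test_key', http_client=mock_client)

    # Override the dependencies and the model so no real API is called;
    # TestModel calls each registered tool and echoes the results as JSON text
    with weather_agent.override(deps=test_deps, model=TestModel()):
        result = await weather_agent.run('What is the weather?')

    # Verify mock was called correctly
    mock_client.get.assert_called_once()
    assert 'sunny' in result.output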

Mocking LLM responses:

from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

# Define the canned response up front; TestModel never calls a real LLM
test_model = TestModel(custom_output_text='Paris is the capital of France')
test_agent = Agent(test_model)

result = test_agent.run_sync('What is the capital of France?')
assert 'Paris' in result.output

Testing tool calls:

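@pytest.mark.asyncio
async def test_tool_invocation():
    tool_called = False
    test_agent = Agent(TestModel())  # TestModel calls every registered tool

    @test_agent.tool_plain
    def test_tool(param: str) -> str:
        nonlocal tool_called
        tool_called = True
        return f"Processed: {param}"

    result = await test_agent.run('Use test tool with param "hello"')

    assert tool_called
    # TestModel generates its own arguments, so check only the tool's prefix
    assert 'Processed:' in result.output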

Testing validation errors:

import pytest
from pydantic import ValidationError

def test_validation_failure():
    # Validate the output model directly. During an agent run, validation
    # failures are sent back to the model as retries, and exhausting them
    # raises UnexpectedModelBehavior rather than ValidationError.
    with pytest.raises(ValidationError) as exc_info:
        UserProfile.model_validate({'name': 'Ben', 'age': 'not a number'})

    assert 'age' in str(exc_info.value)  # Check specific field error

Snapshot testing for regression prevention:

async def test_agent_output_snapshot(snapshot):
    result = await agent.run('Standard test prompt')
    snapshot.assert_match(result.output.model_dump_json(), 'output.json')

Version History and API Stability

v1.0.0 (September 4, 2025): Major stable release with API stability commitment.

  • Human-in-the-Loop Tool Approval: Agents request user input preventing autonomous errors
  • Durable Execution with Temporal: Workflows resume after interruptions seamlessly
  • API Stability Commitment: Backward compatibility guaranteed until April 2026 minimum

v0.3.2 (June 21, 2025): MCP integration enhancement.

  • MCP Sampling Support: Model Context Protocol sampling for enhanced integration

v0.3.0 (June 18, 2025): Advanced reasoning features.

  • Thinking Parts Support: Agents process and manage complex reasoning steps explicitly

v0.2.20 (June 18, 2025): MCP server improvements.

  • MCP Tool Call Handling: Enhanced McpError handling from MCP tool calls
  • Hooks for Tool Modification: Modify tool arguments and return values during MCP integration

v0.0.51 (April 3, 2025): Runtime improvements.

  • Deno Runtime for MCP: Switched mcp-run-python server to Deno for performance and security

API stability promise: From v1.0.0, breaking changes require major version increment. Framework follows semantic versioning strictly. Deprecation warnings provided at least one minor version before removal.

Backward compatibility: Code written for v1.0.0 continues working with v1.x releases. Migration guides provided for major version upgrades.

Release cadence: Minor releases every 2-3 months with features and improvements. Patch releases as needed for bug fixes and security updates.

Pydantic AI v1.0.0 marks transition from experimental framework to production-ready platform with enterprise support and stability commitments.