
Agent

```python
from ai_infra.llm import Agent
```

Module: `ai_infra.llm`
Extends: `BaseLLM`

Agent-oriented interface (tool calling, streaming updates, fallbacks). The `Agent` class provides a simple API for running LLM agents with tools. Tools can be plain Python functions, LangChain tools, or MCP tools.

Example - Basic usage:

```python
def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"Weather in {city}: Sunny, 72°F"

# Simple usage with tools
agent = Agent(tools=[get_weather])
result = agent.run("What's the weather in NYC?")
```

Example - With session memory (conversations persist):

```python
from ai_infra.llm.session import memory

agent = Agent(tools=[...], session=memory())

# Conversation 1 - remembered
agent.run("I'm Bob", session_id="user-123")
agent.run("What's my name?", session_id="user-123")  # Knows "Bob"

# Different session - fresh start
agent.run("What's my name?", session_id="user-456")  # Doesn't know
```

Example - Pause and resume (human-in-the-loop, HITL):

```python
from ai_infra.llm.session import memory

agent = Agent(
    tools=[dangerous_tool],
    session=memory(),
    pause_before=["dangerous_tool"],  # Pause before this tool
)

result = agent.run("Delete file.txt", session_id="task-1")

if result.paused:
    # Show the user what's pending, get approval
    print(result.pending_action)

    # Resume with the decision
    result = agent.resume(session_id="task-1", approved=True)
```
Example - Production with Postgres:

```python
from ai_infra.llm.session import postgres

agent = Agent(
    tools=[...],
    session=postgres("postgresql://..."),
)
# Sessions persist across restarts
```

Example - Human approval (sync, per-request):

```python
agent = Agent(
    tools=[dangerous_tool],
    require_approval=True,  # Console prompt for approval
)
```
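
Beyond `True`, `require_approval` also accepts a list of tool names or a callable (see the parameter table below). A minimal sketch, assuming the callable returns `True` when approval is required and reusing `dangerous_tool` and `get_weather` from the earlier examples:

```python
# Approval only for specific tools, identified by name
agent = Agent(
    tools=[dangerous_tool, get_weather],
    require_approval=["dangerous_tool"],  # only dangerous_tool prompts for approval
)

# Or decide per call: the callable receives the tool name and its arguments
# (assumed here to return True when approval is required)
def needs_approval(tool_name: str, args: dict) -> bool:
    return tool_name == "dangerous_tool"

agent = Agent(tools=[dangerous_tool], require_approval=needs_approval)
```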

Example - DeepAgents mode (autonomous multi-step tasks):

```python
from ai_infra.llm import Agent
from ai_infra.llm.session import memory

# Define specialized agents
researcher = Agent(
    name="researcher",
    description="Searches and analyzes code",
    system="You are a code research assistant.",
    tools=[search_codebase],
)

writer = Agent(
    name="writer",
    description="Writes and edits documentation",
    system="You are a technical writer.",
)

# Create a deep agent that can delegate to subagents
agent = Agent(
    deep=True,
    session=memory(),
    subagents=[researcher, writer],  # Agents auto-convert to subagents
)

# The agent can now autonomously:
# - Read/write/edit files
# - Execute shell commands
# - Delegate to subagents
# - Maintain todo lists
result = agent.run("Refactor the auth module to use JWT tokens")
```

Constructor
```python
Agent(
    tools: list[Any] | None = None,
    provider: str | None = None,
    model_name: str | None = None,
    name: str | None = None,
    description: str | None = None,
    system: str | None = None,
    callbacks: Callbacks | CallbackManager | None = None,
    on_tool_error: Literal['return_error', 'retry', 'abort'] = 'return_error',
    tool_timeout: float | None = None,
    validate_tool_results: bool = False,
    max_tool_retries: int = 1,
    require_approval: bool | list[str] | Callable[[str, dict[str, Any]], bool] = False,
    approval_handler: ApprovalHandler | AsyncApprovalHandler | None = None,
    session: SessionStorage | None = None,
    pause_before: list[str] | None = None,
    pause_after: list[str] | None = None,
    deep: bool = False,
    subagents: list[Agent | SubAgent] | None = None,
    middleware: Sequence[AgentMiddleware] | None = None,
    response_format: Any | None = None,
    context_schema: type[Any] | None = None,
    use_longterm_memory: bool = False,
    workspace: str | Path | Workspace | None = None,
    recursion_limit: int = 50,
    model_kwargs = {},
)
```

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `tools` | `list[Any] \| None` | `None` | List of tools (functions, LangChain tools, or MCP tools) |
| `provider` | `str \| None` | `None` | LLM provider (auto-detected if `None`) |
| `model_name` | `str \| None` | `None` | Model name (uses the provider default if `None`) |
| **Agent Identity (for use as subagent)** | | | |
| `name` | `str \| None` | `None` | Agent name (required when used as a subagent) |
| `description` | `str \| None` | `None` | What this agent does (used by the parent to decide delegation) |
| `system` | `str \| None` | `None` | System prompt / instructions for this agent |
| **Callbacks (observability)** | | | |
| `callbacks` | `Callbacks \| CallbackManager \| None` | `None` | Callback handler(s) for observing agent events. Receives events for LLM calls (start, end, error, tokens) and tool executions (start, end, error). Can be a single `Callbacks` instance or a `CallbackManager`. |
| `on_tool_error` | `Literal['return_error', 'retry', 'abort']` | `'return_error'` | How to handle tool execution errors: `'return_error'`, `'retry'`, or `'abort'` (see the sketch after this table) |
| `tool_timeout` | `float \| None` | `None` | Timeout in seconds per tool call (`None` = no timeout) |
| `validate_tool_results` | `bool` | `False` | Validate that tool results match return type annotations |
| `max_tool_retries` | `int` | `1` | Max retry attempts when `on_tool_error="retry"` |
| **Human Approval** | | | |
| `require_approval` | `bool \| list[str] \| Callable[[str, dict[str, Any]], bool]` | `False` | Which tools require human approval: `True` for all tools, a list of tool names, or a callable taking the tool name and arguments and returning whether approval is required |
| `approval_handler` | `ApprovalHandler \| AsyncApprovalHandler \| None` | `None` | Custom approval handler function. If `None` and `require_approval` is `True`, console prompts are used. Can be a sync or async function taking an `ApprovalRequest`. |
| **Session & Persistence** | | | |
| `session` | `SessionStorage \| None` | `None` | Session storage backend for conversation memory and pause/resume. Use `memory()` for development, `postgres()` for production. |
| `pause_before` | `list[str] \| None` | `None` | Tool names to pause before executing (requires `session`). The agent returns a `SessionResult` with `paused=True`. |
| `pause_after` | `list[str] \| None` | `None` | Tool names to pause after executing (requires `session`) |
| **DeepAgents Mode (autonomous multi-step tasks)** | | | |
| `deep` | `bool` | `False` | Enable DeepAgents mode for autonomous task execution. When `True`, the agent has built-in tools for file operations (`ls`, `read_file`, `write_file`, `edit_file`, `glob`, `grep`, `execute`), todo management, and subagent orchestration. |
| `subagents` | `list[Agent \| SubAgent] \| None` | `None` | List of agents for delegation. Can be `Agent` instances (automatically converted) or `SubAgent` dicts. `Agent` instances must have `name` and `description` set. |
| `middleware` | `Sequence[AgentMiddleware] \| None` | `None` | Additional middleware to apply to the deep agent |
| `response_format` | `Any \| None` | `None` | Structured output format for agent responses |
| `context_schema` | `type[Any] \| None` | `None` | Schema for the deep agent context |
| `use_longterm_memory` | `bool` | `False` | Enable long-term memory (requires a session with a store) |
| **Workspace Configuration** | | | |
| `workspace` | `str \| Path \| Workspace \| None` | `None` | Workspace configuration for file operations; can be a path string, a `Path`, or a `Workspace` instance |
| `recursion_limit` | `int` | `50` | Maximum number of agent iterations. Prevents infinite loops when the agent keeps calling tools without making progress; a critical safety measure against runaway token costs. Raise only if you have monitoring in place. |
| `model_kwargs` | `Any` | `{}` | — |
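
The error-handling and safety parameters combine on a single agent. A minimal sketch using only parameters documented in the table above; `flaky_search` is a placeholder tool:

```python
def flaky_search(query: str) -> str:
    """Placeholder tool: search an index that occasionally times out."""
    return f"Results for {query}"

agent = Agent(
    tools=[flaky_search],
    on_tool_error="retry",       # or "return_error" / "abort"
    max_tool_retries=2,          # used when on_tool_error="retry"
    tool_timeout=10.0,           # seconds per tool call; None = no timeout
    validate_tool_results=True,  # check results against return annotations
    recursion_limit=25,          # cap agent iterations below the default of 50
)
result = agent.run("Find the latest release notes")
```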

Methods

  • agent
  • aresume (async)
  • arun (async)
  • arun_agent (async)
  • arun_agent_stream (async)
  • arun_with_fallbacks (async)
  • astream (async)
  • astream_agent_tokens (async)
  • from_persona (classmethod)
  • resume
  • run
  • run_agent
  • run_with_fallbacks