from ai_infra.llm.memory import SummarizationMiddlewareMiddleware that auto-summarizes when conversation gets too long. Attach to an Agent to automatically compress conversation history when it approaches context limits.
from ai_infra import Agent
from ai_infra.memory import SummarizationMiddleware
agent = Agent(
tools=[...],
middleware=[
SummarizationMiddleware(
trigger_tokens=4000, # Summarize when over 4000 tokens
keep_messages=10, # Always keep last 10 messages
)
]
)trigger_tokens: Summarize when token count exceeds this threshold trigger_messages: Summarize when message count exceeds this threshold keep_messages: Number of recent messages to always keep llm: LLM to use for summarization (uses default if None) summarize_prompt: Custom prompt template