🎉 ai-infra v1.0 is here — Production-ready AI/LLM infrastructure
Nfrax Docs

Infrastructure that just works. Ship products, not boilerplate.

© 2026 nfrax. All rights reserved.


SummarizationMiddleware

from ai_infra.llm.memory import SummarizationMiddleware
ai_infra.llm.memory

Middleware that automatically summarizes a conversation when it gets too long. Attach it to an Agent to compress the conversation history as it approaches the context limit.

python
from ai_infra import Agent
from ai_infra.llm.memory import SummarizationMiddleware

# Attach the middleware so long histories are summarized automatically
agent = Agent(
    tools=[...],
    middleware=[
        SummarizationMiddleware(
            trigger_tokens=4000,  # Summarize when over 4000 tokens
            keep_messages=10,     # Always keep last 10 messages
        )
    ],
)

Attributes

  • trigger_tokens: Summarize when token count exceeds this threshold
  • trigger_messages: Summarize when message count exceeds this threshold
  • keep_messages: Number of recent messages to always keep
  • llm: LLM to use for summarization (uses default if None)
  • summarize_prompt: Custom prompt template
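To make the two thresholds concrete, here is a minimal, self-contained sketch of how such a trigger check could work. It is not the library's implementation: `estimate_tokens` and `should_summarize_sketch` are hypothetical names, the 4-characters-per-token estimate is a rough heuristic rather than a real tokenizer, and combining the two thresholds with "either one fires" is an assumption.

```python
from typing import Optional, Sequence


def estimate_tokens(messages: Sequence[str]) -> int:
    """Rough estimate: about 4 characters per token (heuristic, not a tokenizer)."""
    return sum(len(m) for m in messages) // 4


def should_summarize_sketch(
    messages: Sequence[str],
    trigger_tokens: Optional[int] = None,
    trigger_messages: Optional[int] = None,
) -> bool:
    """Return True when either configured threshold is exceeded (assumed semantics)."""
    if trigger_tokens is not None and estimate_tokens(messages) > trigger_tokens:
        return True
    if trigger_messages is not None and len(messages) > trigger_messages:
        return True
    return False


# 50 messages of 400 characters each: about 5000 tokens by the heuristic
history = ["x" * 400] * 50
print(should_summarize_sketch(history, trigger_tokens=4000))    # True
print(should_summarize_sketch(history, trigger_messages=100))   # False
```

Leaving both thresholds as None means the sketch never triggers, which mirrors the None defaults in the constructor signature but may not match the real behavior.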

Constructor
SummarizationMiddleware(trigger_tokens: int | None = None, trigger_messages: int | None = None, keep_messages: int = 10, llm: Any | None = None, summarize_prompt: str | None = None, _last_summary: str | None = None) -> None
Parameter          Type          Default   Description
trigger_tokens     int | None    None      —
trigger_messages   int | None    None      —
keep_messages      int           10        —
llm                Any | None    None      —
summarize_prompt   str | None    None      —
_last_summary      str | None    None      —
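As an illustration of how `keep_messages`, `llm`, and `summarize_prompt` might interact, the following sketch replaces everything except the most recent messages with a single summary. It is a sketch under stated assumptions, not the middleware's actual code: `compress_history`, `fake_llm`, `DEFAULT_PROMPT`, the `{conversation}` placeholder, and the `[summary]` prefix are all hypothetical, and messages are modeled as plain strings.

```python
from typing import Callable, List

# Hypothetical template; the real default prompt is not documented here
DEFAULT_PROMPT = "Summarize the conversation so far:\n{conversation}"


def compress_history(
    messages: List[str],
    summarize: Callable[[str], str],
    keep_messages: int = 10,
    summarize_prompt: str = DEFAULT_PROMPT,
) -> List[str]:
    """Replace all but the last `keep_messages` messages with one summary."""
    if keep_messages <= 0:
        older, recent = messages, []
    else:
        older, recent = messages[:-keep_messages], messages[-keep_messages:]
    if not older:
        # Nothing old enough to compress; return the history unchanged
        return list(messages)
    prompt = summarize_prompt.format(conversation="\n".join(older))
    return [f"[summary] {summarize(prompt)}"] + recent


def fake_llm(prompt: str) -> str:
    """Stand-in for a model call: counts the message lines after the header."""
    return f"{len(prompt.splitlines()) - 1} earlier messages condensed"


history = [f"message {i}" for i in range(25)]
compressed = compress_history(history, fake_llm, keep_messages=10)
print(len(compressed))   # 11
print(compressed[0])     # [summary] 15 earlier messages condensed
```

Passing the summarizer as a plain callable keeps the sketch independent of any particular model client; a real LLM call would slot in where `fake_llm` is used.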

Methods

  • aprocess (async)
  • process
  • should_summarize