SessionSummarizer (sync) and AsyncSessionSummarizer (async) solve the context-window problem for long user sessions.
Instead of letting raw actions accumulate without bound into hundreds of entries, the summarizer condenses them into a compact prose paragraph every N actions, keeping your RAG context small and retrieval meaningful.
How it works
```text
actions 1-10  → your LLM → "User explored the dashboard, filtered reports by date..." → on_summary
actions 11-20 → your LLM → "User exported a CSV and opened billing settings..."      → on_summary
actions 21-30 → your LLM → "User invited a teammate and updated their profile..."    → on_summary
```
Each group of N actions is summarised independently. The on_summary callback fires after each summarisation — wire it to your vector store, database, or RagPipeline.
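To make the accumulate-and-flush behaviour concrete, here is a minimal toy sketch of the pattern. `MiniSummarizer` is illustrative only, not the SDK's implementation; the real class manages sessions, prompts, and callbacks for you.

```python
from collections import defaultdict
from typing import Callable

class MiniSummarizer:
    """Toy sketch of the summarize-every-N-actions pattern (not the SDK's code)."""

    def __init__(self, llm: Callable[[str], str], threshold: int = 10):
        self.llm = llm
        self.threshold = threshold
        self.buffers: dict[str, list[str]] = defaultdict(list)
        self.summaries: dict[str, list[str]] = defaultdict(list)

    def add(self, session_id: str, actions: list[str]) -> None:
        # Each individual action counts toward the threshold,
        # so a batch of 3 adds 3.
        self.buffers[session_id].extend(actions)
        while len(self.buffers[session_id]) >= self.threshold:
            group = self.buffers[session_id][: self.threshold]
            self.buffers[session_id] = self.buffers[session_id][self.threshold :]
            prompt = "Actions:\n" + "\n".join(group) + "\nSummary:"
            self.summaries[session_id].append(self.llm(prompt))

# A fake LLM that just reports how many action lines it saw.
fake_llm = lambda prompt: f"summary of {prompt.count(chr(10)) - 1} actions"

s = MiniSummarizer(fake_llm, threshold=3)
s.add("ps_1", ["Viewed Dashboard", "Clicked Export"])  # 2 buffered, no summary yet
s.add("ps_1", ["Opened Billing", "Invited teammate"])  # crosses 3 -> one summary fires
```

After the second call, one group of three actions has been summarised and one action remains buffered toward the next threshold.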
Sync usage
```python
import openai

from autoplay_sdk import ConnectorClient
from autoplay_sdk.summarizer import SessionSummarizer

openai_client = openai.OpenAI()

def my_llm(prompt: str) -> str:
    return openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content

def store_summary(session_id: str, summary: str) -> None:
    print(f"[{session_id}] {summary}")  # save to DB, vector store, etc.

summarizer = SessionSummarizer(
    llm=my_llm,
    threshold=10,  # fire after 10 accumulated actions
    on_summary=store_summary,
)

ConnectorClient(url=URL, token=TOKEN) \
    .on_actions(summarizer.add) \
    .run()
```
Async usage
```python
import openai

from autoplay_sdk import AsyncConnectorClient
from autoplay_sdk.summarizer import AsyncSessionSummarizer

async_openai = openai.AsyncOpenAI()

async def my_llm(prompt: str) -> str:
    r = await async_openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return r.choices[0].message.content

async def store_summary(session_id: str, summary: str) -> None:
    await db.upsert(session_id, summary)  # `db` is your own storage client

summarizer = AsyncSessionSummarizer(
    llm=my_llm,
    threshold=10,
    on_summary=store_summary,
)

async with AsyncConnectorClient(url=URL, token=TOKEN) as client:
    client.on_actions(summarizer.add)
    await client.run()
```
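`AsyncSessionSummarizer` accepts either a sync or an async `on_summary` callback. A minimal sketch of how such dual dispatch can work (this is an illustration of the pattern, not the SDK's internals):

```python
import asyncio
import inspect

async def fire_on_summary(on_summary, session_id: str, summary: str) -> None:
    """Invoke a callback that may be sync or async."""
    result = on_summary(session_id, summary)
    if inspect.isawaitable(result):
        # Async callbacks return a coroutine; await it.
        await result

received = []

def sync_cb(sid, text):
    received.append(("sync", sid, text))

async def async_cb(sid, text):
    await asyncio.sleep(0)  # pretend to hit a database
    received.append(("async", sid, text))

async def main():
    await fire_on_summary(sync_cb, "ps_1", "did things")
    await fire_on_summary(async_cb, "ps_1", "did more things")

asyncio.run(main())
```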
Custom prompt
By default the summarizer uses a built-in prompt optimised for RAG pipelines.
You can override it with your own — just include {actions} as the placeholder:
MY_PROMPT = """
You are a support assistant. Summarise what this user was trying to do
so a support agent can quickly understand their context.
Actions:
{actions}
Summary:
"""
summarizer = SessionSummarizer(
llm=my_llm,
threshold=10,
prompt=MY_PROMPT,
on_summary=store_summary,
)
Default prompt:

```text
You are summarising a user's in-app session for a RAG pipeline.
Your summary will be embedded and stored in a vector database to provide
context for a chatbot or AI assistant.

Write a concise 2-3 sentence summary of what the user did, focusing on:
- Which features or pages they visited
- What actions they performed
- Any clear intent or goal you can infer

Actions:
{actions}

Summary:
```
Get partial context
Read what’s been accumulated for a session before the threshold is reached:
```python
# Returns a formatted action list string (not yet summarised)
context = summarizer.get_context("ps_abc123")
print(context)
# "1. Viewed Dashboard — https://app.example.com/dashboard
#  2. Clicked Export CSV button — ..."
```
Useful for including real-time context in a chatbot response before the full summary fires.
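For example, you might splice the partial context into a chatbot prompt. The `format_actions` helper below is hypothetical, built only to mimic the numbered-list shape shown above; in real code you would pass `summarizer.get_context(session_id)` straight through:

```python
def format_actions(actions: list[str]) -> str:
    """Build a numbered action list similar in shape to get_context output
    (the SDK's exact formatting may differ)."""
    return "\n".join(f"{i}. {a}" for i, a in enumerate(actions, 1))

partial = format_actions(["Viewed Dashboard", "Clicked Export CSV button"])

# Drop the partial context into a chatbot prompt before the summary fires.
chat_prompt = f"Recent user activity:\n{partial}\n\nAnswer the user's question."
```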
Composing with RagPipeline
The most common pattern — attach the summarizer to a RagPipeline so summaries are automatically embedded and upserted:
```python
from autoplay_sdk.rag import RagPipeline
from autoplay_sdk.summarizer import SessionSummarizer

summarizer = SessionSummarizer(llm=my_llm, threshold=10)

pipeline = RagPipeline(
    embed=embed_fn,
    upsert=upsert_fn,
    summarizer=summarizer,
)

ConnectorClient(url=URL, token=TOKEN) \
    .on_actions(pipeline.on_actions) \
    .on_summary(pipeline.on_summary) \
    .run()
```
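`embed_fn` and `upsert_fn` are yours to supply. This document does not pin down the exact signatures `RagPipeline` expects, so the stubs below are only a sketch of the general shape: an embedder mapping text to a vector, and an upsert keyed by session so each new summary replaces the previous one. The hash-based embedding is a deterministic placeholder, not a real model:

```python
import hashlib

index: dict[str, dict] = {}  # toy in-memory stand-in for a vector store

def embed_fn(text: str) -> list[float]:
    # Toy deterministic "embedding" for illustration; swap in a real
    # embedding model (e.g. an embeddings API call) in production.
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:8]]

def upsert_fn(session_id: str, vector: list[float], text: str) -> None:
    # Keyed by session so each new summary overwrites the last.
    index[session_id] = {"vector": vector, "text": text}

summary = "User exported a CSV and opened billing settings."
upsert_fn("ps_abc123", embed_fn(summary), summary)
```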
Constructor
`SessionSummarizer(llm, threshold=10, prompt=None, on_summary=None)`

- **llm** (`Callable[[str], str]`, required): any synchronous LLM callable. Receives the formatted prompt and must return the summary as a plain string.
- **threshold** (`int`, default `10`): number of individual actions (not batches) to accumulate before triggering summarisation. A batch of 3 actions counts as 3 toward the threshold.
- **prompt** (`str | None`, default `None`): custom prompt template. Must contain `{actions}` as a placeholder. Uses the built-in default if not provided.
- **on_summary** (`Callable[[str, str], None] | None`, default `None`): called after each summarisation with `(session_id, summary_text)`. Can also be set after construction by assigning to `summarizer.on_summary`.

`AsyncSessionSummarizer(llm, threshold=10, prompt=None, on_summary=None)`

Same parameters, but `llm` is an async `(prompt: str) -> str` callable and `on_summary` can be sync or async.
API reference
| Method / Property | Description |
|---|---|
| `.add(payload)` | Receive an actions batch; wire to `on_actions` |
| `.get_context(session_id)` | Return accumulated (not-yet-summarised) actions as text |
| `.reset(session_id)` | Clear a session's history without summarising |
| `.active_sessions` | List of session IDs with pending actions |
| `.on_summary` | Assignable callback `(session_id, summary) -> None` |