SessionSummarizer (sync) and AsyncSessionSummarizer (async) solve the context-window problem for long user sessions. Instead of letting hundreds of raw actions accumulate without bound, the summarizer condenses every N actions into a compact prose paragraph — keeping your RAG context small and retrieval meaningful.

How it works

actions 1-10  →  your LLM  →  "User explored the dashboard, filtered reports by date..."  →  on_summary
actions 11-20 →  your LLM  →  "User exported a CSV and opened billing settings..."        →  on_summary
actions 21-30 →  your LLM  →  "User invited a teammate and updated their profile..."      →  on_summary
Each group of N actions is summarised independently. The on_summary callback fires after each summarisation — wire it to your vector store, database, or RagPipeline.
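The mechanics can be sketched as a simple accumulate-and-flush loop. This is an illustration only, not the SDK's actual implementation — MiniSummarizer, its buffer layout, and the prompt string are all made up here:

```python
from collections import defaultdict

class MiniSummarizer:
    """Illustrative accumulate-and-flush loop; not the real SessionSummarizer."""

    def __init__(self, llm, threshold=10, on_summary=None):
        self.llm = llm
        self.threshold = threshold
        self.on_summary = on_summary
        self._pending = defaultdict(list)  # session_id -> accumulated actions

    def add(self, session_id, actions):
        buf = self._pending[session_id]
        buf.extend(actions)  # each action in a batch counts individually
        while len(buf) >= self.threshold:
            group = buf[: self.threshold]
            del buf[: self.threshold]
            summary = self.llm("Summarise these actions:\n" + "\n".join(group))
            if self.on_summary:
                self.on_summary(session_id, summary)
```

Note that a batch can push the buffer past the threshold; the loop flushes complete groups and carries the remainder forward to the next group.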

Sync usage

import openai
from autoplay_sdk import ConnectorClient
from autoplay_sdk.summarizer import SessionSummarizer

openai_client = openai.OpenAI()

def my_llm(prompt: str) -> str:
    return openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content

def store_summary(session_id: str, summary: str) -> None:
    print(f"[{session_id}] {summary}")
    # save to DB, vector store, etc.

summarizer = SessionSummarizer(
    llm=my_llm,
    threshold=10,            # fire after 10 accumulated actions
    on_summary=store_summary,
)

ConnectorClient(url=URL, token=TOKEN) \
    .on_actions(summarizer.add) \
    .run()

Async usage

import openai
from autoplay_sdk import AsyncConnectorClient
from autoplay_sdk.summarizer import AsyncSessionSummarizer

async_openai = openai.AsyncOpenAI()

async def my_llm(prompt: str) -> str:
    r = await async_openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return r.choices[0].message.content

async def store_summary(session_id: str, summary: str) -> None:
    await db.upsert(session_id, summary)  # db: whatever async store you use

summarizer = AsyncSessionSummarizer(
    llm=my_llm,
    threshold=10,
    on_summary=store_summary,
)

async with AsyncConnectorClient(url=URL, token=TOKEN) as client:
    client.on_actions(summarizer.add)
    await client.run()

Custom prompt

By default the summarizer uses a built-in prompt optimised for RAG pipelines. You can override it with your own — just include {actions} as the placeholder:
MY_PROMPT = """
You are a support assistant. Summarise what this user was trying to do
so a support agent can quickly understand their context.

Actions:
{actions}

Summary:
"""

summarizer = SessionSummarizer(
    llm=my_llm,
    threshold=10,
    prompt=MY_PROMPT,
    on_summary=store_summary,
)
Default prompt:
You are summarising a user's in-app session for a RAG pipeline.
Your summary will be embedded and stored in a vector database to provide
context for a chatbot or AI assistant.

Write a concise 2-3 sentence summary of what the user did, focusing on:
- Which features or pages they visited
- What actions they performed
- Any clear intent or goal you can infer

Actions:
{actions}

Summary:

Get partial context

Read what’s been accumulated for a session before the threshold is reached:
# Returns a formatted action list string (not yet summarised)
context = summarizer.get_context("ps_abc123")
print(context)
# "1. Viewed Dashboard — https://app.example.com/dashboard
#  2. Clicked Export CSV button — ..."
Useful for including real-time context in a chatbot response before the full summary fires.
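For a rough sense of how that numbered output might be produced, here is a stand-in formatter. format_partial and the label/url action fields are assumptions for illustration, not the SDK's internals:

```python
def format_partial(actions):
    # Hypothetical: turn accumulated action dicts into the numbered
    # list string that get_context returns before a summary fires.
    return "\n".join(
        f"{i}. {a['label']} — {a['url']}" for i, a in enumerate(actions, start=1)
    )
```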

Composing with RagPipeline

The most common pattern — attach the summarizer to a RagPipeline so summaries are automatically embedded and upserted:
from autoplay_sdk.rag import RagPipeline
from autoplay_sdk.summarizer import SessionSummarizer

summarizer = SessionSummarizer(llm=my_llm, threshold=10)

pipeline = RagPipeline(
    embed=embed_fn,
    upsert=upsert_fn,
    summarizer=summarizer,
)

ConnectorClient(url=URL, token=TOKEN) \
    .on_actions(pipeline.on_actions) \
    .on_summary(pipeline.on_summary) \
    .run()
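Under the hood, the composition amounts to: summary in, vector out, upsert. A self-contained sketch — MiniPipeline and the embed/upsert stub signatures are illustrative, not the real RagPipeline API:

```python
class MiniPipeline:
    """Sketch of the embed-then-upsert step a pipeline performs per summary."""

    def __init__(self, embed, upsert):
        self.embed = embed    # text -> vector
        self.upsert = upsert  # (id, vector, text) -> None

    def on_summary(self, session_id, summary):
        vector = self.embed(summary)
        self.upsert(session_id, vector, summary)

stored = {}
pipeline = MiniPipeline(
    embed=lambda text: [float(len(text))],  # stub embedding
    upsert=lambda sid, vec, text: stored.update({sid: (vec, text)}),
)
pipeline.on_summary("ps_abc123", "User exported a CSV.")
```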

Constructor

SessionSummarizer(llm, threshold=10, prompt=None, on_summary=None)

llm (Callable[[str], str], required)
    Any synchronous LLM callable. Receives the formatted prompt and must return the summary as a plain string.

threshold (int, default 10)
    Number of individual actions (not batches) to accumulate before triggering summarisation. A batch of 3 actions counts as 3 toward the threshold.

prompt (str | None, default None)
    Custom prompt template. Must contain {actions} as a placeholder. Uses the built-in default if not provided.

on_summary (Callable[[str, str], None] | None, default None)
    Called after each summarisation with (session_id, summary_text). Can also be set after construction by assigning to summarizer.on_summary.

AsyncSessionSummarizer(llm, threshold=10, prompt=None, on_summary=None)

Same parameters, except that llm must be an async callable ((prompt: str) -> Awaitable[str]) and on_summary may be either sync or async.
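One plausible way to accept both kinds of on_summary callback is to call it, then await the result only when it is awaitable. A sketch under that assumption, not the SDK's code:

```python
import asyncio
import inspect

async def fire(on_summary, session_id, summary):
    # Call the callback; await only if it returned an awaitable,
    # so plain sync functions work unchanged.
    result = on_summary(session_id, summary)
    if inspect.isawaitable(result):
        await result

seen = []

async def async_cb(sid, text):
    seen.append(("async", sid))

def sync_cb(sid, text):
    seen.append(("sync", sid))

async def main():
    await fire(async_cb, "ps_1", "summary text")
    await fire(sync_cb, "ps_1", "summary text")

asyncio.run(main())
```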

API reference

Method / Property           Description
.add(payload)               Receive an actions batch — wire to on_actions
.get_context(session_id)    Return accumulated (not-yet-summarised) actions as text
.reset(session_id)          Clear a session's history without summarising
.active_sessions            List of session IDs with pending actions
.on_summary                 Assignable callback (session_id, summary) -> None