Managing conversation history in a Temporal AI agent

TL;DR: Temporal's 50MB event history limit and 51,200-event cap become real constraints for long-running conversational agents. The continue-as-new pattern starts a fresh event log mid-conversation, so older messages are no longer available from the current run's history. And because Temporal's event history is an execution log, extracting a conversation transcript from it is complex and fragile. Long-running AI conversations work best when conversation storage is separate from workflow execution, with a delivery layer surfacing new messages to the frontend as they arrive.

Temporal is a strong foundation for multi-turn AI agents. A workflow can span a full conversation, handling each user message as a signal and maintaining context across arbitrary pauses between messages. The execution durability and state management are genuinely useful. Managing the conversation transcript is a separate concern.

What Temporal's event history stores

Temporal's event log records execution events: activities started and completed, signals received, timers fired, workflow state transitions. Each event includes its payload: the inputs and outputs of activities, the content of signals.

A conversation agent built on Temporal has the conversation content spread across these events. A user message arrives as a signal payload. The agent's response is an activity output. To reconstruct the conversation transcript from the event log, you'd need to filter the relevant events, extract their payloads, and order them correctly.

This is technically possible but fragile. Event schemas change across deploys. Signal payloads don't have a stable shape across different versions of the workflow. Activity outputs include execution metadata alongside the content you care about. The event log is designed for observability and replay of execution, not for rendering a chat interface.
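
To make the reconstruction steps concrete, here is a minimal sketch of the filter, extract, and order pass. The event dicts are simplified stand-ins (real Temporal events are protobuf messages with encoded payloads), and the `user_message` signal name, the `call_llm` and `log_metrics` activity names, and the `extract_transcript` helper are assumptions made for illustration. Note that a completion event only points back at the event that scheduled it, so the activity name has to be looked up separately.

```python
from typing import Any

# Simplified stand-ins for history events (assumption: real events are
# protobuf messages with encoded payloads, not plain dicts like these).
EVENTS: list[dict[str, Any]] = [
    {"event_id": 1, "type": "WorkflowExecutionStarted"},
    {"event_id": 5, "type": "WorkflowExecutionSignaled",
     "signal_name": "user_message", "payload": "Hi"},
    {"event_id": 9, "type": "ActivityTaskScheduled",
     "activity_type": "call_llm", "payload": {"history": [], "message": "Hi"}},
    {"event_id": 11, "type": "ActivityTaskCompleted",
     "scheduled_event_id": 9, "payload": "Hello! How can I help?"},
    {"event_id": 15, "type": "ActivityTaskScheduled",
     "activity_type": "log_metrics", "payload": {"turn": 1}},
    {"event_id": 17, "type": "ActivityTaskCompleted",
     "scheduled_event_id": 15, "payload": None},
]


def extract_transcript(events: list[dict[str, Any]]) -> list[dict[str, str]]:
    """Filter, extract, and order chat messages from execution events."""
    ordered = sorted(events, key=lambda e: e["event_id"])

    # Completion events carry no activity name, only the ID of the event
    # that scheduled them, so build that lookup first.
    activity_by_id = {
        e["event_id"]: e["activity_type"]
        for e in ordered
        if e["type"] == "ActivityTaskScheduled"
    }

    transcript = []
    for event in ordered:
        if (event["type"] == "WorkflowExecutionSignaled"
                and event.get("signal_name") == "user_message"):
            transcript.append({"role": "user", "content": event["payload"]})
        elif (event["type"] == "ActivityTaskCompleted"
                and activity_by_id.get(event["scheduled_event_id"]) == "call_llm"):
            transcript.append({"role": "assistant", "content": event["payload"]})
    return transcript


for entry in extract_transcript(EVENTS):
    print(entry["role"], ":", entry["content"])
```

Every string literal in the filter is a coupling point: renaming the signal, renaming the activity, or changing a payload shape in a later deploy silently changes what this function returns.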

The 50MB event history limit

Temporal's event history has a hard limit of 50MB per workflow run, with a separate cap of 51,200 events. For conversational agents, both limits are reachable.

A ten-turn conversation with substantial context (system prompt, history, and tool call results) can approach 40KB of raw message content. What Temporal records is larger than that: every activity input and output is stored as an event payload, so an agent that passes the running history to each LLM call writes that history into the log again on every turn, on top of the per-event envelope overhead. Event counts climb quickly too. A developer who built a polling workflow that stored frequent state updates found the 5,000-event limit (now raised to 51,200) exhausted in approximately three days under normal usage.

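Longer conversations, or agents that make multiple tool calls per turn, hit these limits faster. The consequences are not graceful: Temporal logs warnings as a history grows (starting at 10,240 events or 10MB), and once a hard limit is exceeded it terminates the workflow execution. The workflow has to continue-as-new or otherwise migrate before that point.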
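
As a rough back-of-the-envelope check, the sketch below estimates how many turns a single run can absorb before one of the two limits is exceeded. It rests on stated assumptions rather than measurements: each turn adds a fixed number of bytes to the transcript, the payloads recorded for a turn add up to the whole transcript so far (as happens when the running history is passed to every LLM call), and envelope overhead is ignored. The `turns_until_limit` helper and the figures of 4KB and about 10 events per turn are illustrative, not Temporal constants.

```python
EVENT_LIMIT = 51_200                 # hard cap on events per run
SIZE_LIMIT_BYTES = 50 * 1024 * 1024  # hard cap on history size per run


def turns_until_limit(bytes_per_turn: int, events_per_turn: int) -> tuple[int, str]:
    """Return the turn at which a run would exceed a limit, and which limit.

    Assumes each turn adds `bytes_per_turn` to the transcript and that the
    payloads recorded for a turn contain the whole transcript so far.
    Envelope overhead is ignored, so real histories fill up sooner.
    """
    history_bytes = 0
    events = 0
    turn = 0
    while True:
        turn += 1
        # Activity input (prior transcript + new message) plus output (reply)
        # together amount to the full transcript as of the end of this turn.
        history_bytes += turn * bytes_per_turn
        events += events_per_turn
        if history_bytes > SIZE_LIMIT_BYTES:
            return turn, "size"
        if events > EVENT_LIMIT:
            return turn, "events"


# Assumed figures: 4KB of new content and about 10 events per turn.
turn, limit = turns_until_limit(bytes_per_turn=4096, events_per_turn=10)
print(f"limit exceeded at turn {turn} ({limit})")
```

Under these assumptions the size limit is the one that bites: the recorded bytes grow quadratically with the number of turns because the history is re-recorded each time, while the event count grows only linearly.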

The continue-as-new pattern and what it breaks

The standard solution for long-running workflows approaching event history limits is continue-as-new. The workflow closes its current run, carries forward only the state it needs, and starts a new run under the same workflow ID with a new run ID and an empty event history. Temporal treats this as a clean handoff between executions. The conversation transcript doesn't split so cleanly. The original run's history stops at the point of the continue-as-new and no longer reflects the full conversation. The new run has no events from before the point of continuation, and queries against the workflow reach only the state the current run carried forward.

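from datetime import timedelta

from temporalio import activity, workflow


@activity.defn
async def call_llm(request: dict) -> str:
    # Placeholder: call your model provider here and return the reply text.
    raise NotImplementedError


@workflow.defn
class ConversationWorkflow:
    def __init__(self) -> None:
        self.messages: list[dict] = []
        self.pending_signals: list[str] = []
        self.turn_count = 0

    @workflow.signal
    def user_message(self, message: str) -> None:
        # Each user message arrives as a signal and is queued for the run loop.
        self.pending_signals.append(message)

    @workflow.run
    async def run(self, history: list[dict]) -> None:
        self.messages = history
        while True:
            await workflow.wait_condition(lambda: len(self.pending_signals) > 0)
            message = self.pending_signals.pop(0)
            response = await workflow.execute_activity(
                call_llm,
                {"history": self.messages, "message": message},
                start_to_close_timeout=timedelta(seconds=30),
            )
            self.messages.append({"role": "user", "content": message})
            self.messages.append({"role": "assistant", "content": response})
            self.turn_count += 1

            # Continue-as-new before hitting event history limits.
            # Hand off only once queued signals are drained, so none are lost.
            if self.turn_count >= 100 and not self.pending_signals:
                workflow.continue_as_new(self.messages)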

The messages list carries forward, but the event history does not. A client trying to render the full conversation from workflow history sees only the current run's events, unless it also fetches and stitches together the history of every earlier run in the chain.

Storing conversation history outside the workflow

The approach that scales is to keep conversation history out of Temporal's event log entirely. The workflow carries just enough context to continue the conversation. The full transcript lives in the delivery layer alongside the agent, not in workflow events.

This keeps the event history lean. Each workflow execution stays well within the 50MB limit, and continue-as-new becomes a clean migration between executions rather than a continuity problem to solve.
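
One way to picture the split is sketched below, under stated assumptions: `TranscriptStore` is an in-memory stand-in for whatever durable storage sits outside Temporal, and `context_window` is a hypothetical helper that picks the bounded slice of recent messages the workflow carries forward. Neither is part of the Temporal SDK. In a real agent the store write would happen inside an activity, since workflow code has to stay deterministic and cannot perform I/O directly.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptStore:
    """In-memory stand-in for durable transcript storage (hypothetical)."""

    _messages: dict[str, list[dict]] = field(default_factory=dict)

    def append(self, conversation_id: str, role: str, content: str) -> int:
        """Append one message and return the transcript length so far."""
        log = self._messages.setdefault(conversation_id, [])
        log.append({"role": role, "content": content})
        return len(log)

    def read(self, conversation_id: str) -> list[dict]:
        """Return a copy of the full transcript for a conversation."""
        return list(self._messages.get(conversation_id, []))


def context_window(messages: list[dict], max_messages: int = 6) -> list[dict]:
    """Return the bounded slice of recent messages the workflow carries."""
    return messages[-max_messages:]


# Simulate ten turns: the store keeps everything, the workflow keeps a window.
store = TranscriptStore()
carried: list[dict] = []
for turn in range(10):
    for role, content in (("user", f"question {turn}"), ("assistant", f"reply {turn}")):
        store.append("conv-42", role, content)
        carried = context_window(carried + [{"role": role, "content": content}])

print(len(store.read("conv-42")), "messages stored;", len(carried), "carried by the workflow")
```

The window size is a design choice: a summary of older turns could be carried instead of the last few messages, as long as what the workflow holds stays bounded while the store holds the full record.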

Surfacing new messages to the frontend

Agents and users connect through a persistent session keyed to the conversation ID. Each time the agent produces a response, the activity publishes to that session. The frontend subscribes and receives new messages as they complete.

Because history lives in the session itself, a user opening the chat for the first time gets the full prior context. A user who dropped and reconnected resumes exactly where they left off. The workflow doesn't need to know about either.

When the workflow continues-as-new mid-conversation, the session is unaffected. It's keyed to the conversation ID, not to the run ID of any single workflow execution, so the user's browser stays subscribed and keeps receiving messages without knowing a new execution has started.
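
The keying behaviour can be sketched with a small in-memory model. `SessionHub` below is a hypothetical stand-in for a delivery layer, not a real client API: it keeps history per conversation ID, replays that history to a new subscriber, and then forwards live messages. The `run_id` field on each message is included only to show that the subscriber never has to act on it.

```python
from collections import defaultdict
from typing import Callable

Message = dict[str, str]


class SessionHub:
    """In-memory stand-in for a delivery layer keyed by conversation ID."""

    def __init__(self) -> None:
        self._history: dict[str, list[Message]] = defaultdict(list)
        self._subscribers: dict[str, list[Callable[[Message], None]]] = defaultdict(list)

    def publish(self, conversation_id: str, message: Message) -> None:
        """Record a message and push it to everyone currently subscribed."""
        self._history[conversation_id].append(message)
        for callback in list(self._subscribers[conversation_id]):
            callback(message)

    def subscribe(
        self, conversation_id: str, callback: Callable[[Message], None]
    ) -> Callable[[], None]:
        """Replay prior messages, then deliver new ones as they arrive."""
        for message in self._history[conversation_id]:
            callback(message)
        self._subscribers[conversation_id].append(callback)

        def unsubscribe() -> None:
            self._subscribers[conversation_id].remove(callback)

        return unsubscribe


hub = SessionHub()

# The first run publishes before the user has opened the chat.
hub.publish("conv-42", {"run_id": "run-1", "role": "assistant", "content": "Hello"})

# The frontend subscribes by conversation ID and gets the prior message replayed.
seen: list[Message] = []
unsubscribe = hub.subscribe("conv-42", seen.append)

# The workflow continues-as-new: the run ID changes, the conversation ID does not.
hub.publish("conv-42", {"run_id": "run-2", "role": "assistant", "content": "Still here"})

print([m["content"] for m in seen])
```

A reconnecting client follows the same path as a first-time one: it subscribes again by conversation ID and receives the history before any live messages.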

Separating these concerns is what makes the architecture hold up under real conditions. A workflow that continues-as-new doesn't break the conversation. A client that disconnects and reconnects doesn't lose history. Each piece can fail and recover without the others needing to know.

Ably AI Transport provides a delivery layer for Temporal conversational agents, pushing new messages to the frontend in real time across workflow executions. Visit the Ably AI Transport overview (ably.com/ai-transport), read the documentation, or sign up free to start building. For a broader overview of why Temporal workflows need a frontend delivery layer, see ably.com/topic/temporal-realtime-transport.
