# Troubleshooting

Common AI Transport problems and how to fix them. Each entry follows the same shape: the symptom you see, what to check to confirm the cause, and the fix to apply. Entries are ordered roughly by how often they cause support tickets, with the most common failure modes first.

## Channel namespace not configured for AI Transport

**Symptom:** Clients see an empty assistant message, or a message containing only the first token. The server errors on subsequent publishes.

The initial `message.create` succeeds, but the `message.append` operations that AI Transport uses to stream tokens fail with error code `93002` (`Can only update/delete/append messages on channels with mutableMessages enabled`) because the namespace does not permit appends. This is the single most common AI Transport failure.

**Confirm:**

- Check the server logs for error `93002` after the first token. The message reads `Can only update/delete/append messages on channels with mutableMessages enabled`.
- Open the Ably dashboard for your app's settings.
- Find the channel namespace your conversations live on (for example a namespace of `conversations` should have channel names like `conversations:abc`).
- Check whether *Message annotations, updates, deletes, and appends* is enabled on that namespace.

**Fix:**

- Enable the *Message annotations, updates, deletes, and appends* rule on the namespace. See [enable message updates and deletes](https://ably.com/docs/ai-transport/concepts/authentication.md#enable-updates-deletes) for the dashboard, Control API, and CLI steps.
- Note that enabling this rule causes messages to be [persisted](https://ably.com/docs/storage-history/storage.md#all-message-persistence) regardless of whether persistence is enabled on the namespace.

## Capability or token scope mismatch

**Symptom:** The token authenticates, but specific operations fail. There are two common shapes:

- **Channel pattern miss:** the channel attach fails, or a publish is rejected.
  The capability covers `conversations:*` but your app uses `chat:abc`.
- **Missing operation:** the connection works but a specific operation does not. Clients cannot cancel generation (missing `publish`). Late joiners or reconnects show only live messages and not the prior conversation (missing `history`). Agent presence never updates (missing `presence`).

**Confirm:**

- Decode the JWT returned by your auth endpoint and inspect the `x-ably-capability` claim. It is a JSON-encoded map of channel patterns to permitted operations.
- Compare each operation your application performs against the capability for the relevant channel name.

**Fix:**

- Make the channel pattern in the capability cover the channel names your application uses. Patterns are case-sensitive; wildcards like `conversations:*` only cover channels with that exact prefix.
- Grant every operation the application needs: `subscribe` and `publish` for messages, `history` for loading past conversation, `presence` for agent presence. See [capability operations](https://ably.com/docs/auth/capabilities.md#capability-operations) for the full list and [authentication](https://ably.com/docs/ai-transport/concepts/authentication.md) for the AI Transport capability shape.

## History disappears

**Symptom:** Past conversation messages are not available when the user expects them. Two scenarios trigger the same root cause:

- The user opens the app the next day, tries to scroll back, and older messages are gone.
- The user is offline longer than the live recovery window. On reconnect the SDK falls back to history, but no past messages arrive.

In both cases the cause is the same: the channel namespace is not configured to persist messages long enough for the use case.

**Confirm:**

- Check the retention setting on the channel namespace in the Ably dashboard. The default in-memory retention covers only the live recovery window (around 2 minutes) and is not suitable for scroll-back.
- Confirm whether your application is meant to read history from Ably alone, or to hydrate from an external store.

**Fix:**

- Enable persistence on the channel namespace and set a retention period that covers your expected scroll-back window. See [history and replay](https://ably.com/docs/ai-transport/features/history.md#faq-retention) for the persistence options.
- For conversations that need to be retained for longer than the channel allows, persist completed turns to your own store and hydrate from it on session load. See [reconnection and recovery](https://ably.com/docs/ai-transport/features/reconnection-and-recovery.md#loading-history) for how reconnect interacts with history.

## Turn never ends

**Symptom:** The streamed message renders with the streaming status forever. `useActiveTurns` shows the turn as active long after the model finished generating.

**Confirm:**

- Check the server logs around the affected turn. Did `turn.end(reason)` execute? If the route handler threw between `streamResponse` and `end`, the turn never closes.
- Inspect the channel in the Ably dashboard. The turn-end lifecycle message should appear after the streamed message's close event. If it is missing, the encoder did not publish it.

**Fix:**

- Wrap streaming work in `try`/`finally` and always call `turn.end()` in the `finally` block. A turn that errored should end with reason `'error'`.
- If you use Next.js `after()`, confirm the callback runs to completion. An unhandled promise rejection inside `after()` aborts the rest of the handler, including `turn.end()`.
- See [turns](https://ably.com/docs/ai-transport/concepts/turns.md#lifecycle) for the full lifecycle contract.

## Cancel doesn't stop the agent

**Symptom:** The client publishes a cancel signal, the cancel message lands on the channel, but the agent keeps streaming tokens until the model finishes naturally.

**Confirm:**

- Check the server handler: is `turn.abortSignal` passed to the LLM call?
- For long-running tools, check whether the tool implementation reads `turn.abortSignal.aborted` and exits when the signal fires.
- If `onCancel` is configured, check that it returns `true` for the cancel request. A hook that returns `false` rejects the cancel silently.

**Fix:**

- Pass `turn.abortSignal` to every LLM call, for example `streamText({ abortSignal: turn.abortSignal, ... })`.
- In server-executed tools, wire `turn.abortSignal` into long-running operations so they exit promptly when the signal fires.
- See [cancellation](https://ably.com/docs/ai-transport/features/cancellation.md#abort-signal) for the full flow.

## Duplicate or unexpected turns

**Symptom:** A single user action produces two turns. The streamed response duplicates, or the user sees two siblings where they expected one. Two common causes:

- React Strict Mode (or a stale `useEffect`) calls `send()` twice in development.
- The user edits or regenerates a message while a previous turn is still streaming. The edit does not cancel the in-progress turn, so both streams run side by side.

**Confirm:**

- Inspect the channel for two turn-start events with different `turnId` values for the same user message.
- Check whether your send path lives inside a `useEffect`. Imperative event handlers (`onClick`, `onSubmit`) are safer.

**Fix:**

- Guard `send()` so it fires once per user action. Avoid placing it inside an effect without a dependency that prevents re-firing.
- Before editing or regenerating, cancel the in-flight turn explicitly. AI Transport does not auto-cancel an active turn on edit; see [conversation branching](https://ably.com/docs/ai-transport/features/branching.md#edit) for the recommended pattern.

## Message too large to publish

**Symptom:** A publish fails with error code `40009` (maximum message length exceeded). Inside a streaming turn this surfaces as a `StreamError` (`104008`) on the server. Tool outputs or model responses that contain large payloads do not reach the channel.
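
The two codes in this symptom can be told apart from other publish failures in a `catch` block. The following is a minimal sketch, not part of the AI Transport API: `classifyPublishFailure`, `errorCodes`, and `CodedError` are hypothetical names, and it assumes that the thrown error exposes a numeric `code` property and may carry the underlying error on `cause`.

```typescript
// Error codes quoted in this entry.
const MAX_MESSAGE_LENGTH_EXCEEDED = 40009;
const STREAM_ERROR = 104008;

// Assumption: errors expose a numeric `code`, and a wrapping error may
// carry the underlying error on `cause`. Both fields are optional here.
interface CodedError {
  code?: number;
  cause?: unknown;
}

type PublishFailure = "message-too-large" | "stream-error" | "other";

// Collect the numeric codes found on an error and its chain of causes.
// `maxDepth` bounds the walk so a cyclic `cause` chain cannot loop forever.
function errorCodes(err: unknown, maxDepth = 5): number[] {
  const codes: number[] = [];
  let current: unknown = err;
  for (
    let depth = 0;
    depth < maxDepth && typeof current === "object" && current !== null;
    depth++
  ) {
    const { code, cause } = current as CodedError;
    if (typeof code === "number") codes.push(code);
    current = cause;
  }
  return codes;
}

// Hypothetical helper: decide whether a failed publish was a size problem.
function classifyPublishFailure(err: unknown): PublishFailure {
  const codes = errorCodes(err);
  if (codes.includes(MAX_MESSAGE_LENGTH_EXCEEDED)) return "message-too-large";
  if (codes.includes(STREAM_ERROR)) return "stream-error";
  return "other";
}
```

A result of `"stream-error"` only says that the streaming turn failed; the server log entry for that turn is still needed to confirm whether message size was the cause.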

**Confirm:**

- Identify the message that failed. Tool results that include binary data, large embeddings, or full document bodies are the usual culprits.
- Check your Ably package's message size limit. The cap is 64 KiB on Free and Standard, 256 KiB on Pro and Enterprise.

**Fix:**

- Stream large tool results across multiple events instead of publishing them in one message.
- Persist large payloads to an external store and send only a reference (URL or ID) over the channel.

## Two devices share a clientId

**Symptom:** Ownership-scoped behaviour misbehaves across two devices for the same user. Cancels intended for one device cancel turns on the other. Presence shows the user appearing and disappearing as both devices update their state.

**Confirm:**

- Inspect the JWT each device receives. If both tokens have the same `x-ably-clientId`, they are indistinguishable to the Ably service.
- Check whether your token-issuing logic generates a unique identifier per device (typically `userId + deviceId`), or only uses `userId`.

**Fix:**

- Assign a unique `clientId` per device or per session for any case where ownership matters. A common pattern is `userId:deviceId` so the user remains identifiable while devices remain distinguishable.
- See [multi-device sessions](https://ably.com/docs/ai-transport/features/multi-device.md) for the model.

## Branch selection out of sync across devices

**Symptom:** Two devices on the same conversation show different responses to the same prompt. Edit history navigation diverges between them.

This is intentional. Branch selection is a per-view, per-device choice: each device navigates the conversation tree independently so a user reviewing alternates on one device does not disrupt another device displaying the same conversation.

**Fix:**

- If branch selection should be shared (for example, a co-pilot where both devices must agree), synchronise the chosen leaf through your own application state, or by sending a message on the channel that the other client can react to and change its selections.
- See [conversation branching](https://ably.com/docs/ai-transport/features/branching.md#branch-navigation) for how branch selection works.

## Reconnect loop

**Symptom:** The client connects, drops, reconnects, drops again. The cycle repeats every few seconds. Channel state never stabilises.

**Confirm:**

- Inspect the Ably connection state on the client (`realtimeClient.connection.state`). A reconnect loop cycles `connecting` → `connected` → `disconnected` rapidly.
- If each `disconnected` event carries a token error, your auth endpoint is returning tokens that Ably is rejecting on every reconnect.

**Fix:**

- Verify the auth endpoint signs tokens with the correct key for the right environment (a dev key against a prod app is a common cause).
- Lengthen token TTL if it is set very short. The SDK refreshes ahead of expiry; very short lifetimes fight the refresh.
- If the token is rejected for capability rather than signing, see [capability or token scope mismatch](#capability-mismatch).

## Agent process crashes mid-stream

**Symptom:** A streamed message stops part-way through and stays in `streaming` status forever. No more tokens arrive and no turn-end event ever fires.

From the client's point of view this is indistinguishable from [Turn never ends](#turn-never-ends) (both leave the turn in `active` state), but the root cause here is the process dying rather than the handler skipping `turn.end()`.

**Confirm:**

- Check server logs for the crashed process around the timestamp of the affected turn. An exception or an OOM kill typically appears there.
- If you cannot see the crash directly, check infrastructure signals: serverless function timeouts that expire before `after()` finishes, container OOM kills, and process restarts are common causes.

**Fix:**

- Add structured error logging around the streaming path so the cause is visible. Common causes are model provider errors, OOM in long tool calls, and serverless function timeouts.
- Decide on the application response: surface a retry control to the user, or auto-retry from the application layer. AI Transport does not retry the LLM call automatically. See [reconnection and recovery](https://ably.com/docs/ai-transport/features/reconnection-and-recovery.md#faq-agent-crash) for the contract.

## Publish fails in suspended state

**Symptom:** A publish from the client returns an error. The connection has been disconnected for more than around two minutes and has moved to `suspended` state.

**Confirm:**

- Inspect the connection state at the moment of the failed publish. A `suspended` connection has lost message continuity, and there is no live connection to the Ably platform. The underlying Ably SDK (`ably-js`) rejects the publish locally because it cannot guarantee ordering against the live stream.

**Fix:**

- Check connection state before publishing user-driven actions, and queue them locally while the connection is not `connected`.
- Flush the queue once the connection returns to `connected`. `ably-js` does not buffer publishes through a `suspended` state because continuity has already been lost.

## When to escalate

If you have worked through the entries on this page and the symptom does not match, or the fix did not work, capture:

- The channel name.
- The date and time of the affected operation.
- The `clientId` of the affected server agent or client device.
- The first error log entry on either side.

[Open a support ticket](https://ably.com/contact) with these.
The Ably side of the system is observable to support; the application side needs the IDs to correlate.

## Related Topics

- [Overview](https://ably.com/docs/ai-transport.md): AI Transport is durable session infrastructure for AI applications. Streams survive reconnects, sessions span devices, and any participant signals any other through the same session.
- [Going to production](https://ably.com/docs/ai-transport/going-to-production.md): Production checklist for Ably AI Transport: limits, monitoring, auth hardening, and the pricing pointer. Shipping AI Transport responsibly.