Why Long-Running AI Agents Break on HTTP, and How Ably's Durable Sessions Fix It

An AI agent that summarizes a paragraph finishes in two seconds. An AI agent that researches a question, calls six tools, and drafts a report can run for four minutes — or forty. The first fits HTTP comfortably. The second fights it the whole way.

Most agent backends are still wired the way web apps have been wired since the 1990s: a client sends a request, the server sends a response, the connection closes. That contract holds because the response usually arrives fast enough that nobody notices the connection was open at all. Long-running agents break the contract. They produce output gradually, they outlive the patience of every proxy between client and server, and they keep working even after the user closes the tab. We dug into why this fails so often, and how Ably’s durable session model is built to absorb it.

Where HTTP runs out of road

HTTP’s request-response cycle assumes a short, bounded exchange. Three things go wrong once an agent runs for minutes instead of milliseconds.

Idle timeouts close the socket. Your connection passes through load balancers, reverse proxies, and CDNs, and each one drops connections that go quiet. An AWS Application Load Balancer closes idle connections after 60 seconds by default. An agent that reasons for 90 seconds before emitting its first token has already lost the socket underneath it.
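
To make the arithmetic concrete, here is a minimal sketch that models a connection's quiet gaps against an idle timeout. It is a simulation rather than real networking code: the function names, the 20-second heartbeat interval, and the idea of measuring time from the moment the request opened are all assumptions made for illustration. In a real stream the heartbeat would typically be an SSE comment line or a WebSocket ping.

```python
from typing import List


def survives_idle_timeout(emit_times: List[float], idle_timeout: float = 60.0) -> bool:
    """Return True if no quiet gap reaches the timeout.

    emit_times are seconds since the request opened, in ascending order.
    The wait before the first byte counts as a gap too.
    """
    previous = 0.0
    for t in emit_times:
        if t - previous >= idle_timeout:
            return False
        previous = t
    return True


def add_heartbeats(emit_times: List[float], interval: float = 20.0) -> List[float]:
    """Merge in keepalive send times so that no gap exceeds `interval` seconds."""
    merged: List[float] = []
    previous = 0.0
    for t in emit_times:
        beat = previous + interval
        while beat < t:
            merged.append(beat)  # hypothetical keepalive frame goes out here
            beat += interval
        merged.append(t)
        previous = t
    return merged


# An agent that thinks for 90 seconds, then emits two tokens.
agent_output = [90.0, 91.0]
without_keepalive = survives_idle_timeout(agent_output)
with_keepalive = survives_idle_timeout(add_heartbeats(agent_output))
```

Under these assumptions the bare stream fails the 60-second check and the heartbeat version passes it. Heartbeats only keep the socket warm, though; they do nothing for a connection that actually drops.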

Streaming is still one fragile pipe. Server-Sent Events and WebSockets hold the connection open, and as long as tokens or keepalive frames keep flowing they stay under the idle timeout, which is why most agent UIs use them today. But the stream is bound to a single TCP connection. When a phone switches from Wi-Fi to cellular, a laptop sleeps, or the server is redeployed mid-task, that connection dies, and every token emitted during the gap is gone. The agent kept running on the server; the client simply stopped hearing it.

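Nothing remembers what was missed. Reopen the connection and you get a fresh stream from that instant forward. Plain HTTP gives you no way to ask which messages arrived between second 30 and second 95. Server-Sent Events can send a Last-Event-ID header on reconnect, but that only helps if the server stored the missed events and knows how to replay them. The protocol itself has no concept of a session that outlives the socket.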

What durable sessions actually mean

Ably’s approach is to stop treating the session and the connection as the same object. A durable session is a logical channel that lives on the server; the WebSocket connection is just a temporary attachment to it. Three mechanisms make that work.

Decoupled lifecycle. The agent publishes to a channel, not to a socket. The session exists whether or not a client is currently listening. The user can shut the laptop, the agent keeps running, and the messages wait on the channel.

Message persistence and replay. Every message gets an ID and is retained for a configurable window. Ably’s history and rewind features let a reconnecting client ask for everything since a given message ID and receive the gap in order — no tokens lost, no duplicates inserted.
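
The replay idea can be sketched with an in-memory log. This is not Ably's API or implementation: `ReplayLog`, `publish`, `replay_since`, and the count-based retention limit are hypothetical names and choices, picked only to show the shape of "retain with IDs, then serve the gap on reconnect".

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ReplayLog:
    """In-memory stand-in for a channel that retains messages for replay."""

    max_retained: int = 1000  # assumption: retention by count, not by time
    next_id: int = 1
    messages: List[Tuple[int, str]] = field(default_factory=list)

    def publish(self, data: str) -> int:
        """Store a message under the next sequential ID and return that ID."""
        msg_id = self.next_id
        self.next_id += 1
        self.messages.append((msg_id, data))
        # Drop the oldest entries once the retention limit is exceeded.
        overflow = len(self.messages) - self.max_retained
        if overflow > 0:
            del self.messages[:overflow]
        return msg_id

    def replay_since(self, last_seen_id: int) -> List[Tuple[int, str]]:
        """Return every retained message newer than last_seen_id, in order."""
        oldest = self.messages[0][0] if self.messages else self.next_id
        if last_seen_id + 1 < oldest:
            # Part of the gap has already been discarded; fail loudly
            # instead of silently returning an incomplete replay.
            raise LookupError("gap is older than the retention window")
        return [(i, d) for i, d in self.messages if i > last_seen_id]


log = ReplayLog(max_retained=3)
for token in ["a", "b", "c", "d"]:
    log.publish(token)
missed = log.replay_since(2)  # client last saw message 2
```

Raising when the gap predates the retention window is a deliberate choice in this sketch: a client that believes it has the full stream but does not is worse off than one that knows it must refetch the result.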

Connection state recovery. When a client reconnects inside the recovery window — roughly two minutes by default — Ably restores the prior connection state and resumes delivery from the last message the client acknowledged. To the application, the interruption never happened.

Presence sits alongside these three: the server can see whether a human is currently attached, so an agent can decide whether to stream every token or just checkpoint its progress and notify the user later.

Patterns for infrastructure that survives a dropped connection

You don’t need Ably specifically to apply the ideas, but you do need to design for them on purpose.

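Give every message a monotonic ID. Ordering and gap detection are impossible without one, and gap detection in particular is simplest when the IDs form a gapless per-session sequence (1, 2, 3, ...), so a jump from 7 to 9 is unambiguous. The client tracks the last ID it processed, and reconnect logic replays from there.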
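
On the client side, the bookkeeping might look like the sketch below. It assumes IDs are contiguous integers starting at 1, which is stricter than merely monotonic; `GapDetector` and its return values are hypothetical names used for illustration.

```python
from typing import List


class GapDetector:
    """Tracks the last processed ID and classifies each incoming message."""

    def __init__(self, last_id: int = 0) -> None:
        self.last_id = last_id
        self.output: List[str] = []

    def accept(self, msg_id: int, data: str) -> str:
        """Apply a message if it is the next one in sequence.

        Returns "applied", "duplicate", or "gap". On "gap" the caller
        should request a replay of everything after self.last_id.
        """
        if msg_id <= self.last_id:
            return "duplicate"  # already processed, safe to ignore
        if msg_id != self.last_id + 1:
            return "gap"  # something in between was missed
        self.output.append(data)
        self.last_id = msg_id
        return "applied"


client = GapDetector()
results = [
    client.accept(1, "a"),
    client.accept(1, "a"),  # redelivered after a reconnect
    client.accept(3, "c"),  # message 2 was lost in the gap
    client.accept(2, "b"),  # arrives via replay
    client.accept(3, "c"),
]
```

The out-of-order message is rejected rather than buffered, which keeps the sketch short; a real client might hold it until the replay fills the hole.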

Make the session the unit of work, not the request. Store run state — current step, tool calls, partial output — keyed by a session ID the client holds. Reconnection re-attaches to that ID; it never re-submits the prompt and never starts the agent over.
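
A rough sketch of that separation, with an in-memory dictionary standing in for a real database. `AgentRun`, `SessionStore`, `start`, and `attach` are hypothetical names; the fields mirror the run state listed above.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AgentRun:
    """Run state that must outlive any single connection."""

    prompt: str
    step: int = 0
    tool_calls: List[str] = field(default_factory=list)
    partial_output: str = ""
    status: str = "running"


class SessionStore:
    """In-memory stand-in for a durable store keyed by session ID."""

    def __init__(self) -> None:
        self._runs: Dict[str, AgentRun] = {}

    def start(self, prompt: str) -> str:
        """Create a run exactly once and hand the client its session ID."""
        session_id = uuid.uuid4().hex
        self._runs[session_id] = AgentRun(prompt=prompt)
        return session_id

    def attach(self, session_id: str) -> AgentRun:
        """Re-attach to an existing run; never creates or restarts one."""
        return self._runs[session_id]


store = SessionStore()
session_id = store.start("Research topic X and draft a report")

# The agent makes progress while the client is away.
run = store.attach(session_id)
run.step = 3
run.tool_calls.append("web_search")

# The client reconnects with the same ID and sees the same run.
resumed = store.attach(session_id)
```

The point of the split is that only `start` takes a prompt. A reconnecting client has nothing to resubmit, so it cannot accidentally launch a second run.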

Guard every side effect. Even with clean resume logic, a tool call that fires twice should not double-charge a card or send two emails. Put an idempotency key on each external action.
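
One way to sketch the guard is a small wrapper that remembers results by key. `IdempotentExecutor` and the key format are assumptions for illustration, and the dictionary stands in for a durable store; a production version would also need the check-and-record step to be atomic, for example through a unique constraint in a database.

```python
from typing import Any, Callable, Dict, List


class IdempotentExecutor:
    """Runs each keyed action at most once and replays the stored result."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}

    def run(self, key: str, action: Callable[[], Any]) -> Any:
        if key in self._results:
            # The action already ran; return its result without re-executing.
            return self._results[key]
        result = action()
        self._results[key] = result
        return result


charges: List[int] = []


def charge_card() -> str:
    """Hypothetical side effect: records a charge of 500 cents."""
    charges.append(500)
    return "charge-ok"


executor = IdempotentExecutor()
key = "run-42:step-3:charge"  # hypothetical key: run ID, step, action name
first = executor.run(key, charge_card)
second = executor.run(key, charge_card)  # retried after a resume
```

Deriving the key from the run and step, rather than generating a random one per attempt, is what lets a retry after resume land on the same entry.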

Separate “the agent finished” from “the client got the result.” Persist the final output, and treat delivery as its own retryable step. An agent that completes while the user is offline should still deliver when they return.
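
A minimal sketch of treating delivery as its own step, again with in-memory state standing in for a database. `ResultOutbox`, `record_finished`, and `try_deliver` are hypothetical names, and `send` is assumed to be any callable that returns True once the client has received the output.

```python
from typing import Callable, Dict, List, Set


class ResultOutbox:
    """Keeps finished results until delivery to the client succeeds."""

    def __init__(self) -> None:
        self._final: Dict[str, str] = {}
        self._delivered: Set[str] = set()

    def record_finished(self, session_id: str, output: str) -> None:
        """Persist the final output, whether or not anyone is listening."""
        self._final[session_id] = output

    def try_deliver(self, session_id: str, send: Callable[[str], bool]) -> bool:
        """Attempt delivery; safe to call repeatedly until it returns True."""
        if session_id in self._delivered:
            return True  # already delivered, do not send again
        output = self._final.get(session_id)
        if output is None:
            return False  # the agent has not finished yet
        if send(output):
            self._delivered.add(session_id)
            return True
        return False


received: List[str] = []


def offline_send(output: str) -> bool:
    return False  # client is disconnected


def online_send(output: str) -> bool:
    received.append(output)
    return True


outbox = ResultOutbox()
outbox.record_finished("session-1", "final report")
while_offline = outbox.try_deliver("session-1", offline_send)
after_return = outbox.try_deliver("session-1", online_send)
repeat_call = outbox.try_deliver("session-1", online_send)
```

Because the finished result and the delivered flag are stored separately, a failed delivery costs nothing but a retry, and a repeated call after success does not send the report twice.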

Done together, these patterns turn a dropped connection from a lost task into a resumable one — the difference between an agent demo and an agent users trust with a forty-minute job.

Common questions


Can't I just use WebSockets and handle reconnection myself?

You can, and for short sessions it is fine. What you sign up for is message persistence, gap detection, ordered replay, and connection state recovery — the same problem set every durable messaging platform has already solved. Build it yourself only if your requirements are unusual enough to justify owning that code.

Does any of this matter if my agent finishes in under 30 seconds?

Much less. Below a typical 60-second proxy idle timeout, a plain streaming response usually survives intact. Durable sessions earn their place when runs are long, when clients sit on mobile networks, or when the agent must keep working after the user disconnects.

Should agent run state live in Ably or in my own database?

In your database. Ably handles message transport and delivery durability; it is not a system of record for agent state. Persist the run's step, tool history, and output yourself, then use the channel to stream and replay updates about it.
