Debugging Occasional ECONNRESET Errors in Node.js: Root Causes and Fixes
ECONNRESET in Node.js usually traces to an idle connection closed by a load balancer or proxy while your keep-alive pool still holds it. Here is how to find the real cause and fix it.
It passes every test on your laptop. It passes in staging. Then production logs a handful of Error: read ECONNRESET lines a day — never the same endpoint twice, never reproducible on demand. You add a retry, the count drops, and you move on without knowing what happened.
That is the worst outcome, because the error is usually telling you something specific about how your connections are managed. Here is what ECONNRESET means, why it clusters around idle connections, and the changes that actually stop it.
What ECONNRESET actually means
ECONNRESET means the peer sent a TCP RST packet. The connection was open and working, and then the other side discarded its half and told your kernel to stop using it. Node surfaces this as an error with code: 'ECONNRESET' and, often, syscall: 'read' — you were waiting to read a response and the socket died underneath you.
It is not the same as the two errors people confuse it with:
| Error | What happened |
|---|---|
| ECONNREFUSED | Nothing accepted the connection: wrong port, or the service is down |
| ETIMEDOUT | The connection or response never completed within the allowed time |
| ECONNRESET | The connection was established, then the peer abruptly killed it |
The word that matters is abruptly. Something on the other end was fine a moment ago and then was not. That narrows the search: you are not looking for a server that is down, you are looking for a connection that got closed while you still thought you owned it.
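To make the distinction concrete, here is a minimal sketch of a helper that maps an error to the table above. The names `explainNetworkError` and `findErrorCode` are hypothetical. It assumes the error either carries a `code` itself (as `node:http` errors do) or wraps the socket error in a `cause` chain, which is what the built-in `fetch` typically does with its `TypeError: fetch failed`.

```javascript
// Meanings taken from the table above.
const MEANINGS = {
  ECONNREFUSED: 'nothing accepted the connection',
  ETIMEDOUT: 'the connection or response never completed in time',
  ECONNRESET: 'the connection was established, then the peer abruptly closed it',
};

// Walk the `cause` chain (bounded, to avoid cycles) and return the first
// string `code` found, or undefined if there is none.
function findErrorCode(err) {
  for (let e = err, depth = 0; e && depth < 5; e = e.cause, depth++) {
    if (typeof e.code === 'string') return e.code;
  }
  return undefined;
}

// Hypothetical helper: attach a human-readable meaning for logging.
function explainNetworkError(err) {
  const code = findErrorCode(err);
  const meaning = MEANINGS[code] ?? 'not one of the three errors in the table';
  return { code, meaning };
}
```

Logging the result next to the endpoint and timestamp makes it easier to tell a reset from a refusal or a timeout when scanning production logs.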
Why it is almost always an idle connection
Modern Node HTTP clients reuse connections. Node’s http.Agent with keepAlive: true (the default for the global agent since Node 19) and undici, which backs globalThis.fetch in Node 18+, both keep TCP sockets in a pool after each response so the next request skips the TCP and TLS handshake. That is good for latency. It is also the most common source of intermittent resets.
A pooled socket can be closed by the other side while it sits idle in your pool. You will not find out until you write the next request onto it. Your client believes the socket is alive; the server or load balancer already sent a FIN or RST; you push bytes into a dead connection and the reset comes back as ECONNRESET.
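One way to check whether a given reset fits this pattern is to record whether the failed request went out on a reused socket. The sketch below relies on `request.reusedSocket` from `node:http`. The helper names `describeFailure` and `getWithDiagnostics` are hypothetical, and so is the `likelyStaleSocket` heuristic.

```javascript
const http = require('node:http');

// Summarize a failed request. `likelyStaleSocket` is a heuristic, not proof:
// a reset on a reused socket is consistent with the idle-close race.
function describeFailure(req, err) {
  const reusedSocket = req.reusedSocket === true;
  return {
    code: err.code,
    syscall: err.syscall,
    reusedSocket,
    likelyStaleSocket: err.code === 'ECONNRESET' && reusedSocket,
  };
}

// Hypothetical wiring: issue a GET through a keep-alive agent and report
// failures to `onFailure` (for example, a logger).
function getWithDiagnostics(url, agent, onFailure) {
  const req = http.get(url, { agent }, (res) => res.resume());
  req.on('error', (err) => onFailure(describeFailure(req, err)));
  return req;
}
```

If most logged resets show `reusedSocket: true`, the idle-connection explanation is the likely one. Resets on fresh sockets point elsewhere.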
Almost everything between you and the origin closes idle connections on a timer:
- AWS Application Load Balancers default to a 60-second idle timeout.
- nginx defaults keepalive_timeout to 75 seconds.
- A Node origin server’s server.keepAliveTimeout defaults to 5000 ms.
That gives you a rule that prevents the race: the side that sends requests should give up an idle socket before the side that receives them. When your client holds idle sockets longer than the load balancer does, the load balancer wins the race every time and you eat resets.
The same root cause bites Node servers from the other direction. Node’s 5-second keepAliveTimeout is shorter than an ALB’s 60-second idle timeout, so the ALB keeps a socket it believes is reusable, sends a request into a connection Node already closed, and returns a 502 to your user. Same race, opposite roles.
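The rule can be written as a small check. This is a sketch with a hypothetical function name and parameter names. It encodes only the ordering described above: each hop that sends requests should drop idle sockets before the hop that receives them.

```javascript
// Return a list of human-readable violations of the idle-timeout ordering:
// client < load balancer < origin. An empty list means the ordering holds.
function findTimeoutViolations({ clientIdleMs, lbIdleMs, originIdleMs }) {
  const violations = [];
  if (clientIdleMs >= lbIdleMs) {
    violations.push('client idle timeout should be shorter than the load balancer idle timeout');
  }
  if (lbIdleMs >= originIdleMs) {
    violations.push('load balancer idle timeout should be shorter than the origin keepAliveTimeout');
  }
  return violations;
}

// Node's default 5000 ms keepAliveTimeout behind an ALB's 60-second default.
// The 30-second client value is an assumed example.
const defaults = findTimeoutViolations({
  clientIdleMs: 30_000,
  lbIdleMs: 60_000,
  originIdleMs: 5_000,
});
console.log(defaults); // one violation: the load balancer outlives the origin
```

Running a check like this against your real configuration values is a cheap way to spot a hop that holds idle sockets longer than the one behind it.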
Two more causes worth ruling in or out: a burst of ECONNRESET during a rolling deploy is expected, because in-flight connections to terminating instances get reset — if the spike lines up with a deploy timestamp, that is the explanation. And an upstream that was OOM-killed or crashed will reset every connection it held.
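Checking resets against deploy timestamps can be done with a few lines. This is a sketch only: the function name, the five-minute window, and the sample timestamps are all assumptions for illustration, not data from a real system.

```javascript
// Split reset timestamps into those that fall within `windowMs` after a
// deploy started and those that do not. Inputs are ISO 8601 strings.
function splitByDeployWindow(resetTimes, deployTimes, windowMs = 5 * 60_000) {
  const deploys = deployTimes.map((t) => new Date(t).getTime());
  const duringDeploy = [];
  const unexplained = [];
  for (const t of resetTimes) {
    const ts = new Date(t).getTime();
    const nearDeploy = deploys.some((d) => ts >= d && ts - d <= windowMs);
    (nearDeploy ? duringDeploy : unexplained).push(t);
  }
  return { duringDeploy, unexplained };
}

// Made-up sample data.
const sample = splitByDeployWindow(
  ['2024-01-01T10:02:00Z', '2024-01-01T14:00:00Z'],
  ['2024-01-01T10:00:00Z']
);
console.log(sample.unexplained); // resets that still need an explanation
```

Whatever lands in `unexplained` is the set worth investigating for the idle-socket race or an upstream crash.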
Fixes that hold up in production
Align your timeouts. The ordering you want is: client idle timeout is shorter than the load balancer idle timeout, which is shorter than the origin server idle timeout. For a Node server sitting behind an ALB, raise its timeouts above the load balancer’s:
```javascript
const http = require('node:http');

// `app` is your request handler (for example, an Express app).
const server = http.createServer(app);
server.keepAliveTimeout = 65_000; // longer than the ALB 60s idle timeout
server.headersTimeout = 66_000; // must exceed keepAliveTimeout
```

For a client behind that same ALB, do the opposite: keep idle sockets for less than 60 seconds so you retire them before the load balancer can.
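On the client side, a minimal sketch with the built-in `http.Agent` might look like the following. The 10-second safety margin and the helper name `clientIdleTimeoutMs` are assumptions. The agent's `timeout` option sets the socket timeout. In current Node versions the agent appears to destroy pooled sockets that hit it while idle, but verify that against the Node version you run.

```javascript
const http = require('node:http');

const LB_IDLE_TIMEOUT_MS = 60_000; // ALB default idle timeout
const SAFETY_MARGIN_MS = 10_000; // assumed margin; tune for your environment

// Pick a client idle timeout that stays below the load balancer's.
function clientIdleTimeoutMs(lbIdleMs, marginMs) {
  return Math.max(1_000, lbIdleMs - marginMs);
}

// Keep-alive agent whose sockets time out before the load balancer drops them.
const agent = new http.Agent({
  keepAlive: true,
  timeout: clientIdleTimeoutMs(LB_IDLE_TIMEOUT_MS, SAFETY_MARGIN_MS),
});

// Usage: pass the agent explicitly, e.g. http.get(url, { agent }, callback).
```

The same `timeout` value also acts as an inactivity timeout on in-flight requests, which emit a `'timeout'` event, so choose a value that your slowest legitimate response can live with.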
Retry idempotent requests once. A reset on an idle pooled socket almost always means the request never reached the application — it died on the wire before the server read it. For GET, HEAD, PUT, and DELETE, a single retry on a fresh connection is safe and clears the large majority of these errors. Be deliberate with POST: only retry when you can confirm the request never landed, or you risk a duplicate write.
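A single-retry wrapper can be sketched as below. The names `withSingleResetRetry` and `isConnectionReset` are hypothetical. `attempt` is any function that performs the request and returns a promise, and the retry goes back through whatever connection pool `attempt` uses.

```javascript
// Methods the paragraph above treats as safe to retry.
const IDEMPOTENT_METHODS = new Set(['GET', 'HEAD', 'PUT', 'DELETE']);

// True if the error, or anything in its bounded `cause` chain, is ECONNRESET.
function isConnectionReset(err) {
  for (let e = err, depth = 0; e && depth < 5; e = e.cause, depth++) {
    if (e.code === 'ECONNRESET') return true;
  }
  return false;
}

// Run `attempt`; on ECONNRESET for an idempotent method, run it once more.
// Any other error, or any non-idempotent method, is rethrown untouched.
async function withSingleResetRetry(method, attempt) {
  try {
    return await attempt();
  } catch (err) {
    const retryable =
      IDEMPOTENT_METHODS.has(method.toUpperCase()) && isConnectionReset(err);
    if (!retryable) throw err;
    return attempt(); // exactly one retry; a second failure propagates
  }
}

// Usage (URL is a placeholder):
// const res = await withSingleResetRetry('GET', () => fetch('https://api.example.com/items'));
```

Keeping the retry count at one is deliberate: if a fresh attempt also resets, the cause is probably not a stale idle socket, and the error should surface.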
Prefer undici’s pool. Node’s built-in fetch is backed by undici, whose connection pool tracks the server’s Keep-Alive response header and recycles sockets more carefully than the legacy http.Agent. If you are still on a hand-rolled agent, moving to undici removes a class of stale-socket bugs.
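To illustrate what tracking the Keep-Alive header involves, here is a sketch of turning a hint such as `Keep-Alive: timeout=5, max=100` into a client-side idle limit. This is an illustration of the idea, not undici's actual code. The function name and the 1-second safety margin are assumptions.

```javascript
// Parse the `timeout=<seconds>` parameter from a Keep-Alive header value and
// return how long the client could safely keep the socket idle, in ms.
// Returns null when the header carries no timeout hint.
function keepAliveHintMs(headerValue, safetyMarginMs = 1_000) {
  if (typeof headerValue !== 'string') return null;
  const match = /(?:^|[,\s])timeout=(\d+)/i.exec(headerValue);
  if (!match) return null;
  const hintMs = Number(match[1]) * 1_000;
  // Retire the socket slightly before the server says it will.
  return Math.max(0, hintMs - safetyMarginMs);
}

console.log(keepAliveHintMs('timeout=5, max=100')); // 4000
```

Subtracting a margin from the server's hint follows the same rule as the timeout ordering: the requesting side lets go first.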