Debugging Occasional ECONNRESET Errors in Node.js: Root Causes and Fixes
ECONNRESET in Node.js usually traces to an idle connection closed by a load balancer or proxy while your keep-alive pool still holds it. Here is how to find the real cause and fix it.
It passes every test on your laptop. It passes in staging. Then production logs a handful of Error: read ECONNRESET lines a day — never the same endpoint twice, never reproducible on demand. You add a retry, the count drops, and you move on without knowing what happened.
That is the worst outcome, because the error is usually telling you something specific about how your connections are managed. Here is what ECONNRESET means, why it clusters around idle connections, and the changes that actually stop it.
What ECONNRESET actually means
ECONNRESET means the peer sent a TCP RST packet. The connection was open and working, and then the other side discarded its half and told your kernel to stop using it. Node surfaces this as an error with code: 'ECONNRESET' and, often, syscall: 'read' — you were waiting to read a response and the socket died underneath you.
It is not the same as the two errors people confuse it with:
| Error | What happened |
|---|---|
| ECONNREFUSED | Nothing accepted the connection: wrong port, or the service is down |
| ETIMEDOUT | The connection or response never completed within the allowed time |
| ECONNRESET | The connection was established, then the peer abruptly killed it |
The word that matters is abruptly. Something on the other end was fine a moment ago and then was not. That narrows the search: you are not looking for a server that is down, you are looking for a connection that got closed while you still thought you owned it.
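To make the distinction concrete, here is a minimal sketch of a helper that pulls the error code out of a failure and maps it to the table above. The names `DIAGNOSES`, `networkErrorCode`, and `diagnose` are hypothetical. It assumes errors from the http module carry `code` directly, while `fetch` typically rejects with a `TypeError` whose `cause` holds the underlying socket error.

```javascript
// Hypothetical lookup that mirrors the table above.
const DIAGNOSES = {
  ECONNREFUSED: 'nothing accepted the connection (wrong port, or the service is down)',
  ETIMEDOUT: 'the connection or response did not complete within the allowed time',
  ECONNRESET: 'the connection was established, then the peer abruptly closed it',
};

// Walk a short `cause` chain and return the first string `code` found.
function networkErrorCode(err) {
  let current = err;
  for (let depth = 0; current && depth < 5; depth += 1) {
    if (typeof current.code === 'string') return current.code;
    current = current.cause;
  }
  return undefined;
}

// Return the code plus a one-line diagnosis, with a fallback for anything else.
function diagnose(err) {
  const code = networkErrorCode(err);
  return {
    code,
    diagnosis: DIAGNOSES[code] ?? 'not one of the three connection errors above',
  };
}
```

Logging the result of `diagnose(err)` next to the request's method and host makes it easier to see whether the resets cluster on one upstream.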
Why it is almost always an idle connection
Modern Node HTTP clients reuse connections. Node’s http.Agent with keepAlive: true (the default for the global agent since Node 19) and undici, which backs globalThis.fetch in Node 18+, both keep TCP sockets in a pool after each response so the next request skips the TCP and TLS handshake. That is good for latency. It is also the most common source of intermittent resets.
A pooled socket can be closed by the other side while it sits idle in your pool. You will not find out until you write the next request onto it. Your client believes the socket is alive; the server or load balancer already sent a FIN or RST; you push bytes into a dead connection and the reset comes back as ECONNRESET.
Almost everything between you and the origin closes idle connections on a timer:
- AWS Application Load Balancers default to a 60-second idle timeout.
- nginx defaults keepalive_timeout to 75 seconds.
- A Node origin server’s server.keepAliveTimeout defaults to 5000 ms.
That gives you a rule that prevents the race: the side that sends requests should give up an idle socket before the side that receives them. When your client holds idle sockets longer than the load balancer does, the load balancer wins the race every time and you eat resets.
The same root cause bites Node servers from the other direction. Node’s 5-second keepAliveTimeout is shorter than an ALB’s 60-second idle timeout, so the ALB keeps a socket it believes is reusable, sends a request into a connection Node already closed, and returns a 502 to your user. Same race, opposite roles.
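The rule can be written down as a small check you run against your own configuration. This is a sketch, not a standard tool: `findTimeoutViolations` and its parameter names are hypothetical, the 30-second client value is an assumed example, and 65 seconds is one possible origin setting above the ALB default.

```javascript
// At each hop, the sender should give up an idle socket before the receiver does.
function findTimeoutViolations({ clientIdleMs, loadBalancerIdleMs, originKeepAliveMs }) {
  const violations = [];
  if (clientIdleMs >= loadBalancerIdleMs) {
    violations.push('client idle timeout should be shorter than the load balancer idle timeout');
  }
  if (loadBalancerIdleMs >= originKeepAliveMs) {
    violations.push('load balancer idle timeout should be shorter than the origin keep-alive timeout');
  }
  return violations;
}

// Defaults quoted above: ALB 60 s, Node origin 5 s. The client value is assumed.
const withDefaults = findTimeoutViolations({
  clientIdleMs: 30_000,
  loadBalancerIdleMs: 60_000,
  originKeepAliveMs: 5_000,
});

// Same setup with the origin raised above the load balancer (assumed value).
const afterFix = findTimeoutViolations({
  clientIdleMs: 30_000,
  loadBalancerIdleMs: 60_000,
  originKeepAliveMs: 65_000,
});

console.log(withDefaults);
console.log(afterFix);
```

With the defaults, the check flags the load balancer to origin hop, which is the 502 case described above.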
Two more causes worth ruling in or out: a burst of ECONNRESET during a rolling deploy is expected, because in-flight connections to terminating instances get reset — if the spike lines up with a deploy timestamp, that is the explanation. And an upstream that was OOM-killed or crashed will reset every connection it held.
Fixes that hold up in production
Align your timeouts. The ordering you want is: client idle timeout is shorter than the load balancer idle timeout, which is shorter than the origin server idle timeout. For a Node server sitting behind an ALB, raise its timeouts above the load balancer’s:
    const http = require('node:http');

    const server = http.createServer(app); // app: your existing request handler
    server.keepAliveTimeout = 65_000; // longer than the ALB 60s idle timeout
    server.headersTimeout = 66_000; // must exceed keepAliveTimeout

For a client behind that same ALB, do the opposite: keep idle sockets for less than 60 seconds so you retire them before the load balancer can.
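For the client side, one way to sketch this with the built-in http.Agent is shown below. The 30-second value is an assumption chosen to sit well under the ALB's 60 seconds. Node documents the agent's `timeout` option as the socket timeout set when the socket is created. The assumption here is that the agent also discards pooled sockets that sit idle past that timeout, which is worth verifying on your Node version. The same timeout also applies to inactivity on in-flight requests, so do not set it below your slowest expected response.

```javascript
const http = require('node:http');

// Assumed value: comfortably below the ALB's 60-second idle timeout.
const CLIENT_IDLE_TIMEOUT_MS = 30_000;

const agent = new http.Agent({
  keepAlive: true, // keep sockets in the pool between requests
  timeout: CLIENT_IDLE_TIMEOUT_MS, // socket inactivity timeout, in milliseconds
});

// Pass the agent on each request, for example:
//   http.get({ host: 'example.internal', path: '/health', agent }, onResponse);
// ('example.internal' and onResponse are placeholders.)
```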
Retry idempotent requests once. A reset on an idle pooled socket almost always means the request never reached the application — it died on the wire before the server read it. For GET, HEAD, PUT, and DELETE, a single retry on a fresh connection is safe and clears the large majority of these errors. Be deliberate with POST: only retry when you can confirm the request never landed, or you risk a duplicate write.
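A minimal sketch of that policy follows. `sendWithOneRetry` and `isConnectionReset` are hypothetical names, and `send` stands for whatever function performs one attempt with your HTTP client and returns a promise. The wrapper retries only the idempotent methods listed above, only on ECONNRESET, and only once.

```javascript
const IDEMPOTENT_METHODS = new Set(['GET', 'HEAD', 'PUT', 'DELETE']);

// http module errors carry `code`; fetch() typically wraps the socket error in `cause`.
function isConnectionReset(err) {
  return err?.code === 'ECONNRESET' || err?.cause?.code === 'ECONNRESET';
}

// `send` is a caller-supplied function that performs exactly one attempt.
async function sendWithOneRetry(method, send) {
  try {
    return await send();
  } catch (err) {
    const retryable = IDEMPOTENT_METHODS.has(method.toUpperCase()) && isConnectionReset(err);
    if (!retryable) throw err;
    return send(); // second and final attempt; a failure here propagates
  }
}
```

The wrapper does not pick the socket itself. It relies on the client's pool to drop the reset socket, so the second attempt usually runs on a different connection. A POST that hits a reset is rethrown untouched, which leaves the duplicate-write decision to the caller.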
Prefer undici’s pool. Node’s built-in fetch is backed by undici, whose connection pool tracks the server’s Keep-Alive response header and recycles sockets more carefully than the legacy http.Agent. If you are still on a hand-rolled agent, moving to undici removes a class of stale-socket bugs.
Tracing one reset through an agent pool, a retry wrapper, and three layers of timeout config means reading across files that rarely sit next to each other. An editor that can follow that path quickly is worth having open.