A Coroutine Is Not a Thread: What Suspends, What Gets Scheduled, and Why It Matters
A coroutine suspends and resumes cooperatively; a thread is preempted by the OS. Here is the real difference in scheduling, memory, and parallelism — and when each one wins.
People use “coroutine” and “thread” as if they were two brands of the same thing, two ways to do many jobs at once. They overlap in purpose but share almost nothing in mechanism. The single sentence that separates them: a coroutine decides when to give up control; a thread has that decision made for it by the operating system. Everything else (memory cost, parallelism, the bugs you hit) falls out of that one line.
What a coroutine actually is
A coroutine is a function that can pause partway through and hand control back to whoever called it, then resume later from exactly where it stopped, with its local variables still intact. A normal function has one exit: you call it, it runs to the end, it returns once. A coroutine has many: it can suspend at a yield or an await, let other code run, and pick up mid-body when it is resumed.
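A Python generator is one of the simplest coroutine forms and gives a rough way to watch suspend-and-resume happen. This is a minimal sketch; the name `running_total` is made up for illustration.

```python
def running_total():
    """Pause at each yield; `total` survives between resumptions."""
    total = 0
    while True:
        # Suspend here and hand `total` back to the caller.
        # Execution resumes on this line when the caller calls send().
        amount = yield total
        total += amount


acc = running_total()
first = next(acc)       # run up to the first yield
second = acc.send(5)    # resume with 5, then pause at the yield again
third = acc.send(10)    # resume with 10; total was kept across the pause
print(first, second, third)  # prints: 0 5 15
```

The caller never re-enters the function from the top: each `send()` continues mid-body, with `total` exactly as the previous pause left it.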
You have already met them under different names. Python’s async def, JavaScript’s async/await, Kotlin’s suspend functions, C#’s async, and Lua’s coroutines are all variations on the same idea. In the stackless flavor used by C# and Kotlin, the compiler rewrites the function into a state machine: each suspension point becomes a state, and the local variables that have to survive the pause are moved onto a small heap object. (Python keeps the paused function’s frame object alive instead, and Lua gives each coroutine its own small stack, but the effect is similar.) That is why a suspended coroutine is cheap: it is just that object sitting in memory, often a few hundred bytes, not a reserved multi-megabyte stack.
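To make the state-machine rewrite described above concrete, here is a hand-written illustration: a two-step generator next to a class that does the same job with an explicit state number and a saved local. The class and helper names (`AddTwoInputsStateMachine`, `drive_generator`) are hypothetical, and real compiler output looks different; this only sketches the shape of the transformation.

```python
def add_two_inputs():
    """Coroutine form: two suspension points, one surviving local."""
    a = yield "need a"
    b = yield "need b"
    return a + b


class AddTwoInputsStateMachine:
    """Hand-written equivalent: explicit state number plus saved locals."""

    def __init__(self):
        self.state = 0   # which suspension point we are parked at
        self.a = None    # local that must survive the pause

    def resume(self, value=None):
        if self.state == 0:          # first entry: run to first suspension
            self.state = 1
            return ("suspended", "need a")
        if self.state == 1:          # resumed with the value for a
            self.a = value
            self.state = 2
            return ("suspended", "need b")
        if self.state == 2:          # resumed with the value for b; finish
            self.state = 3
            return ("done", self.a + value)
        raise RuntimeError("coroutine already finished")


def drive_generator(gen, inputs):
    """Run the generator version, collecting the same (status, value) pairs."""
    trace = [("suspended", next(gen))]
    for value in inputs:
        try:
            trace.append(("suspended", gen.send(value)))
        except StopIteration as finished:
            trace.append(("done", finished.value))
    return trace


machine = AddTwoInputsStateMachine()
machine_trace = [machine.resume(), machine.resume(2), machine.resume(3)]
generator_trace = drive_generator(add_two_inputs(), [2, 3])
print(machine_trace == generator_trace)  # prints: True
```

Everything a paused instance needs is the integer `state` and the field `a`, which is the "small heap object" in miniature.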
The defining word is cooperative. A coroutine runs until it voluntarily suspends. Nothing interrupts it between two statements. If it never hits an await, it never gives anyone else a turn. You are in charge of yielding, which makes the control flow predictable — and, as we will see, also makes one specific mistake very easy.
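The cooperative rule can be observed directly with asyncio. In this sketch (the `worker` and `run_pair` names are assumptions for illustration), two coroutines log three steps each. When they await between steps they take turns; when they never await, the first one runs to completion before the second gets a turn.

```python
import asyncio


async def worker(name, log, cooperative):
    for step in range(3):
        log.append(f"{name}{step}")
        if cooperative:
            # Suspension point: hand control back to the event loop.
            await asyncio.sleep(0)


async def run_pair(cooperative):
    log = []
    await asyncio.gather(
        worker("A", log, cooperative),
        worker("B", log, cooperative),
    )
    return log


interleaved = asyncio.run(run_pair(cooperative=True))
hogged = asyncio.run(run_pair(cooperative=False))
print(interleaved)  # ['A0', 'B0', 'A1', 'B1', 'A2', 'B2']
print(hogged)       # ['A0', 'A1', 'A2', 'B0', 'B1', 'B2']
```

Nothing in the second run is broken in the sense of raising an error; worker B simply waits until worker A has nothing left to do.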
How a thread is different
A thread is an operating-system construct, and the OS schedules it preemptively. A timer interrupt can stop your thread between any two machine instructions and run a different one. You do not cooperate; you get interrupted whether you like it or not. That is the whole game: cooperative suspension versus preemptive interruption.
That difference shows up as cost in three places.
Memory. Every thread carries its own call stack, and that stack is reserved up front. On Linux the default per-thread stack is 8 MB of address space (you can see it with ulimit -s and change it with pthread_attr_setstacksize). Spin up ten thousand threads and you have reserved roughly 80 GB of virtual address space, plus kernel bookkeeping for each thread, even though only the stack pages a thread actually touches consume physical memory. A coroutine has no dedicated stack; its surviving state is that small heap object, so ten thousand suspended coroutines can live in a few megabytes total.
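A back-of-the-envelope comparison, as a sketch: the arithmetic for ten thousand default-sized thread stacks next to a tracemalloc measurement of ten thousand suspended Python generators. The 8 MB figure is the Linux default cited above; the `waiter` function is a made-up stand-in for a parked connection handler, and the measured number will vary by Python version and platform.

```python
import tracemalloc

THREAD_STACK_BYTES = 8 * 1024 * 1024   # assumed default: 8 MB per thread
COUNT = 10_000


def waiter(conn_id):
    """Stand-in for a handler parked while waiting for data."""
    received = yield conn_id   # suspended here until someone resumes it
    return received


tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()

suspended = []
for conn_id in range(COUNT):
    gen = waiter(conn_id)
    next(gen)                  # run to the yield, leaving it suspended
    suspended.append(gen)

after, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

coroutine_bytes = after - before
thread_reservation_bytes = COUNT * THREAD_STACK_BYTES

print(f"{COUNT} suspended generators: {coroutine_bytes / 2**20:.1f} MiB allocated")
print(f"{COUNT} thread stacks:        {thread_reservation_bytes / 2**30:.1f} GiB reserved")
```

The thread figure is reserved address space rather than resident memory, so the comparison overstates the physical cost of threads, but the difference in scale is still the point.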
Switching. Handing the CPU from one thread to another means a trip into the kernel: save registers, swap stacks, update bookkeeping. It lands on the order of a microsecond. Resuming a coroutine is closer to a regular function call — nanoseconds — because it never leaves your process. Same outcome (a different piece of work runs next), wildly different price.
Parallelism. This is the one people get backwards. Threads can run truly in parallel on separate CPU cores — two threads, two cores, two things happening at the same instant. Coroutines on a single thread give you concurrency, not parallelism: they interleave, but only one runs at any moment. To get real parallelism out of coroutines you schedule them across a pool of threads — which is exactly what Go’s runtime and Kotlin’s dispatchers do underneath.
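One way to check the single-thread claim is to ask each coroutine which OS thread it is running on. In this sketch (the `where_am_i` name is hypothetical), a hundred coroutines all report the thread that runs the event loop, while work handed to `asyncio.to_thread` reports a different one.

```python
import asyncio
import threading


async def where_am_i():
    await asyncio.sleep(0)             # suspend once so the coroutines interleave
    return threading.get_ident()       # ID of the thread executing this coroutine


async def main():
    coroutine_threads = await asyncio.gather(*(where_am_i() for _ in range(100)))
    # to_thread runs a plain function on a worker thread from the default pool.
    worker_thread = await asyncio.to_thread(threading.get_ident)
    return set(coroutine_threads), worker_thread


main_thread = threading.get_ident()
coroutine_threads, worker_thread = asyncio.run(main())

print(coroutine_threads == {main_thread})  # prints: True
print(worker_thread != main_thread)        # prints: True
```

A hundred coroutines, one thread: they are concurrent, but at most one of them is executing at any instant.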
When to reach for which
The choice tracks the shape of the work, not personal preference.
Reach for coroutines when the work is I/O-bound and there is a lot of it waiting at once: thousands of open sockets, each idle most of the time. This is the classic C10k situation — serving ten thousand simultaneous connections. With one thread per connection you would reserve tens of gigabytes of stacks and drown the scheduler in context switches. With coroutines, each waiting connection is a cheap suspended object, and the single event loop wakes whichever one just got data. Web servers, chat backends, and API gateways live here.
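A toy version of that workload, as a sketch: ten thousand coroutines each "wait for data" for 0.1 seconds on one event loop. `asyncio.sleep` stands in for an idle socket, and the function names are assumptions for illustration. Run one after another the waits would total about 1,000 seconds; run concurrently they overlap.

```python
import asyncio
import time


async def idle_connection(conn_id, wait_seconds):
    # Stand-in for a connection that is idle until data arrives.
    await asyncio.sleep(wait_seconds)
    return conn_id


async def serve(connection_count, wait_seconds):
    handlers = (idle_connection(i, wait_seconds) for i in range(connection_count))
    # gather returns results in the order the coroutines were passed in.
    return await asyncio.gather(*handlers)


start = time.perf_counter()
results = asyncio.run(serve(10_000, 0.1))
elapsed = time.perf_counter() - start

print(f"{len(results)} connections handled in {elapsed:.2f} s on one thread")
```

The exact elapsed time depends on the machine, so none is asserted here; what matters is that it tracks the length of one wait plus scheduling overhead, not the sum of all of them.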
Reach for threads (or separate processes) when the work is CPU-bound and you want more cores doing it. Coroutines do not add cores; running a heavy computation as a coroutine just means it hogs one thread. If you need four cores grinding through a calculation, you need four threads.
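The "hogs one thread" effect can be sketched with a heartbeat coroutine that ticks every 10 ms while a busy loop runs for 0.3 seconds. The names `spin`, `heartbeat`, and `count_ticks_during` are hypothetical. Run as a coroutine, the computation silences the heartbeat; handed to a worker thread, it leaves the event loop free to keep ticking.

```python
import asyncio
import time


def spin(seconds):
    """CPU-bound stand-in: busy-loop without ever suspending."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        pass


async def heartbeat(ticks, stop):
    """Other work on the event loop: tick every 10 ms until told to stop."""
    while not stop.is_set():
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def count_ticks_during(work):
    ticks, stop = [], asyncio.Event()
    beat = asyncio.create_task(heartbeat(ticks, stop))
    await asyncio.sleep(0)        # let the heartbeat record its first tick
    await work()
    stop.set()
    await beat
    return len(ticks)


async def compute_on_loop():
    spin(0.3)                             # hogs the event-loop thread


async def compute_on_thread():
    await asyncio.to_thread(spin, 0.3)    # event-loop thread stays free


ticks_on_loop = asyncio.run(count_ticks_during(compute_on_loop))
ticks_on_thread = asyncio.run(count_ticks_during(compute_on_thread))
print(ticks_on_loop, ticks_on_thread)
```

One caveat specific to this sketch: in standard CPython builds the global interpreter lock keeps Python bytecode from running on two cores at once, so the thread version shows the loop staying responsive rather than a speed-up. For multi-core number crunching in Python, a process pool is the usual route.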
The line genuinely blurs with Go’s goroutines, which is why they confuse people. A goroutine starts with a tiny ~2 KB stack that grows on demand, and the Go runtime multiplexes many goroutines across a smaller set of OS threads. They are cooperative-ish green threads with real parallelism bolted on: since Go 1.14 the runtime also preempts long-running goroutines, so one goroutine can no longer starve the others the way a naive coroutine can. They are the hybrid, not a counterexample.
Most of the time you do not implement any of this by hand — you reach for the async keyword or the thread pool your language already ships. But the bugs you hit are downstream of the model, so the difference is worth holding in your head. An AI pair-programmer is genuinely useful here: ask it why your event loop stalls and a good one will point straight at the blocking call.