Cursor SDK Review: Building AI Agents With Known Limitations
Cursor's new SDK exposes the same agent runtime that powers the editor. We break down what ships, where the documentation lags, and when the limitations matter for production code.
Cursor’s new SDK lets you write agents that run on the same engine powering the editor’s chat and agent mode. That’s the pitch. The reality, based on early developer reactions, is more measured: the runtime works, the primitives are real, but the gaps are visible enough that you’ll want to know what you’re walking into before you commit a project to it.
We spent time pulling apart what the SDK actually does, where developers are hitting walls, and which workflows are realistic today versus which need to wait for the next few releases.
What the Cursor SDK Actually Ships
The headline feature is access to the same agent runtime Cursor uses internally. When you call the SDK, you’re not bolting onto a generic LLM client — you’re using the loop that handles file edits, tool calls, and multi-step reasoning inside the editor. For developers building custom agent workflows, that matters. You inherit the patterns Cursor has already debugged: how to scope edits, how to verify changes, how to keep an agent from looping off into ten useless steps.
The SDK exposes the core agent loop programmatically. You can spin up an agent with a prompt, give it access to tools, and let it run. The tooling story includes file system access, shell commands, and the same model-routing logic the editor uses to pick between Claude, GPT, and other providers depending on the task.
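In code, the shape is roughly the following. This is a minimal sketch, not the SDK’s documented surface: the package name, `createAgent`, and `run` are illustrative stand-ins for whatever the real exports are called.

```typescript
// Hypothetical sketch: createAgent, run, and the option names are
// illustrative stand-ins, not confirmed exports of the Cursor SDK.
import { createAgent } from "@cursor/sdk"; // assumed package name

const agent = createAgent({
  model: "auto",                  // let Cursor's routing pick the provider
  tools: ["filesystem", "shell"], // the built-in tool surface described above
  cwd: "/path/to/repo",           // scope file edits to one checkout
});

// One call kicks off the loop; the runtime handles tool calls, file
// edits, and multi-step reasoning until the task resolves or stalls.
const result = await agent.run("Rename the config loader and update all call sites.");
console.log(result.summary);
```

Whatever the real names turn out to be, the important property is the one above: a single entry point that owns the whole loop, rather than a chat endpoint you orchestrate by hand.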
Where it gets interesting is what this unlocks outside the editor. Teams have started experimenting with agents that run on CI to fix lint errors, agents that triage incoming GitHub issues by reading the codebase, and headless agents that answer repo questions in Slack. Anything you could imagine wanting Cursor to do but didn’t want to do from inside the editor — that’s the rough scope.
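The CI case makes the headless shape concrete. A hedged sketch, reusing the hypothetical `createAgent` API from above:

```typescript
// Hypothetical CI step: run a headless agent against the checked-out
// repo to fix lint errors; a later pipeline step commits the diff.
// createAgent/run and the result shape are illustrative, not confirmed.
import { createAgent } from "@cursor/sdk";

async function fixLintInCi(): Promise<void> {
  const agent = createAgent({
    model: "auto",
    tools: ["filesystem", "shell"],
    cwd: process.cwd(), // the CI checkout directory
  });

  const result = await agent.run(
    "Run the project's lint command, fix every reported error, then re-run lint to verify."
  );

  // Fail the job if the agent couldn't converge, so a human takes over.
  if (!result.success) process.exit(1);
}

fixLintInCi();
```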
The Limitations Developers Are Hitting
The early feedback is consistent on a few points, and they’re worth taking seriously before you build production workflows on top of the SDK.
Documentation lags the surface area. This is the most common complaint. The SDK ships with reference docs, but the patterns for non-trivial use cases — long-running agents, multi-agent orchestration, custom tool registration — are sparse. Developers report figuring out the right shape of things by reading source, hitting errors, and adjusting. That’s workable for prototypes; it’s harder to justify for production code that needs to survive an SDK version bump.
Pricing model uncertainty for SDK calls. Running agents via the SDK consumes Cursor’s backend compute, and how that gets billed at scale is not fully spelled out for high-volume use. Teams running thousands of agent invocations a day are asking the questions you’d expect — flat-rate versus usage-based, what counts as a billable step, how this interacts with existing Cursor seats. Until the pricing matrix is clearer, capacity planning is a guess.
Statefulness is limited. Agents in the SDK are largely stateless between runs. If you want an agent that remembers prior conversations with a user, learns from past edits, or maintains long-running context, you’re building that layer yourself on top. The runtime is happy to be embedded in a stateful system, but it doesn’t provide one.
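What that layer looks like in practice is ordinary persistence around stateless runs. A minimal sketch, again with hypothetical SDK names and a deliberately naive JSON file as the store:

```typescript
// Sketch of a memory layer over stateless agent runs: persist each
// user's transcript and replay it as context on the next invocation.
// createAgent/run and result.output are hypothetical stand-ins.
import { readFile, writeFile } from "node:fs/promises";
import { createAgent } from "@cursor/sdk";

async function runWithMemory(userId: string, prompt: string): Promise<string> {
  const statePath = `./state/${userId}.json`;
  let history: string[] = [];
  try {
    history = JSON.parse(await readFile(statePath, "utf8"));
  } catch {
    // First run for this user: no prior transcript exists yet.
  }

  const agent = createAgent({ model: "auto", tools: ["filesystem"] });
  // Prepend prior turns so the stateless runtime sees the conversation.
  const result = await agent.run([...history, prompt].join("\n\n"));

  history.push(prompt, result.output);
  await writeFile(statePath, JSON.stringify(history));
  return result.output;
}
```

Swap the JSON file for a database and add transcript truncation and you have the real version; the point is that none of it ships in the box.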
Error surfaces are coarse. When an agent fails, the failure modes you get back are not always granular enough to retry intelligently. A timeout reads the same as a tool error reads the same as a model refusal in some paths. Developers building reliable systems are adding their own observability layers on top.
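Until the error surfaces improve, teams are imposing their own failure taxonomy from the outside. A sketch of that pattern (the message-matching heuristics below are illustrative guesses, not documented error shapes):

```typescript
// Retry wrapper that classifies failures heuristically, since the
// SDK's errors don't reliably distinguish these cases. The string
// matching is a guess at error text, not a documented contract.
type FailureKind = "timeout" | "tool_error" | "refusal" | "unknown";

function classify(err: unknown): FailureKind {
  const msg = err instanceof Error ? err.message.toLowerCase() : "";
  if (msg.includes("timeout")) return "timeout";
  if (msg.includes("tool")) return "tool_error";
  if (msg.includes("refus")) return "refusal";
  return "unknown";
}

async function runWithRetry<T>(task: () => Promise<T>, maxAttempts = 3): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      const kind = classify(err);
      console.error(`attempt ${attempt} failed: ${kind}`);
      // Only transient-looking failures are worth retrying;
      // a model refusal will usually just refuse again.
      if (kind === "refusal" || attempt >= maxAttempts) throw err;
    }
  }
}
```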
When the SDK Is the Right Choice
Despite the gaps, there are workflows where the SDK is the strongest option you have right now.
If you already pay for Cursor and your team has built intuition around how its agents behave, the SDK lets you take that intuition outside the editor with no translation cost. That’s a real advantage over wiring up an agent loop from scratch in your own framework. You skip the part where you spend two weeks getting tool calls and file edits to behave reasonably.
If you’re building developer-facing automation — issue triage, code review bots, automated refactors, doc generation — the agent loop is the hard part, and the SDK gives it to you. The limitations above hurt less in batch-processing contexts where you can retry, log, and intervene manually.
If you’re building consumer-facing AI features or anything mission-critical, the limitations probably still outweigh the benefits. Build on a lower-level SDK with explicit state management and observability, and use Cursor as your IDE rather than your agent runtime.
Cursor
The AI-first code editor whose SDK now exposes the same agent runtime to developers building custom workflows. Most useful if your team already uses Cursor and wants its agent behavior outside the editor.
Free tier; Pro at $20/mo
How It Compares to Building From Scratch
The implicit competition for the Cursor SDK is rolling your own agent loop on top of Claude or GPT directly. That route gives you total control: state, observability, prompts, tool definitions, retry logic. It also means you own all of it. The first month of any homegrown agent project is mostly fighting the same battles Cursor has already fought.
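To make the trade concrete, this is the skeleton you end up owning if you go direct: a bare tool-call loop over a raw model API. `ModelClient` stands in for whichever provider client you wire up; everything else is yours.

```typescript
// Skeleton of a homegrown agent loop. The model client is abstract;
// tool dispatch, retries, and stop conditions are all yours to own.
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelTurn = { text: string; toolCalls: ToolCall[] };

interface ModelClient {
  chat(history: string[]): Promise<ModelTurn>;
}

type Tool = (args: Record<string, unknown>) => Promise<string>;

async function agentLoop(
  llm: ModelClient,
  tools: Record<string, Tool>, // e.g. readFile, runShell; each hand-hardened
  prompt: string,
  maxSteps = 10
): Promise<string> {
  const history = [prompt];
  for (let step = 0; step < maxSteps; step++) {
    const turn = await llm.chat(history);
    if (turn.toolCalls.length === 0) return turn.text; // model declared itself done
    for (const call of turn.toolCalls) {
      const tool = tools[call.name];
      const output = tool ? await tool(call.args) : `unknown tool: ${call.name}`;
      history.push(`tool ${call.name} => ${output}`);
    }
  }
  return "stopped: step budget exhausted"; // the runaway guard you also own
}
```

Every line of that (the step budget, the unknown-tool fallback, the history format) is a decision the SDK has already made for you, for better or worse.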
The SDK trades that control for a working starting point. You give up the ability to tune every detail of the loop; you get back the time you’d otherwise spend tuning it. Whether that trade is worth it depends on how much your agent’s behavior needs to diverge from the patterns Cursor has standardized on.
For most teams shipping an internal tool — not a product — the SDK gets you to a working agent faster than building from primitives. For teams shipping AI as a product, the answer is usually still to build it yourself and treat Cursor as one of several reference implementations.