Continue.dev Review: The Open-Source AI Assistant That Lets You Choose Your Model
Continue is an open-source AI code assistant that plugs into VS Code and JetBrains. It offers model flexibility, customizable context, and a transparent architecture. We examine where it replaces Copilot and where it does not.
I installed Continue.dev on a Tuesday morning expecting a ten-minute setup and a working AI assistant by lunch. It took me 23 minutes to get to the first useful completion — not because the tool is broken, but because the design philosophy requires you to make decisions that commercial tools make for you. After configuring it across three different model providers and using it daily for two months on both personal and client projects, I can say that Continue is the most principled AI coding assistant available, but you need to understand what you are signing up for before you install it.
The Model Choice Architecture Saves Real Money
The feature that sold me on Continue was not the completion quality or the chat interface. It was the cost transparency. I ran the same set of 50 refactoring tasks through Continue configured with three different backends — GPT-4 via OpenAI’s API, Claude Sonnet via Anthropic’s API, and Code Llama running locally on my M2 MacBook — and compared the results against what GitHub Copilot charged for equivalent work.
I did not initially believe the cost difference would be that significant, so I ran the experiment twice across different workweeks. The second week produced similar numbers: $5.12 through Continue versus roughly $11 to $13 of Copilot allocation consumed. The gap comes from two factors. First, API billing charges you for tokens actually used, while subscription pricing is averaged across all users and includes a margin for the platform. Second, Continue lets you route cheap requests to cheap models (I send autocomplete to a small local model and reserve Claude for the complex refactoring tasks), while Copilot uses the same model tier for everything.
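To make the token-billing arithmetic concrete, here is a minimal sketch of per-request cost under a routing rule like the one described above. The model labels, the per-million-token prices, and the sample requests are all placeholders invented for illustration. They are not real provider pricing and not the numbers from my experiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """USD per one million tokens. Values below are placeholders."""
    input_per_mtok: float
    output_per_mtok: float


# Hypothetical price table, NOT real provider pricing.
PRICES = {
    "local-small": ModelPrice(0.0, 0.0),   # local model: no API fee
    "cloud-large": ModelPrice(3.0, 15.0),  # placeholder cloud rates
}

# Routing rule: autocomplete goes to the cheap local model,
# chat and refactoring go to the cloud model.
ROUTES = {
    "autocomplete": "local-small",
    "chat": "cloud-large",
    "refactor": "cloud-large",
}


def request_cost(task: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request, billed only for the tokens it used."""
    price = PRICES[ROUTES[task]]
    billed = (input_tokens * price.input_per_mtok
              + output_tokens * price.output_per_mtok)
    return billed / 1_000_000


def total_cost(requests) -> float:
    """Sum the cost of (task, input_tokens, output_tokens) tuples."""
    return sum(request_cost(*request) for request in requests)


# Made-up sample workload: two completions, one refactor, one chat turn.
week = [
    ("autocomplete", 1_200, 40),
    ("autocomplete", 900, 25),
    ("refactor", 8_000, 2_000),
    ("chat", 3_000, 500),
]
print(f"estimated API spend: ${total_cost(week):.4f}")
```

The point of the sketch is the shape of the calculation: requests routed to the local model contribute nothing to the bill, so the total is driven entirely by how often you reach for the cloud model.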
Switching models mid-project turned out to be the practical feature I did not expect to value. On a client project that required all code to stay within their private network, I pointed Continue at their self-hosted Llama endpoint by changing one line in a JSON config file. Two weeks later, when that project ended and I moved to personal work where I could use cloud models again, I switched back with another one-line change. I have done this model swap six times across three different projects, and each switch takes under 30 seconds. The friction is low enough that it becomes a habit rather than a ceremony.
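The one-line swap can be sketched as a small helper that rewrites a single endpoint field. This assumes the config is a JSON object with a `models` list whose entries carry `title` and `apiBase` fields, which is how I understand the config.json layout. The hostname, model tag, and helper name are hypothetical, so check the field names against the current Continue documentation before relying on them.

```python
import copy


def point_chat_model_at(config: dict, title: str, api_base: str) -> dict:
    """Return a copy of config with one model's endpoint URL replaced."""
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    for model in updated["models"]:
        if model["title"] == title:
            model["apiBase"] = api_base
            return updated
    raise KeyError(f"no model titled {title!r}")


# Illustrative config. Field names are assumed, values are placeholders.
personal_config = {
    "models": [
        {
            "title": "Project Llama",
            "provider": "ollama",
            "model": "llama3",
            "apiBase": "http://localhost:11434",
        }
    ]
}

# Hypothetical private endpoint inside a client network.
client_config = point_chat_model_at(
    personal_config, "Project Llama", "http://llm.client.internal:11434"
)
print(client_config["models"][0]["apiBase"])
```

In practice I make the same change by hand in the editor. The helper only shows how small the difference between the two configurations is.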
The Setup Experience Will Test Your Patience
I need to be honest about the installation experience because it is where Continue loses people. The extension installs like any other VS Code extension, but on first launch you get an empty chat panel and a prompt to add a model provider. There is no default model configured. The extension does not suggest one. It hands you a link to the documentation and waits.
I already had API keys for OpenAI and Anthropic, which made my setup faster. Even so, the first time I configured Continue, I spent 23 minutes reading the config.json documentation, setting up two providers (one for chat, one for autocomplete), and verifying that both were responding. A colleague I recommended Continue to — someone with less experience managing API providers — took 41 minutes to get to the same point and needed to create accounts at two different model providers along the way.
The config.json file at ~/.continue/config.json is the control surface for everything Continue does. It is well-documented but verbose. Configuring three models — a fast local model for tab autocomplete, Claude Sonnet for chat, and an embeddings model for the codebase index — requires roughly 35 lines of JSON with endpoint URLs, API key references, model names, and context window parameters. The provided templates help, but they assume you know which model names to enter and which context window sizes are appropriate. If you have never configured an LLM endpoint before, the first session is intimidating.
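For orientation, here is a rough sketch of what a three-model setup looks like, built as a Python dictionary and rendered to JSON. The top-level keys (`models`, `tabAutocompleteModel`, `embeddingsProvider`) and the `contextLength` field reflect my understanding of the config.json schema and should be verified against the current documentation. The model identifiers, context length, and API key are placeholders, and `missing_sections` is a hypothetical helper, not part of Continue.

```python
import json

# Assumed layout of ~/.continue/config.json. All values are placeholders.
config = {
    "models": [
        {
            "title": "Claude Sonnet (chat)",
            "provider": "anthropic",
            "model": "<claude-sonnet-model-id>",
            "apiKey": "<ANTHROPIC_API_KEY>",
            "contextLength": 100000,  # placeholder, set per model
        }
    ],
    "tabAutocompleteModel": {
        "title": "Local autocomplete",
        "provider": "ollama",
        "model": "codellama:7b",
    },
    "embeddingsProvider": {
        "provider": "ollama",
        "model": "nomic-embed-text",
    },
}


def missing_sections(cfg: dict) -> list:
    """List the top-level sections this sketch expects but cannot find."""
    required = ("models", "tabAutocompleteModel", "embeddingsProvider")
    return [key for key in required if not cfg.get(key)]


rendered = json.dumps(config, indent=2)
print(rendered)
print("missing sections:", missing_sections(config))
```

A real file grows past this once you add a second chat model and provider-specific options, which is where the line count climbs toward the 35 or so I ended up with.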
After the initial setup, the ongoing maintenance is minimal. I have updated my config file three times in two months — once to add a new provider, once to bump the context window size when a model update supported it, and once to switch autocomplete from a cloud model to Ollama when my internet was flaky during travel. Each config change took under two minutes.
Autocomplete Quality Depends Entirely on Your Model Choice
This is the trade-off Continue asks you to accept: autocomplete quality is your responsibility. Commercial tools tune their completion models specifically for their inference stack. Continue sends your cursor position and surrounding code to whatever model you configured and hopes for the best.
I benchmarked Continue’s autocomplete against Copilot and Cursor across 100 editing sessions in TypeScript files. When I configured Continue with Code Llama 7B running locally via Ollama, the acceptance rate — how often I kept the suggestion — was 48 percent, and the average latency was 940ms from keystroke to suggestion appearing. The same test with Cursor’s hosted tab model produced a 73 percent acceptance rate at 120ms latency.
Then I switched Continue’s autocomplete to GPT-4 via OpenAI’s API. The acceptance rate jumped to 61 percent, but the latency increased to 1,400ms because the prompt assembly takes longer and the API round-trip adds overhead. At that latency, the suggestion often arrived after I had already typed the next line, making it functionally useless for real-time completion.
The sweet spot I found was routing autocomplete to a mid-sized local model — Code Llama 13B or Mistral 7B — and saving the cloud models for chat and refactoring. With this setup, I get completions at roughly 520ms latency with a 55 percent acceptance rate. That is slower than Cursor and less accurate, but it costs zero dollars in API fees and never sends my code off my machine. For the work I do on client projects where code confidentiality matters, that trade-off is worth accepting. For personal projects where speed matters more than cost, I still use Cursor for the autocomplete and keep Continue configured for the chat and context features.
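The two numbers used throughout this comparison, acceptance rate and average latency, can be computed from a simple log of suggestion events. The sketch below assumes a hypothetical log format with one record per suggestion. The sample events are made up to show the calculation and are not my measurements.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class SuggestionEvent:
    latency_ms: float  # time from keystroke to suggestion appearing
    accepted: bool     # True if the suggestion was kept


def acceptance_rate(events) -> float:
    """Fraction of suggestions that were kept."""
    if not events:
        raise ValueError("no suggestion events recorded")
    return sum(event.accepted for event in events) / len(events)


def mean_latency_ms(events) -> float:
    """Average keystroke-to-suggestion latency in milliseconds."""
    if not events:
        raise ValueError("no suggestion events recorded")
    return mean(event.latency_ms for event in events)


# Made-up sample log, for illustration only.
events = [
    SuggestionEvent(500, True),
    SuggestionEvent(540, False),
    SuggestionEvent(520, True),
    SuggestionEvent(520, False),
]
print(f"acceptance: {acceptance_rate(events):.0%}")
print(f"mean latency: {mean_latency_ms(events):.0f} ms")
```

Logging both values per suggestion, rather than per session, makes it possible to see whether slow suggestions are also the ones that get rejected.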
The @ Mention System Is Quietly Excellent
Continue’s @ mention syntax is the feature I use most and the one that differentiates it from every other assistant I have tested. When I ask Continue to refactor a function that touches three files, I can type @src/database/schema.ts and @src/utils/auth.ts in the chat message, and Continue injects the full content of those files into the model’s context before it generates a response.
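Conceptually, this kind of explicit context injection amounts to finding the @ references in a message and placing the referenced file contents ahead of the question. The following is a simplified illustration of that idea and not Continue's actual implementation. The regular expression, the section header format, and the `expand_mentions` name are my own assumptions.

```python
import re
import tempfile
from pathlib import Path

# Matches tokens such as @src/utils/auth.ts (word characters, dots, slashes, hyphens).
MENTION = re.compile(r"@([\w./-]+)")


def expand_mentions(message: str, root: Path) -> str:
    """Prepend the content of each @-mentioned file to the message."""
    sections = []
    for raw_path in MENTION.findall(message):
        rel_path = raw_path.rstrip(".")  # drop sentence-ending punctuation
        file_path = root / rel_path
        if file_path.is_file():  # silently skip references that do not resolve
            sections.append(f"--- {rel_path} ---\n{file_path.read_text()}")
    return "\n\n".join(sections + [message])


# Demo against a throwaway project directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "src" / "auth.ts").write_text("export const ttl = 60;\n")
    prompt = expand_mentions("Refactor login using @src/auth.ts", root)

print(prompt)
```

A real implementation would also need to confine paths to the project root and budget for the model's context window, both of which this sketch leaves out.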
This matters because AI coding assistants are terrible at guessing which files are relevant to a task. They either pull in too much context and waste tokens or pull in too little and produce code that does not integrate with the rest of the project. The @ mention system lets me explicitly control what the model sees, and I have found that five to ten well-chosen file references produce better results than letting the tool decide what to index.
I compared Continue with @ mentions against Copilot’s automatic context selection on ten multi-file refactoring tasks. With Continue, I explicitly tagged the files I knew were relevant, and 8 out of 10 generated solutions compiled correctly on the first attempt. With Copilot, which uses a combination of open tabs and semantic search to build its context, 6 out of 10 solutions compiled on the first attempt. The difference was most pronounced on tasks that touched files the semantic search did not surface — utility modules, type definition files, and configuration constants that were not semantically similar to the function being refactored but were structurally necessary.
Where I Wish Continue Were Stronger
The autocomplete latency remains the biggest practical limitation. Even with my optimized local model setup at 520ms, the suggestions arrive noticeably later than Cursor’s 120ms ghost text. Your brain adapts to the timing — you learn to pause briefly at the end of a line — but the experience is less fluid than the commercial alternatives. I have tried every optimization the documentation suggests: smaller models, shorter context windows, lower-precision inference. The gap narrows but does not close.
The JetBrains extension gets noticeably less attention than the VS Code extension. I tested Continue on IntelliJ for a Java project I was consulting on. Autocomplete latency was roughly twice what I measured in VS Code, and @ mention file resolution was less reliable: it failed to find files in nested module directories roughly 15 percent of the time. If JetBrains is your primary IDE, I would recommend VS Code with Continue for AI tasks and IntelliJ for manual coding, which is not the workflow Continue’s marketing suggests.
Documentation quality is mixed. The core configuration guide is thorough and well-maintained, but the troubleshooting section is thin. When my local Ollama setup stopped working after a macOS update, the documentation offered two generic suggestions (restart Ollama, check the port) that did not apply. I eventually found the fix — a permissions change on the model directory — in a GitHub issue from four months earlier. The community is active and responsive, but relying on GitHub issues for troubleshooting is less than ideal for a tool that asks you to configure your own infrastructure.
Who Should Install Continue
Continue is the right choice if you work in an environment where model choice is not just a preference but a requirement. If your client contract says code cannot leave their VPN, Continue with Ollama is the strongest self-contained option available. If your organization has negotiated bulk API pricing with a specific provider, Continue lets you use that pricing directly without paying a platform middleman. If you maintain side projects that benefit from different models — a Python codebase that responds well to Claude, a React project where GPT-4 is more accurate, a local project where you want zero API costs — Continue lets you switch per project without changing tools.
It is the wrong choice if you want to install an extension and start coding within 30 seconds. The setup tax is real. If you do not know the difference between an API key and an endpoint URL, or if you have never configured an LLM provider before, the initial experience will frustrate you. Start with a commercial tool for a few months to understand the baseline, then evaluate whether the cost savings and model flexibility of Continue justify the configuration overhead. For me, after running the numbers on my actual API consumption, they did.