Sourcegraph Cody Review: When Your Codebase Is Too Big for Copilot

I asked Copilot to add a new payment method to our billing service. It generated a plausible-looking PayPal integration function that referenced a PaymentGateway interface and four helper utilities. The code was clean TypeScript. It was also completely wrong: it reinvented an interface that already existed in a shared library three directories up the tree, and the four helper functions it called were invented out of whole cloth because Copilot could not see that we already had equivalent utilities in src/shared/billing/.

This is the class of failure that Sourcegraph Cody is designed to prevent. Instead of guessing from your open files and cursor position, Cody indexes your entire repository and uses that index as retrieval-augmented context for every AI interaction. I spent two weeks running Cody against a real 2.6-million-line TypeScript monorepo to understand whether codebase-aware context meaningfully improves AI coding reliability.

How Cody’s Code Graph Works

Cody does not just grep your codebase. It builds a search index that understands symbol relationships — which functions call which, which interfaces are implemented where, which types flow through which modules. When you ask Cody a question or trigger a completion, it queries this index to find the most relevant code, incorporates that code as context, and sends the enriched prompt to the LLM.
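To make that retrieve-then-prompt loop concrete, here is a toy sketch. It is not Cody's implementation: the `IndexedSymbol` shape, the keyword-overlap scoring, the `gateway.ts` and `charge.ts` file names, and the `buildPrompt` helper are all assumptions made for illustration. What it does show is the general shape: find the best-matching symbols, pull in what they reference, and prepend that code to the question.

```typescript
// Toy sketch of retrieval-augmented prompting. Every name here is
// hypothetical; this is not Cody's actual index or ranking logic.

interface IndexedSymbol {
  name: string;         // symbol name, e.g. a function or interface
  file: string;         // path of the defining file
  snippet: string;      // source text to paste into the prompt
  references: string[]; // names of symbols this one calls or implements
}

// Score a symbol by how many query words appear in its name or snippet.
function score(query: string, sym: IndexedSymbol): number {
  const haystack = `${sym.name} ${sym.snippet}`.toLowerCase();
  return query
    .toLowerCase()
    .split(/\W+/)
    .filter((word) => word.length > 2 && haystack.includes(word)).length;
}

// Return the top-k matching symbols plus anything they directly reference,
// so the prompt carries one hop of the symbol graph along with the hits.
function retrieve(query: string, index: IndexedSymbol[], k: number): IndexedSymbol[] {
  const hits = index
    .map((sym) => ({ sym, points: score(query, sym) }))
    .filter((entry) => entry.points > 0)
    .sort((a, b) => b.points - a.points)
    .slice(0, k)
    .map((entry) => entry.sym);

  const byName = new Map<string, IndexedSymbol>();
  index.forEach((sym) => byName.set(sym.name, sym));

  const result = new Map<string, IndexedSymbol>();
  for (const hit of hits) {
    result.set(hit.name, hit);
    for (const ref of hit.references) {
      const target = byName.get(ref);
      if (target) result.set(target.name, target);
    }
  }
  return Array.from(result.values());
}

// Prepend the retrieved snippets to the user's question.
function buildPrompt(query: string, context: IndexedSymbol[]): string {
  const blocks = context.map((sym) => `// ${sym.file}\n${sym.snippet}`);
  return `${blocks.join('\n\n')}\n\nQuestion: ${query}`;
}

const symbolIndex: IndexedSymbol[] = [
  { name: 'PaymentGateway', file: 'src/shared/billing/gateway.ts',
    snippet: 'interface PaymentGateway { charge(cents: number): Promise<void> }',
    references: [] },
  { name: 'chargeCustomer', file: 'src/billing/charge.ts',
    snippet: 'function chargeCustomer(gateway: PaymentGateway) { /* ... */ }',
    references: ['PaymentGateway'] },
  { name: 'renderFooter', file: 'src/ui/footer.ts',
    snippet: 'function renderFooter() { /* ... */ }',
    references: [] },
];

const userQuestion = 'How do we charge a customer?';
const enrichedPrompt = buildPrompt(userQuestion, retrieve(userQuestion, symbolIndex, 1));
```

With `k` set to 1, the only direct hit is `chargeCustomer`, but the existing `PaymentGateway` interface rides along because it is referenced. That one-hop expansion is the part an open-files-only assistant has no way to do.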

The initial indexing takes time. On the 2.6-million-line monorepo I tested with, the indexer ran for about 14 minutes on first setup, processing roughly 11,000 TypeScript files. Subsequent incremental updates ran in under 30 seconds as files changed. The index footprint was about 450 MB on disk, which is reasonable for a codebase of this size and substantially less than the repository itself.

The practical effect is that Cody’s answers reference real code. When I asked how authentication works in the monorepo, Cody traced the middleware chain through five files, identified the JWT verification module, and explained the role-based access pattern, all with inline references to specific functions and line numbers. Copilot, given the same question, summarized a generic JWT flow that happened to be directionally correct but missed the custom policy engine that sat between the middleware and the route handlers. It is the difference between an answer that sounds right and an answer that is right.

Enterprise Features: Multi-Repo and On-Premise

Cody’s enterprise tier adds two features that matter specifically to large organizations: multi-repository search and on-premise deployment.

Multi-repo search lets you index multiple repositories and query across them. In a microservice architecture where a single feature spans three or four services, this means Cody can trace a request through the API gateway, the auth service, the business logic service, and the database layer — even when each lives in a separate repository. I tested this with four interconnected repos and was able to ask “what happens end-to-end when a user updates their subscription tier?” and get a coherent trace across all four codebases. Without multi-repo, I would have had to ask the same question four times in four different editor windows and manually stitch the answer together.
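A rough way to picture why this matters: treat every call as an edge between repo-qualified symbols and walk the graph from the entry point. The sketch below is a toy model under stated assumptions, not how Sourcegraph stores or queries its index. The repository names, symbol names, and the `CallEdge` shape are all hypothetical.

```typescript
// Toy cross-repository trace. Repository and symbol names are hypothetical.

interface CallEdge {
  from: string; // "repo:symbol" of the caller
  to: string;   // "repo:symbol" of the callee
}

// Breadth-first walk from an entry point, recording symbols in the order reached.
function traceRequest(entry: string, edges: CallEdge[]): string[] {
  const outgoing = new Map<string, string[]>();
  for (const edge of edges) {
    const targets = outgoing.get(edge.from) ?? [];
    targets.push(edge.to);
    outgoing.set(edge.from, targets);
  }

  const visited = new Set<string>([entry]);
  const order: string[] = [];
  const queue: string[] = [entry];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    order.push(current);
    for (const next of outgoing.get(current) ?? []) {
      if (!visited.has(next)) {
        visited.add(next);
        queue.push(next);
      }
    }
  }
  return order;
}

// Which repositories does the trace pass through, in order of first appearance?
function reposTouched(trace: string[]): string[] {
  return Array.from(new Set(trace.map((node) => node.split(':')[0])));
}

const callEdges: CallEdge[] = [
  { from: 'api-gateway:updateTierRoute', to: 'auth-service:verifySession' },
  { from: 'api-gateway:updateTierRoute', to: 'billing-logic:changeTier' },
  { from: 'billing-logic:changeTier', to: 'db-layer:saveSubscription' },
];

const requestTrace = traceRequest('api-gateway:updateTierRoute', callEdges);
const repos = reposTouched(requestTrace);
```

An index limited to one repository would only hold edges whose `from` and `to` share a prefix. In this example that is none of them, so the walk would stop at the entry point.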

On-premise deployment is available through Sourcegraph Enterprise, which runs the entire stack — the code graph indexer, the Cody AI gateway, and the IDE integration — inside a company’s own infrastructure. The LLM calls can be routed to self-hosted models or to cloud providers through customer-controlled API keys, giving security teams full control over where code and prompts travel. This is a hard requirement for defense contractors, financial institutions, and healthcare companies, and Cody is one of the few AI coding tools that supports it natively without requiring a third-party proxy.

// Cody correctly identified this as the canonical auth pattern in our monorepo,
// tracing it through the middleware stack to the policy engine.

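// src/shared/auth/middleware.ts
// The Express types are an assumption based on the res.status().json() calls.
import type { NextFunction, Request, Response } from 'express';
import { verifyToken } from './jwt';
import { evaluatePolicy } from '../policy/engine';

export async function authMiddleware(req: Request, res: Response, next: NextFunction) {
  // Expect an "Authorization: Bearer <token>" header.
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) return res.status(401).json({ error: 'Missing token' });

  let claims;
  try {
    claims = await verifyToken(token);
  } catch {
    // Treat an invalid or expired token as an auth failure, not a server error.
    return res.status(401).json({ error: 'Invalid token' });
  }

  // The custom policy engine decides whether this role may call this route.
  const policyResult = await evaluatePolicy(claims.role, req.path, req.method);
  if (!policyResult.allowed) {
    return res.status(403).json({ error: policyResult.reason });
  }

  // Assumes the Request type is augmented elsewhere with a `user` field.
  req.user = claims;
  next();
}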

IDE Integration and Daily Workflow

Cody integrates with VS Code, JetBrains, and Neovim through extensions. The VS Code extension is the most mature — it surfaces context-aware autocomplete, a chat panel, and inline commands for explaining, fixing, and generating code. The key difference from generic AI extensions is that every interaction includes a “context” indicator showing which files Cody is referencing, and you can expand or narrow that context manually.

The autocomplete experience is slower than Copilot’s — roughly 400 to 700ms per suggestion versus Copilot’s 150 to 300ms in my testing — but the suggestions are more often correct because they are grounded in the actual codebase. After two weeks of daily use, my acceptance rate for Cody completions was around 68 percent, compared to roughly 55 percent for Copilot on the same codebase. That 13-point gap reflects the number of completions Copilot generated that looked reasonable but referenced functions or types that did not exist in the project.
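For anyone reproducing the comparison, the arithmetic is worth spelling out, because a 13-point gap is not a 13 percent improvement. The sketch below uses placeholder counts (68 and 55 accepted out of 100 shown) chosen only to match the percentages above. They are not my raw numbers, and the type and function names are made up for the example.

```typescript
// Acceptance rate = accepted suggestions / shown suggestions.
// The counts below are illustrative placeholders that match the quoted
// percentages; they are not the raw data from the two-week test.

interface CompletionStats {
  shown: number;
  accepted: number;
}

function acceptanceRate(stats: CompletionStats): number {
  if (stats.shown === 0) return 0; // avoid dividing by zero
  return stats.accepted / stats.shown;
}

// Absolute gap between two tools, in percentage points.
function pointGap(a: CompletionStats, b: CompletionStats): number {
  return (acceptanceRate(a) - acceptanceRate(b)) * 100;
}

// Relative improvement of a over b, as a fraction of b's rate.
// Only meaningful when b has a non-zero acceptance rate.
function relativeImprovement(a: CompletionStats, b: CompletionStats): number {
  return (acceptanceRate(a) - acceptanceRate(b)) / acceptanceRate(b);
}

const codyStats: CompletionStats = { shown: 100, accepted: 68 };
const copilotStats: CompletionStats = { shown: 100, accepted: 55 };

const gapInPoints = pointGap(codyStats, copilotStats);             // about 13
const relativeGain = relativeImprovement(codyStats, copilotStats); // about 0.236
```

Read that way, the 13-point gap works out to roughly a 24 percent relative improvement over Copilot's baseline.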

The chat panel supports @-mentions for files, symbols, and repositories, which makes it straightforward to scope questions to specific parts of the codebase. I found myself using @file and @symbol mentions frequently to narrow Cody’s context for targeted questions, and omitting them for broader architectural questions where the full codebase context was useful.

Where Cody Falls Short

Cody does not generate code faster than Copilot. It generates code that is more likely to be correct, but the latency difference is noticeable and matters for workflows where completion speed is the primary concern. If you are writing boilerplate or filling in repetitive patterns, Copilot’s faster completions are the better tool for the job. Cody’s advantage is in debugging and understanding, not in raw generation throughput.

The indexing dependency is a double-edged sword. If your codebase is small, under 5,000 files, the indexing overhead is unnecessary and the context benefit is marginal: Copilot’s open-file plus neighbor-file context works well enough at that scale, and the setup cost is not worth the small accuracy improvement. Cody’s value proposition kicks in around the 10,000-file mark and becomes increasingly important as the codebase grows.

Pricing is also a factor. Cody is free for individual developers on the Personal plan, which includes codebase indexing and IDE integration. The Pro plan at $9 per month adds unlimited autocomplete and higher rate limits. Enterprise pricing includes multi-repo search, on-premise deployment, and admin controls, and is priced per seat with annual contracts. For individual developers, the free tier is genuinely usable. For teams, the enterprise features justify the cost if your codebase is large enough to benefit from codebase-aware context.

Cody Commands and Custom Recipes

Beyond the standard chat and autocomplete features, Cody includes a commands system that is genuinely underrated. Commands are pre-configured prompts that run against a specific context — explain the selected code, generate unit tests for the current file, find code smells in the open module, or document the public API of a package. These are not novel ideas, but Cody’s implementation benefits from the code graph in ways that generic implementations do not.

The “Generate Unit Tests” command, when run on a function that depends on three other internal modules, produced test files that imported the correct mocks for all three dependencies. Copilot’s equivalent command, when I tested it on the same function, generated a test that called the real implementations because Copilot could not see the dependency chain. The test file Cody produced was not perfect — it missed two edge cases — but the mock setup was correct, which is the part that takes the most time to write manually. I spent four minutes reviewing and adding edge cases rather than fifteen minutes writing mocks from scratch.

Custom recipes let you define your own commands with specific context requirements. My team set up a recipe that scans for breaking API changes before every pull request — it compares the current branch’s exported types and functions against the main branch and flags additions, removals, and signature changes. The recipe takes about 30 seconds to run on a 500-file diff and has caught two accidental interface changes that would have broken downstream consumers. This is the kind of workflow that codebase-aware AI enables and that generic assistants simply cannot perform because they lack the structural understanding of what changed and what depends on it.

// A custom Cody recipe for detecting breaking API changes.
// Because Cody's code graph knows the dependency chain, it can flag
// changes that affect downstream consumers, not just diff text.

// Recipe: breaking-change-detector
// Trigger: On PR diff against main branch
// Context: Full repository index
// Prompt:
// "Compare the exported API surface of the current branch against main.
//  For each change:
//  1. Classify as addition, removal, signature change, or behavioral change
//  2. If removal or signature change, list all internal callers affected
//  3. Flag changes with HIGH severity if they affect public API exports
//  4. Suggest migration paths for breaking changes"
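
The first two steps of that prompt, classifying changes and listing affected callers, can be approximated in plain code, which helps show what the index contributes. This is a minimal sketch under loud assumptions: signatures are compared as plain strings, the `callers` map stands in for whatever the code graph would supply, and every name in it is hypothetical. It is not the recipe my team runs, and it cannot detect behavioral changes at all.

```typescript
// Toy API-surface diff. A real tool would derive signatures from the
// compiler or the code index; here they are plain strings.

type ApiSurface = Map<string, string>; // exported name -> signature text

type ChangeKind = 'addition' | 'removal' | 'signature change';

interface ApiChange {
  name: string;
  kind: ChangeKind;
  breaking: boolean;
  affectedCallers: string[];
}

function diffApiSurface(
  main: ApiSurface,
  branch: ApiSurface,
  callers: Map<string, string[]>, // exported name -> files that use it
): ApiChange[] {
  const changes: ApiChange[] = [];

  // Anything exported on main that is gone or different on the branch is breaking.
  main.forEach((signature, name) => {
    const current = branch.get(name);
    if (current === undefined) {
      changes.push({ name, kind: 'removal', breaking: true, affectedCallers: callers.get(name) ?? [] });
    } else if (current !== signature) {
      changes.push({ name, kind: 'signature change', breaking: true, affectedCallers: callers.get(name) ?? [] });
    }
  });

  // New exports cannot break existing callers.
  branch.forEach((_signature, name) => {
    if (!main.has(name)) {
      changes.push({ name, kind: 'addition', breaking: false, affectedCallers: [] });
    }
  });

  return changes;
}

const mainSurface: ApiSurface = new Map([
  ['createInvoice', '(customerId: string) => Invoice'],
  ['refund', '(invoiceId: string) => void'],
]);
const branchSurface: ApiSurface = new Map([
  ['createInvoice', '(customerId: string, currency: string) => Invoice'],
  ['listInvoices', '() => Invoice[]'],
]);
const knownCallers = new Map<string, string[]>([
  ['createInvoice', ['src/checkout/pay.ts']],
  ['refund', ['src/support/refunds.ts', 'src/admin/tools.ts']],
]);

const apiChanges = diffApiSurface(mainSurface, branchSurface, knownCallers);
```

The diff itself is trivial. The hard part is populating `knownCallers` accurately across a large codebase, and that is where a repository-wide index earns its keep.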

FAQ

How does Cody compare to Cursor's codebase indexing?

Cursor indexes your codebase for context retrieval, similar to Cody, but Cursor's indexing is optimized for the current workspace while Cody's is built on Sourcegraph's multi-repo search infrastructure. For a single repository, the experience is comparable — both tools pull context from the broader codebase. For multi-repository architectures, Cody is ahead because Sourcegraph was designed for cross-repo search from the beginning.

Does Cody work with languages other than TypeScript?

Yes. Cody supports all major languages including Python, Go, Java, Rust, C++, Ruby, and PHP. The code graph indexing works across languages, and the quality of context retrieval is broadly consistent across the supported language set. In my testing, TypeScript and Go had the strongest symbol resolution, while template-heavy C++ codebases occasionally produced imprecise results due to the complexity of C++ symbol resolution.

Can I use Cody without a Sourcegraph instance?

Yes — Cody Personal works with just the IDE extension and does not require a Sourcegraph server. The code graph indexing runs locally. Enterprise features like multi-repo search and on-premise deployment require a Sourcegraph instance, but the core codebase-aware AI functionality works standalone on your local machine.
