The MCP Servers Worth Installing in 2026: A Curated Review
After running coding agents for months with too many MCP servers and then too few, here is the short list of Model Context Protocol servers that actually earn their context-window cost — plus the ones quietly wasting your tokens.
I spent the early part of this year doing the thing everyone does when they discover the Model Context Protocol: I installed everything. Filesystem, git, GitHub, Postgres, SQLite, a web fetcher, two search servers, a browser automation server, a memory server, and a handful of niche ones I no longer remember. My Claude Code config looked like a Christmas tree. And for about a week it felt powerful — the agent could touch everything. Then I started reading my own context windows, and the feeling curdled. Before the agent had read a single line of my code, something like a fifth of the context was already gone, eaten by tool schemas describing capabilities the agent would never use in that session. I had built a kitchen sink and was paying rent on every faucet.
This guide is the correction. After running coding agents — Claude Code mostly, with stints in Cursor, Cline, and Goose — against real work for months, I have a short list of MCP servers that consistently earn their keep, organized by category. For each one I will say what it does, when it justifies its context cost, and where it quietly bloats your token budget. The thesis is simple and I will repeat it: fewer, sharper servers beat a kitchen sink, every time.
What MCP Actually Costs You
The Model Context Protocol, introduced by Anthropic in late 2024 and now broadly supported across agentic coding tools, is a standard way for an AI agent to talk to external systems — files, databases, APIs, browsers — through a uniform interface. A server exposes “tools” (callable functions), “resources” (readable data), and sometimes “prompts.” Your agent’s host loads these and makes them available to the model. The protocol itself is sound and genuinely useful; the problem is operational.
Here is the part the launch demos skip. When you connect an MCP server, the host typically injects the full schema of every tool that server exposes into the model’s context — names, descriptions, parameter definitions, the lot — at the start of the session, whether or not the agent ever calls any of them. A modestly complex server can carry a dozen tools, each with a paragraph of description and a nested JSON schema. Stack five or six servers and you can burn tens of thousands of tokens describing capabilities before the agent has even read your prompt. I have measured single configs where the combined tool definitions exceeded what an entire source file would have cost to load.
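To make that tax concrete, here is a rough way to size it. This is a minimal sketch, not a real tokenizer: the four-characters-per-token ratio is an assumption that varies by model, and `example_tool` is a hypothetical definition shaped like an MCP tool listing entry (`name`, `description`, `inputSchema`).

```python
import json

# Rough heuristic (an assumption): about 4 characters per token for English
# text and JSON. Real tokenizers vary by model, so treat results as estimates.
CHARS_PER_TOKEN = 4


def estimate_tool_tokens(tool: dict) -> int:
    """Estimate the context cost of one tool definition."""
    serialized = json.dumps(tool, separators=(",", ":"))
    return -(-len(serialized) // CHARS_PER_TOKEN)  # ceiling division


def estimate_server_tokens(tools: list[dict]) -> int:
    """Sum the estimated cost of every tool a server exposes."""
    return sum(estimate_tool_tokens(tool) for tool in tools)


# Hypothetical tool definition, for illustration only.
example_tool = {
    "name": "read_file",
    "description": "Read the complete contents of a file from disk.",
    "inputSchema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}

if __name__ == "__main__":
    print(f"one tool: ~{estimate_tool_tokens(example_tool)} tokens")
    dozen = estimate_server_tokens([example_tool] * 12)
    print(f"a dozen similar tools: ~{dozen} tokens")
```

Real tool definitions tend to carry longer descriptions and deeper schemas than this one, so the estimate for a full server is likely a floor rather than a ceiling.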
This is why I no longer evaluate a server by “is it useful?” Almost all of them are useful in some scenario. I evaluate by “does its usefulness, in the work I actually do, exceed the fixed context tax it charges every session?” Most servers fail that test most of the time. A few pass it almost always.
The Core Three: Filesystem, Git, and Fetch
If I could only run three servers, these would be them — and for a lot of day-to-day coding, three is plenty.
The filesystem server gives the agent scoped read and write access to directories you whitelist. In practice many coding agents already have native file access, so the standalone filesystem MCP server earns its place mainly when you are using a host that lacks built-in file tools, or when you want to grant access to a directory outside the project root — a shared assets folder, a sibling repo, a docs tree. When you do need it, keep the allowed paths tight. A filesystem server pointed at your home directory is both a security liability and a way to tempt the agent into wandering.
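The idea behind "keep the allowed paths tight" can be sketched in a few lines. This is an illustration of the allow-list check, not the filesystem server's actual implementation, and the function name is my own.

```python
from pathlib import Path


def is_allowed(requested: str, allowed_dirs: list[str]) -> bool:
    """Return True only if the resolved path sits inside an allowed directory.

    Resolving first collapses ".." segments and symlinks, so a request like
    "assets/../secrets.txt" is judged by where it actually points.
    """
    target = Path(requested).resolve()
    for allowed in allowed_dirs:
        root = Path(allowed).resolve()
        if target == root or root in target.parents:
            return True
    return False
```

The narrower the allowed list, the less this check has to defend: a root of your home directory makes nearly every request pass.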
The git server is the one I underestimated longest. It exposes structured git operations — status, diff, log, blame, branch inspection — as tools rather than leaving the agent to shell out to git and parse text. The value is precision: the agent gets clean, structured diffs and history instead of scraping terminal output, which it does less reliably than you would hope. For any workflow involving “what changed,” “why did this line get added,” or careful staged commits, the git server pays for itself. It is also lean — a handful of focused tools, modest schema cost.
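To show what "structured instead of scraped" buys you, here is a small sketch that turns `git status --porcelain=v1` output into typed records. It is my own illustration, not how the git server is built, and it deliberately ignores edge cases such as quoted paths.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class FileStatus:
    index: str     # staged state (first status column)
    worktree: str  # unstaged state (second status column)
    path: str      # for renames this still holds "old -> new"


def parse_porcelain(output: str) -> list[FileStatus]:
    """Parse `git status --porcelain=v1` text into records."""
    entries = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        entries.append(FileStatus(index=line[0], worktree=line[1], path=line[3:]))
    return entries


def repo_status(repo_dir: str) -> list[FileStatus]:
    """Run git in repo_dir and return structured status entries."""
    result = subprocess.run(
        ["git", "status", "--porcelain=v1"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)
```

An agent handed records like these does not have to guess which column means "staged", which is the kind of mistake I see when it parses human-oriented terminal output.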
The web fetch and search servers are where things get nuanced, because they are two different jobs often bundled in conversation. A fetch server retrieves a known URL and returns its content as clean text or markdown — invaluable when you tell the agent “read the docs at this link” or “pull this changelog.” A search server queries a search engine or index and returns ranked results, for when the agent needs to discover sources it does not already have. I run a fetch server almost always; it is cheap and high-leverage. I add a search server more selectively, because the good ones (the ones wired to a real search API) can carry heavier schemas and, depending on the provider, real per-query cost.
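For a sense of what the fetch half does, here is a stdlib-only sketch of "retrieve a known URL, return clean text". It is a simplification under my own assumptions: real fetch servers typically produce markdown and handle far more HTML than this extractor does.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect visible text, skipping script and style contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Reduce an HTML document to one text chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def fetch_as_text(url: str, timeout: float = 10.0) -> str:
    """Download a known URL and return its visible text."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return html_to_text(response.read().decode(charset, errors="replace"))
```

Note how small the surface is: one operation, one required argument. That is why a fetch server tends to be cheap to keep loaded.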
Situational Power: Databases and Browser Automation
These are the servers I install per-task and remove when the task is done. They are powerful enough to justify the friction of toggling them on and off, which is itself a signal: if a server is worth running only sometimes, it should not live in your always-on config.
Database servers — Postgres and SQLite are the two I reach for — let the agent inspect schemas, run queries, and reason about your data structure directly. The Postgres server is genuinely excellent for the specific job of “understand this schema and help me write a migration” or “why is this query slow, look at the actual table.” It can introspect tables, indexes, and constraints far faster than feeding the agent a schema dump by hand. The SQLite server is the same idea at smaller scale, perfect for local app databases and analytics files. The caution is obvious but worth stating plainly: connect these read-only unless you have a concrete reason not to, and never against production with write access. An agent with a writable production database connection is a risk profile I am not willing to carry.
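The read-only rule is easy to enforce at the connection level. Here is a minimal SQLite sketch using Python's standard `sqlite3` module and SQLite's `mode=ro` URI parameter; the helper names are mine, and a Postgres setup would instead rely on a read-only role or a replica.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path: str) -> sqlite3.Connection:
    """Open an existing SQLite file so that any write attempt fails."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def describe_tables(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Map each table name to its column names, in declaration order."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    return {
        table: [
            row[0]
            for row in conn.execute("SELECT name FROM pragma_table_info(?)", (table,))
        ]
        for table in tables
    }
```

With a connection opened this way, an `INSERT` or `DROP` raises `sqlite3.OperationalError` instead of touching the data, so introspection stays possible while mutation does not.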
Browser automation via the Playwright server is the heavyweight of this list — and the most useful when its moment arrives. It drives a real browser: navigating, clicking, filling forms, reading the DOM, taking screenshots. For end-to-end test authoring, for debugging “the page renders wrong” issues where the agent genuinely needs to see the rendered result, and for scraping flows that defeat a plain fetcher, nothing else comes close. But it is also the server most likely to carry a large tool surface, because driving a browser is intrinsically many small operations. I install Playwright for a testing session and pull it afterward. Leaving it loaded full-time is the single most common token-budget mistake I see in other people’s configs.
GitHub and Memory: The Deliberate Trade-offs
Two more servers deserve their own section, because both are widely recommended as defaults and I think both should be deliberate choices instead.
The GitHub server exposes the GitHub API as tools — issues, pull requests, repository contents, reviews, actions. It is legitimately handy for “triage these open issues” or “summarize this PR thread” or “open a PR with this description.” The catch is that the GitHub API is enormous, and a comprehensive GitHub server reflects that with a large, schema-heavy toolset. If your work is mostly local coding with the occasional PR, you are paying a steep context tax for capabilities you touch rarely. My compromise: I run the GitHub server during issue-triage or review-heavy sessions and lean on the lightweight git server plus the gh CLI the rest of the time. The CLI route costs almost nothing in schema and covers most of what I actually do.
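The CLI route looks roughly like this when scripted. It is a sketch that assumes `gh` is installed and authenticated, and that the `number`, `title`, and `author` JSON fields are the ones you want; the function names are hypothetical.

```python
import json
import subprocess


def gh_pr_list_command(limit: int = 10) -> list[str]:
    """Build the gh invocation that lists open PRs as JSON."""
    return ["gh", "pr", "list", "--limit", str(limit), "--json", "number,title,author"]


def summarize_prs(raw_json: str) -> list[str]:
    """Turn gh's JSON output into one short line per pull request."""
    prs = json.loads(raw_json)
    return [f"#{pr['number']} {pr['title']} ({pr['author']['login']})" for pr in prs]


def list_open_prs(limit: int = 10) -> list[str]:
    """Run gh in the current repository and summarize its open PRs."""
    result = subprocess.run(
        gh_pr_list_command(limit), capture_output=True, text=True, check=True
    )
    return summarize_prs(result.stdout)
```

Nothing here occupies context until the agent actually runs the command, which is the whole appeal compared with a schema that is loaded every session.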
The memory server is the most philosophically interesting and the one I am most cautious about. It gives the agent a persistent store — typically a knowledge graph or a key-value layer — to remember facts across sessions: project conventions, decisions, your preferences. The promise is an agent that does not re-learn your codebase every morning. In practice the results are mixed. A memory server is only as good as what gets written to it, and agents are inconsistent about what they choose to remember. I have seen memory servers accumulate stale or contradictory facts that actively mislead later sessions. When it works it is lovely; when it drifts it is worse than nothing. I treat it as an experiment per project, not a default, and I periodically read what it has stored to prune the junk.
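The pruning pass can be partly mechanical. The sketch below assumes you can export stored facts as records with `subject`, `value`, and `recorded` fields; that shape is hypothetical, and real memory servers store things differently.

```python
from collections import defaultdict
from datetime import date


def find_conflicts(facts: list[dict]) -> dict[str, list[str]]:
    """Return subjects that have more than one distinct stored value."""
    values = defaultdict(set)
    for fact in facts:
        values[fact["subject"]].add(fact["value"])
    return {
        subject: sorted(seen) for subject, seen in values.items() if len(seen) > 1
    }


def find_stale(facts: list[dict], today: date, max_age_days: int = 90) -> list[dict]:
    """Return facts recorded more than max_age_days before today."""
    return [f for f in facts if (today - f["recorded"]).days > max_age_days]
```

Neither check decides which fact is right. They only surface the entries worth reading by hand, which is where the actual pruning judgment happens.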
How the Curated List Stacks Up
Here is my actual decision framework, condensed. Read the “best for” column as “the one situation where this server clearly earns its context cost.”
| Tool | Best For | What It Does | Context Cost | Keep Loaded? |
|---|---|---|---|---|
| Filesystem | Granting access to a sibling repo or shared directory | Scoped file read/write outside native access | Low | Only if host lacks native file tools |
| Git | Careful diffs and "why did this change" history work | Structured diff, log, blame, status | Low | Yes, almost always worth it |
| Fetch | Reading docs and changelogs you point the agent at | Retrieve a known URL as clean text | Low | Yes, cheap and high-leverage |
| Search | Discovering sources the agent does not already have | Query a search engine, ranked results | Medium | Selectively |
| Postgres / SQLite | Migrations and "look at the real schema" debugging | Schema introspection and queries | Medium | Per-task, read-only |
| Playwright | E2E test authoring and rendered-page debugging | Real browser automation and screenshots | High | Per-task only |
| GitHub | Issue triage and PR-thread summarization | Issues, PRs, reviews via GitHub API | High | Only for triage/review sessions |
| Memory | Stable project conventions that rarely change | Persistent cross-session knowledge store | Medium | Experimental, per-project |
The shape of that table is the whole argument. Only the three low-cost servers are candidates for your default config, and filesystem only if your host lacks native file tools. Everything else is a per-task decision weighed against its context tax. If your config has more than four or five always-on servers, I would bet money some of them are dead weight you stopped noticing.
A Few Practical Install Notes
Most coding agents configure MCP servers through a JSON file — Claude Code and Cursor both use a settings file where each server gets a command and arguments, frequently launched via npx for the reference servers or a dedicated binary for others. The mechanics are well documented per tool, so I will skip the boilerplate and give the operational advice that actually matters.
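For orientation only, here is the general shape of such a file, generated from Python so the scoping arguments are explicit. The `mcpServers` key, the package names, and the `--repository` flag are assumptions based on the reference servers as I have used them; check each server's README and your host's docs before copying anything.

```python
import json


def build_config(shared_dir: str, repo_dir: str) -> dict:
    """Build a two-server config with each server scoped to one directory."""
    return {
        "mcpServers": {
            "filesystem": {
                "command": "npx",
                # Assumed package name; trailing args are the allowed directories.
                "args": ["-y", "@modelcontextprotocol/server-filesystem", shared_dir],
            },
            "git": {
                "command": "uvx",
                # Assumed package name and flag for the reference git server.
                "args": ["mcp-server-git", "--repository", repo_dir],
            },
        }
    }


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    config = build_config("/path/to/shared-assets", "/path/to/repo")
    print(json.dumps(config, indent=2))
```

Each entry is just a command plus arguments, which is why scoping is cheap: the allowed directory or repository is one more string in the list.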
First, scope every server to the minimum it needs. Filesystem servers take allowed-directory arguments; use them. Database servers take connection strings; point them at read replicas or local copies, not production primaries. Second, prefer servers that let you disable individual tools if the host supports it — trimming an oversized server’s tool list is the cheapest token win available. Third, audit periodically: open a fresh session, look at what the context contains before you have typed anything substantive, and ask whether each loaded server has earned its place since you last checked. I do this roughly monthly and almost always remove something.
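To see what trimming an oversized tool list is worth, a sketch like the following helps. It measures savings in serialized characters as a stand-in for tokens, and it assumes you can get a server's tool definitions as a list of dicts; how you actually disable tools depends on your host.

```python
import json


def schema_chars(tools: list[dict]) -> int:
    """Total size of the tool definitions when serialized compactly."""
    return sum(len(json.dumps(tool, separators=(",", ":"))) for tool in tools)


def trim_tools(tools: list[dict], keep: set[str]) -> tuple[list[dict], float]:
    """Keep only the named tools and report the fraction of schema removed."""
    kept = [tool for tool in tools if tool["name"] in keep]
    before = schema_chars(tools)
    saved = before - schema_chars(kept)
    fraction = saved / before if before else 0.0
    return kept, fraction
```

Running this against a server's real tool list before and after a proposed allow-list gives you a number to argue with, instead of a feeling that the server is heavy.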
Finally, resist the marketplace instinct. The ecosystem of available MCP servers is large and growing, and a great many of them are perfectly real and perfectly fine. That is exactly the trap. Availability is not a reason to install. The right question is never “could this be useful?” — it is “is this useful enough, in my actual work, to justify what it costs me on every single turn?” Answer that honestly and your config gets short fast.
Who Should Run Which Servers
If you are a solo developer doing mostly local coding, run filesystem (if your host needs it), git, and a fetch server. That is it. Add a database server the day you touch a database and remove it the day after. You will be astonished how much you do not miss.
If you work on a team with heavy PR and review workflows, the GitHub server starts to earn its cost — but consider running it in a dedicated “review mode” config rather than your everyday coding config, so your normal sessions stay lean.
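One way to keep a "review mode" separate is to compose configs from a lean base plus per-mode extras. The sketch below uses placeholder commands throughout; the server entries and mode names are hypothetical and stand in for whatever your real config contains.

```python
# Servers that stay on in every mode. Commands are placeholders.
BASE_SERVERS = {
    "git": {"command": "hypothetical-git-server", "args": []},
    "fetch": {"command": "hypothetical-fetch-server", "args": []},
}

# Extra servers added only for a specific kind of session.
MODE_EXTRAS = {
    "review": {"github": {"command": "hypothetical-github-server", "args": []}},
    "testing": {"playwright": {"command": "hypothetical-playwright-server", "args": []}},
}


def config_for(mode: str) -> dict:
    """Return the base servers plus any extras registered for the mode."""
    servers = dict(BASE_SERVERS)
    servers.update(MODE_EXTRAS.get(mode, {}))
    return {"mcpServers": servers}
```

An unknown mode falls back to the base set, so the everyday coding config is the default and the heavier servers have to be asked for by name.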
If you do QA, test authoring, or front-end debugging, Playwright is your high-value pick, installed per session. Pair it with fetch and git and skip almost everything else.
If you are tempted by the memory server, try it on exactly one project, read what it stores after a week, and decide based on whether the stored facts are helping or quietly lying to your agent. Do not roll it out everywhere on faith.
And if you are reading this because your agent feels sluggish or keeps reaching for the wrong tool, the fix is almost certainly subtraction. Open your config, count your always-on servers, and start cutting. The best MCP setup I have ever run is the smallest one that still does the job.