Anthropic June 15 Pricing: Where Should Your Claude Personal Assistant Live?
Anthropic's June 15 pricing changes the math on hosting a Claude personal assistant: a decision framework for choosing Managed Agents in the cloud versus a local always-on Claude Code instance.
You can run a Claude personal assistant in two very different places, and for a long time the choice came down to whatever you set up first. Anthropic’s June 15 pricing update is a good reason to run the math instead.
A Claude assistant (the bot that triages your inbox, drafts your standup, or watches a deploy) can live as a Managed Agent on Anthropic’s infrastructure, or as a local, always-on Claude Code instance on a machine you control. Tools like Hermes Agent and OpenClaw sit on top of both; they orchestrate prompts, tools, and memory, but they do not answer the hosting question. Something still has to host the model calls and the file access. Here is how to decide which one.
What the June 15 change actually reframes
Pricing updates are easy to skim as “numbers moved.” This one rewards a closer read, because it sharpens a distinction that was always there but easy to ignore: the difference between paying for work and paying for availability.
A Managed Agent is metered. It bills close to the work performed. An agent that fires twice a day costs roughly two runs’ worth of tokens a day, and effectively nothing while it sits idle between them.
A local always-on instance is the opposite. The machine runs whether the assistant is working or not. You carry that baseline — power, hardware, the server if it is not your laptop — no matter how many tasks actually land.
Metered work versus standing availability. Whichever direction Anthropic’s specific numbers move on June 15, that is the axis the decision turns on, and the date is a clean prompt to sort your assistants along it.
When Managed Agents win
Managed Agents win when the work is spiky, scheduled, or parallel.
Spiky work is event-driven: an assistant that reacts to a new GitHub issue, a Stripe event, or a calendar invite does nothing for hours and then needs to handle several things at once. On a single local instance those events queue behind each other. A Managed Agent fans them out.
Scheduled work is the 7 a.m. digest, the nightly repo summary, the Friday metrics roundup. It has to run when your laptop is closed. A local instance that must be awake to fire is the wrong tool — you would keep a machine on for 24 hours to do 15 minutes of work.
Parallel work is the research task you want in ten variations. Managed Agents let you start ten and collect the results; a local instance runs them one after another.
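The queueing-versus-fan-out difference can be made concrete with a small calculation. This is a minimal sketch, not a Managed Agents API call: the function names and the 6-minute job duration are hypothetical, and the fan-out case is idealized (every job gets its own worker, with no concurrency limit).

```python
from itertools import accumulate


def serial_finish_times(durations_min):
    """One local instance: each job starts when the previous one finishes."""
    return list(accumulate(durations_min))


def fanned_out_finish_times(durations_min):
    """Idealized fan-out: every job starts at t=0 on its own worker."""
    return list(durations_min)


# Hypothetical burst: ten research variations, 6 minutes each.
burst = [6] * 10

print(max(serial_finish_times(burst)))      # 60: the last result lands after an hour
print(max(fanned_out_finish_times(burst)))  # 6: all results land together
```

The total work is identical in both cases; what changes is how long you wait for the last result.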
The cost structure favors this pattern bluntly. A scheduled agent that runs 60 times a month bills for 60 runs. Hosting the same job on an always-on machine means paying for roughly 720 hours of uptime (30 days of 24 hours) to capture those same 60 runs, so the standing cost dwarfs the work.
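The 60-runs-versus-720-hours comparison is easy to run with your own figures. The sketch below is illustrative only: the per-run cost and the hourly standing cost are made-up placeholders, not Anthropic prices, and it assumes the local instance pays the same per-run usage on top of its standing cost.

```python
HOURS_PER_MONTH = 720  # 30 days x 24 hours, the figure used above


def managed_monthly_cost(runs, cost_per_run):
    """Metered: pay per run, nothing while idle."""
    return runs * cost_per_run


def always_on_monthly_cost(runs, cost_per_run, standing_cost_per_hour):
    """Standing: pay for every hour of uptime, plus the same per-run usage."""
    return HOURS_PER_MONTH * standing_cost_per_hour + runs * cost_per_run


def standing_overhead_per_run(runs, standing_cost_per_hour):
    """How much idle uptime each run has to carry."""
    return HOURS_PER_MONTH * standing_cost_per_hour / runs


# Placeholder numbers; substitute your own.
RUNS = 60
COST_PER_RUN = 0.25            # hypothetical usage cost of one run
STANDING_COST_PER_HOUR = 0.05  # hypothetical power/hardware/server cost

print(f"{managed_monthly_cost(RUNS, COST_PER_RUN):.2f}")                              # 15.00
print(f"{always_on_monthly_cost(RUNS, COST_PER_RUN, STANDING_COST_PER_HOUR):.2f}")    # 51.00
print(f"{standing_overhead_per_run(RUNS, STANDING_COST_PER_HOUR):.2f}")               # 0.60
```

With these placeholder numbers, each run on the always-on machine carries more than twice its own cost in idle uptime. The gap narrows as run volume rises, which is why the comparison is worth redoing per assistant.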
When a local Claude Code instance wins
A local Claude Code instance wins on the one thing a Managed Agent gives up: direct, unmediated access to your real environment.
Context. A local instance sees your actual repositories, your uncommitted changes, your shell history, the half-finished branch you were on this morning. A Managed Agent sees what you send it. For an assistant whose job is “help with whatever I am doing right now,” that gap is the entire product — there is no practical way to upload the state of your laptop to a cloud agent.
Files. If the assistant edits code, runs your test suite, and reads the build output, it needs a filesystem that is your filesystem. A Managed Agent can clone a repo, but it cannot see the local change you have not pushed — and unpushed changes are most of what matters while you are mid-task.
Permissions. A local instance runs with your credentials, on your network, behind your VPN. That is useful when the assistant legitimately needs an internal service, and it is a risk you hold directly instead of delegating to someone else’s infrastructure.
The trade is real: you carry the standing cost of an always-on machine, and you own patching and uptime. For an assistant you genuinely talk to while you work, that cost buys something a Managed Agent cannot sell you.
A decision you can make in five minutes
Run your assistant’s job description through three questions.
Does it need to run when you are away from your machine? If yes, Managed Agent. A scheduled or event-driven job should never depend on your laptop being awake.
Does it need to see your local, unpushed, in-progress state? If yes, local Claude Code. No amount of context-pasting reconstructs a working tree.
Is the work spiky or parallel? If yes, Managed Agent. Standing availability is an expensive way to absorb burst load.
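The three questions above can be written down as a small function. This is a sketch of the framework as stated here, with hypothetical names throughout. The case where no answer is "yes" is not covered by the questions, so the fallback string is an assumption.

```python
from dataclasses import dataclass


@dataclass
class AssistantJob:
    name: str
    runs_while_away: bool     # Q1: must run when you are away from your machine
    needs_local_state: bool   # Q2: must see local, unpushed, in-progress state
    spiky_or_parallel: bool   # Q3: work arrives in bursts or fans out


def recommend_host(job: AssistantJob) -> str:
    wants_managed = job.runs_while_away or job.spiky_or_parallel
    wants_local = job.needs_local_state
    if wants_managed and wants_local:
        # Conflicting answers suggest the job is really two assistants.
        return "split into two assistants"
    if wants_managed:
        return "Managed Agent"
    if wants_local:
        return "local Claude Code"
    # Assumption: the framework does not decide this case.
    return "either"


jobs = [
    AssistantJob("morning digest", True, False, False),
    AssistantJob("webhook handler", True, False, True),
    AssistantJob("pair-programmer", False, True, False),
]
for job in jobs:
    print(f"{job.name}: {recommend_host(job)}")
```

A job description that answers "yes" to both the away-from-machine and the local-state question is the signal to split it, instead of forcing one host to do both.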
Most people find they have two assistants, not one. The morning digest and the webhook handler are Managed Agent work. The pair-programmer that lives inside your repo is local work. Whatever the June 15 numbers turn out to be, splitting them is likely to be the cheaper arrangement as well as the cleaner one: you stop paying for roughly 720 hours of uptime a month to host a job that runs for 20 minutes a day, or about 10 hours a month.