Tokenyst Review: Track Claude Code API Costs Before the Bill Lands
A practical look at Tokenyst, an open-source local monitor that tracks Claude Code API token usage in real time and alerts you before runaway agent loops turn into surprise Anthropic bills.
The Claude Code bill story is familiar by now. You start with a few experimental sessions, the autonomy creeps up, the context windows balloon with codebases and tool results, and on a Monday morning you check your Anthropic console and the number on the screen is double what you budgeted. Cache hits help. Prompt compaction helps. But neither of them tells you what just happened in real time, while you can still do something about it.
Tokenyst is an open-source attempt at that missing piece — a local monitor that watches Claude Code API usage and warns you before the bill lands. We walked through the repo to see what it covers and where it fits in a working developer setup.
Why Claude Code bills creep up faster than you expect
Claude Code’s pricing is honest but not obvious. You pay per input and output token, and the cache discount is meaningful — cached reads run at roughly a tenth of the regular input price. The catch is that the things that drive your bill aren’t visible until after the fact:
- Long agent loops that re-read the same files dozens of times
- Subagents spawned with large prompts that don’t share your cache
- Tool results that balloon (a find over a monorepo returns megabytes)
- Compaction events that re-serialize context back into the prompt
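To put numbers on the cache discount, here is a rough sketch of how it plays out over an agent loop that keeps re-sending the same context. The per-million-token prices are illustrative assumptions, not quoted rates, and the function ignores the surcharge for writing to the cache, so treat the result as a ballpark rather than a bill.

```python
# Illustrative prices in USD per million tokens. These are assumptions for the
# example, not quoted rates; check Anthropic's pricing page for current numbers.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00
CACHE_READ_MULTIPLIER = 0.1  # cached reads at roughly a tenth of the input price


def request_cost(uncached_input: int, cached_input: int, output: int) -> float:
    """Estimate the USD cost of one request from its token counts.

    Simplification: cache writes are billed like ordinary input here.
    """
    input_cost = uncached_input * INPUT_PRICE_PER_MTOK / 1_000_000
    cache_cost = cached_input * INPUT_PRICE_PER_MTOK * CACHE_READ_MULTIPLIER / 1_000_000
    output_cost = output * OUTPUT_PRICE_PER_MTOK / 1_000_000
    return input_cost + cache_cost + output_cost


# A hypothetical agent loop: 30 turns, each sending 100,000 tokens of context
# and producing 1,000 output tokens.
TURNS = 30
no_cache = sum(request_cost(100_000, 0, 1_000) for _ in range(TURNS))
with_cache = sum(request_cost(10_000, 90_000, 1_000) for _ in range(TURNS))

print(f"no cache:   ${no_cache:.2f}")    # no cache:   $9.45
print(f"with cache: ${with_cache:.2f}")  # with cache: $2.16
```

Under these assumed prices, a subagent that misses the cache pays more than four times as much for the same 30 turns, which is why the items in the list above matter more than the headline token price.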
The Anthropic console shows you yesterday’s totals. The Claude Code session UI shows you a running cost, but it’s session-scoped and easy to ignore when you’re heads-down. If you’re running multiple terminals, multiple projects, or a scheduled agent loop, the per-session number stops being useful.
That’s the gap Tokenyst tries to fill: a single place to watch token flow as it happens, across whatever you’re running.
What Tokenyst actually does
The repo describes Tokenyst as a real-time monitor for Claude Code token usage with configurable alerts. The pieces that matter:
- Live usage tracking. Token counts update as requests flow, not after a billing cycle.
- Threshold alerts. You set a daily, weekly, or per-session ceiling and Tokenyst notifies you before you cross it.
- Per-project attribution. Spend is bucketed by the project or workspace it came from, so a runaway agent in one repo doesn’t hide inside a shared total.
What it isn’t: a billing reconciliation tool, a replacement for the Anthropic console, or a multi-provider gateway. Tokenyst is scoped to Claude Code — and that focus is the point. Most generic LLM cost trackers treat Anthropic as one of N providers and miss the things that are specific to how Claude Code actually consumes tokens (the cache mechanics, the subagent fan-out, the compaction overhead).
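The threshold alerts and per-project attribution described above can be sketched in a few lines. This is not Tokenyst's code: the class name, the 80% warning fraction, and the message format are all hypothetical, chosen only to show how bucketing spend by project and warning before the ceiling fit together.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SpendTracker:
    """Hypothetical per-project spend tracker with an early-warning threshold."""

    ceiling_usd: float          # for example, a daily ceiling applied per project
    warn_fraction: float = 0.8  # warn at 80% so the alert lands before the ceiling
    spend: dict = field(default_factory=lambda: defaultdict(float))
    warned: set = field(default_factory=set)

    def record(self, project: str, cost_usd: float) -> Optional[str]:
        """Add a request's cost to its project bucket.

        Returns an alert message the first time a project crosses the
        warning level, and None otherwise.
        """
        self.spend[project] += cost_usd
        total = self.spend[project]
        crossed = total >= self.ceiling_usd * self.warn_fraction
        if crossed and project not in self.warned:
            self.warned.add(project)
            return f"{project}: ${total:.2f} of ${self.ceiling_usd:.2f} ceiling"
        return None


tracker = SpendTracker(ceiling_usd=10.0)
alerts = [
    tracker.record("repo-a", 3.0),
    tracker.record("repo-b", 8.5),  # crosses 80% of the ceiling
    tracker.record("repo-a", 3.0),
    tracker.record("repo-b", 1.0),  # already warned, so no repeat
]
print([a for a in alerts if a])  # ['repo-b: $8.50 of $10.00 ceiling']
```

Because each project has its own bucket, the runaway spend in repo-b triggers an alert even though repo-a is well under the ceiling, which is the behavior a shared total would hide.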
Setting up Tokenyst against the Anthropic API
The setup pattern is the one you’d expect from a local observability tool. You clone the repo, install dependencies, and point it at your Anthropic API credentials so it can attribute usage to the right account. Thresholds live in a config file, which means they’re version-controllable — useful if you’re running this across a team and want a shared definition of what counts as too much.
A few things worth knowing before you wire it in:
- API key scope. Tokenyst needs read access to your usage. Use a scoped key, not your main one, and rotate it if anything looks off.
- Local persistence. Token history is stored locally by default. If you want longitudinal data, plan where that lives — a hidden tokenyst directory will grow steadily if you’re a heavy user.
- Notification channel. The defaults cover terminal and desktop notifications. Wiring it to Slack or email is a config addition, not a code change.
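As an illustration of what version-controllable thresholds can look like, here is a minimal sketch that loads and sanity-checks a threshold config. The JSON layout and key names are assumptions made for this example, not Tokenyst's actual schema, so check the repo for the real format before copying anything.

```python
import json
from dataclasses import dataclass

# Hypothetical config contents. In practice this would be read from a file
# checked into the repo; the key names are illustrative only.
EXAMPLE_CONFIG = """
{
  "thresholds_usd": {"session": 5.0, "daily": 20.0, "weekly": 75.0},
  "notify": ["terminal", "desktop"]
}
"""


@dataclass(frozen=True)
class Thresholds:
    session: float
    daily: float
    weekly: float


def load_thresholds(raw: str) -> Thresholds:
    """Parse the thresholds and reject orderings that cannot be intended."""
    data = json.loads(raw)["thresholds_usd"]
    t = Thresholds(**{key: float(data[key]) for key in ("session", "daily", "weekly")})
    if not (0 < t.session <= t.daily <= t.weekly):
        raise ValueError("expected 0 < session <= daily <= weekly")
    return t


thresholds = load_thresholds(EXAMPLE_CONFIG)
print(thresholds.daily)  # 20.0
```

Validating the ordering at load time means a typo in a shared config (a daily ceiling above the weekly one, say) fails loudly instead of silently disabling an alert for the whole team.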
Where Tokenyst fits next to other cost-tracking options
The cost-tracking space for LLM developers has three rough tiers:
- The provider console. Anthropic’s own dashboard. Free, authoritative, but lagging — you see yesterday’s spend, not this hour’s.
- Multi-provider gateways. Helicone, Langfuse, OpenLLMetry. Designed for production apps, not for the messy reality of an interactive coding agent. Overkill if Claude Code is your only Anthropic surface.
- Local watchers like Tokenyst. Single-purpose, runs alongside your dev loop, optimized for the patterns Claude Code actually exhibits.
Tokenyst sits in the third tier. It won’t replace your provider invoice, and it won’t replace a production observability stack if you’re shipping an LLM product. What it will do is catch the “agent ran overnight and burned $40” failure mode before it becomes a $400 one.
The tradeoff: it’s young, it’s a single-maintainer project, and the polish gap versus a commercial tool is real. If you need SOC 2 reports and an SLA, this isn’t that. If you want a free, self-hosted check on your own Claude Code spend, it’s the kind of tool that pays for itself the first time it stops a runaway loop.