Tokenyst Review: Track Claude Code API Costs Before the Bill Lands
A practical look at Tokenyst, an open-source local monitor that tracks Claude Code API token usage in real time and alerts you before runaway agent loops turn into surprise Anthropic bills.
The Claude Code bill story is familiar by now. You start with a few experimental sessions, the autonomy creeps up, the context windows balloon with codebases and tool results, and on a Monday morning you check your Anthropic console and the number on the screen is double what you budgeted. Cache hits help. Prompt compaction helps. But neither of them tells you what just happened in real time, while you can still do something about it.
Tokenyst is an open-source attempt at that missing piece — a local monitor that watches Claude Code API usage and warns you before the bill lands. We walked through the repo to see what it covers and where it fits in a working developer setup.
Why Claude Code bills creep up faster than you expect
Claude Code’s pricing is honest but not obvious. You pay per input and output token, and the cache discount is meaningful — cached reads run at roughly a tenth of the regular input price. The catch is that the things that drive your bill aren’t visible until after the fact:
- Long agent loops that re-read the same files dozens of times
- Subagents spawned with large prompts that don’t share your cache
- Tool results that balloon (a `find` over a monorepo returns megabytes)
- Compaction events that re-serialize context back into the prompt
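To make the cache arithmetic concrete, here is a minimal Python sketch of how a re-reading agent loop adds up. The prices are placeholders we picked for illustration, not quoted Anthropic rates, and the sketch ignores cache-write surcharges. `Request`, `request_cost`, and `loop_cost` are hypothetical names, not anything taken from Tokenyst.

```python
from dataclasses import dataclass

# Illustrative prices in USD per million tokens. These are assumptions for
# this sketch, not quoted rates; check Anthropic's pricing page for real ones.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00
CACHE_READ_MULTIPLIER = 0.10  # "roughly a tenth" of the regular input price


@dataclass
class Request:
    uncached_input_tokens: int
    cached_input_tokens: int
    output_tokens: int


def request_cost(req: Request) -> float:
    """Return the estimated USD cost of one request."""
    uncached = req.uncached_input_tokens * INPUT_PRICE_PER_MTOK
    cached = req.cached_input_tokens * INPUT_PRICE_PER_MTOK * CACHE_READ_MULTIPLIER
    output = req.output_tokens * OUTPUT_PRICE_PER_MTOK
    return (uncached + cached + output) / 1_000_000


def loop_cost(context_tokens: int, output_tokens: int, turns: int, cached: bool) -> float:
    """Estimate the cost of an agent loop that resends the same context every turn."""
    req = Request(
        uncached_input_tokens=0 if cached else context_tokens,
        cached_input_tokens=context_tokens if cached else 0,
        output_tokens=output_tokens,
    )
    return turns * request_cost(req)
```

Under these placeholder prices, `loop_cost(100_000, 500, 40, cached=False)` comes to about $12.30, while the same 40-turn loop served entirely from cache comes to about $1.50. A subagent that misses your cache sits at the expensive end of that range.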
The Anthropic console shows you yesterday’s totals. The Claude Code session UI shows you a running cost, but it’s session-scoped and easy to ignore when you’re heads-down. If you’re running multiple terminals, multiple projects, or a scheduled agent loop, the per-session number stops being useful.
That’s the gap Tokenyst tries to fill: a single place to watch token flow as it happens, across whatever you’re running.
What Tokenyst actually does
The repo describes Tokenyst as a real-time monitor for Claude Code token usage with configurable alerts. The pieces that matter:
- Live usage tracking. Token counts update as requests flow, not after a billing cycle.
- Threshold alerts. You set a daily, weekly, or per-session ceiling and Tokenyst notifies you before you cross it.
- Per-project attribution. Spend is bucketed by the project or workspace it came from, so a runaway agent in one repo doesn’t hide inside a shared total.
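The combination of threshold alerts and per-project attribution can be sketched in a few lines of Python. This is our own illustration of the idea, not Tokenyst's code or API: the `SpendTracker` class, its `warn_fraction` parameter, and the choice to apply the daily ceiling to each project separately are all assumptions.

```python
from collections import defaultdict


class SpendTracker:
    """Accumulate spend per project and flag threshold crossings.

    Hypothetical sketch; not Tokenyst's actual API.
    """

    def __init__(self, daily_limit_usd: float, warn_fraction: float = 0.8):
        self.daily_limit_usd = daily_limit_usd
        self.warn_fraction = warn_fraction  # warn before the ceiling is hit
        self.spend_by_project = defaultdict(float)
        self._alerted = set()  # (project, level) pairs already reported

    def record(self, project: str, cost_usd: float) -> list[str]:
        """Add one request's cost and return any newly triggered alerts."""
        self.spend_by_project[project] += cost_usd
        spent = self.spend_by_project[project]
        alerts = []
        for level, fraction in (("warning", self.warn_fraction), ("limit", 1.0)):
            key = (project, level)
            if spent >= self.daily_limit_usd * fraction and key not in self._alerted:
                self._alerted.add(key)
                alerts.append(
                    f"{level}: {project} at ${spent:.2f} of ${self.daily_limit_usd:.2f}"
                )
        return alerts
```

Because spend is keyed by project, a quiet repo never trips an alert on behalf of a noisy one, and each alert level fires once per project rather than on every request after the crossing.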
What it isn’t: a billing reconciliation tool, a replacement for the Anthropic console, or a multi-provider gateway. Tokenyst is scoped to Claude Code — and that focus is the point. Most generic LLM cost trackers treat Anthropic as one of N providers and miss the things that are specific to how Claude Code actually consumes tokens (the cache mechanics, the subagent fan-out, the compaction overhead).
Setting up Tokenyst against the Anthropic API
The setup pattern is the one you’d expect from a local observability tool. You clone the repo, install dependencies, and point it at your Anthropic API credentials so it can attribute usage to the right account. Thresholds live in a config file, which means they’re version-controllable — useful if you’re running this across a team and want a shared definition of what counts as too much.
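We have not reproduced Tokenyst's actual config schema here. As a rough sketch of what version-controlled thresholds can look like, the Python below parses and validates a hypothetical JSON config; the key names (`thresholds_usd`, `notify`) and the `load_thresholds` helper are invented for illustration, so check the repo for the real format.

```python
import json

# Hypothetical config text. The key names are invented for this sketch and
# are not taken from the Tokenyst repo.
EXAMPLE_CONFIG = """
{
  "thresholds_usd": {"session": 5.0, "daily": 20.0, "weekly": 80.0},
  "notify": ["terminal", "desktop"]
}
"""

ALLOWED_WINDOWS = {"session", "daily", "weekly"}


def load_thresholds(text: str) -> dict[str, float]:
    """Parse and validate spend ceilings from a JSON config string."""
    config = json.loads(text)
    thresholds = config.get("thresholds_usd", {})
    unknown = set(thresholds) - ALLOWED_WINDOWS
    if unknown:
        raise ValueError(f"unknown threshold windows: {sorted(unknown)}")
    for window, limit in thresholds.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        is_number = isinstance(limit, (int, float)) and not isinstance(limit, bool)
        if not is_number or limit <= 0:
            raise ValueError(f"{window} limit must be a positive number")
    return {window: float(limit) for window, limit in thresholds.items()}
```

Validating at load time means a typo in a shared config fails loudly on startup instead of silently disabling a ceiling for the whole team.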
A few things worth knowing before you wire it in:
- API key scope. Tokenyst needs read access to your usage. Use a scoped key, not your main one, and rotate it if anything looks off.
- Local persistence. Token history is stored locally by default. If you want longitudinal data, plan where that lives — a hidden tokenyst directory will grow steadily if you’re a heavy user.
- Notification channel. The defaults cover terminal and desktop notifications. Wiring it to Slack or email is a config addition, not a code change.
Where Tokenyst fits next to other cost-tracking options
The cost-tracking space for LLM developers has three rough tiers:
- The provider console. Anthropic’s own dashboard. Free, authoritative, but lagging — you see yesterday’s spend, not this hour’s.
- Multi-provider gateways and observability platforms. Helicone, Langfuse, OpenLLMetry. Designed for production apps, not for the messy reality of an interactive coding agent. Overkill if Claude Code is your only Anthropic surface.
- Local watchers like Tokenyst. Single-purpose, runs alongside your dev loop, optimized for the patterns Claude Code actually exhibits.
Tokenyst sits in the third tier. It won’t replace your provider invoice, and it won’t replace a production observability stack if you’re shipping an LLM product. What it will do is catch the “agent ran overnight and burned $40” failure mode before it becomes a $400 one.
The tradeoff: it’s young, it’s a single-maintainer project, and the polish gap versus a commercial tool is real. If you need SOC 2 reports and an SLA, this isn’t that. If you want a free, self-hosted check on your own Claude Code spend, it’s the kind of tool that pays for itself the first time it stops a runaway loop.