Tokenyst Review: Track Claude Code API Costs Before the Bill Lands

The Claude Code bill story is familiar by now. You start with a few experimental sessions, the autonomy creeps up, the context windows balloon with codebases and tool results, and on a Monday morning you check your Anthropic console and the number on the screen is double what you budgeted. Cache hits help. Prompt compaction helps. But neither tells you what is happening in real time, while you can still do something about it.

Tokenyst is an open-source attempt at that missing piece — a local monitor that watches Claude Code API usage and warns you before the bill lands. We walked through the repo to see what it covers and where it fits in a working developer setup.

Why Claude Code bills creep up faster than you expect

Claude Code’s pricing is honest but not obvious. You pay per input and output token, and the cache discount is meaningful — cached reads run at roughly a tenth of the regular input price. The catch is that the things that drive your bill aren’t visible until after the fact:

Long agent loops that re-read the same files dozens of times
Subagents spawned with large prompts that don’t share your cache
Tool results that balloon (a find over a monorepo returns megabytes)
Compaction events that re-serialize context back into the prompt
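To make the cache discount concrete, here is a rough back-of-the-envelope sketch of what an agent loop costs when it re-reads the same context on every turn. The dollar prices are placeholder assumptions, not quoted Anthropic rates; only the "roughly a tenth" cache-read multiplier comes from the text above. The sketch also ignores any premium for writing to the cache, so treat the cached figure as a lower bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prices:
    """Illustrative per-million-token prices in USD (assumed values)."""
    input_per_mtok: float = 3.00
    output_per_mtok: float = 15.00
    cache_read_multiplier: float = 0.10  # "roughly a tenth" of the input price


def request_cost(prices, fresh_input, cached_input, output):
    """Cost of one request, split into fresh input, cached input, and output."""
    per_tok_in = prices.input_per_mtok / 1_000_000
    per_tok_out = prices.output_per_mtok / 1_000_000
    return (
        fresh_input * per_tok_in
        + cached_input * per_tok_in * prices.cache_read_multiplier
        + output * per_tok_out
    )


def loop_cost(prices, context_tokens, output_tokens, turns, cached):
    """Total cost of an agent loop that resends the same context each turn.

    With cached=True, every turn after the first reads the context from cache.
    """
    total = 0.0
    for turn in range(turns):
        if cached and turn > 0:
            total += request_cost(prices, 0, context_tokens, output_tokens)
        else:
            total += request_cost(prices, context_tokens, 0, output_tokens)
    return total


if __name__ == "__main__":
    prices = Prices()
    print(loop_cost(prices, 50_000, 500, turns=30, cached=False))
    print(loop_cost(prices, 50_000, 500, turns=30, cached=True))
```

With these assumed prices, 30 turns over a 50,000-token context with 500 output tokens each come to roughly $4.70 uncached and roughly $0.80 when every turn after the first reads from cache. That difference is why a subagent that misses the cache shows up so clearly on the bill.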

The Anthropic console shows you yesterday’s totals. The Claude Code session UI shows you a running cost, but it’s session-scoped and easy to ignore when you’re heads-down. If you’re running multiple terminals, multiple projects, or a scheduled agent loop, the per-session number stops being useful.

That’s the gap Tokenyst tries to fill: a single place to watch token flow as it happens, across whatever you’re running.

What Tokenyst actually does

The repo describes Tokenyst as a real-time monitor for Claude Code token usage with configurable alerts. The pieces that matter:

Live usage tracking. Token counts update as requests flow, not after a billing cycle.
Threshold alerts. You set a daily, weekly, or per-session ceiling and Tokenyst notifies you before you cross it.
Per-project attribution. Spend is bucketed by the project or workspace it came from, so a runaway agent in one repo doesn’t hide inside a shared total.
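The combination of a ceiling and per-project buckets can be sketched in a few lines. This is not Tokenyst's implementation, which the review does not cover at the code level; the class name, the 80 percent warning fraction, and the project labels are all hypothetical.

```python
from collections import defaultdict


class SpendTracker:
    """Hypothetical tracker: one spending ceiling, costs bucketed by project."""

    def __init__(self, ceiling_usd, warn_fraction=0.8):
        self.ceiling_usd = ceiling_usd
        self.warn_fraction = warn_fraction  # assumed: warn at 80% of the ceiling
        self.by_project = defaultdict(float)

    def record(self, project, cost_usd):
        """Add a cost to a project's bucket and return the new alert level."""
        self.by_project[project] += cost_usd
        return self.status()

    def total(self):
        return sum(self.by_project.values())

    def status(self):
        """Return 'ok', 'warn' (approaching the ceiling), or 'exceeded'."""
        total = self.total()
        if total >= self.ceiling_usd:
            return "exceeded"
        if total >= self.ceiling_usd * self.warn_fraction:
            return "warn"
        return "ok"

    def top_project(self):
        """Return (project, spend) for the largest bucket, or None if empty."""
        if not self.by_project:
            return None
        return max(self.by_project.items(), key=lambda item: item[1])


if __name__ == "__main__":
    tracker = SpendTracker(ceiling_usd=10.0)
    print(tracker.record("api-service", 2.0))
    print(tracker.record("monorepo", 6.5))
    print(tracker.top_project())
```

The warning level is what makes an alert useful "before you cross it": the second call above trips it while the total is still under the ceiling, and `top_project` points at the repo responsible.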

What it isn’t: a billing reconciliation tool, a replacement for the Anthropic console, or a multi-provider gateway. Tokenyst is scoped to Claude Code — and that focus is the point. Most generic LLM cost trackers treat Anthropic as one of N providers and miss the things that are specific to how Claude Code actually consumes tokens (the cache mechanics, the subagent fan-out, the compaction overhead).

Setting up Tokenyst against the Anthropic API

The setup pattern is the one you’d expect from a local observability tool. You clone the repo, install dependencies, and point it at your Anthropic API credentials so it can attribute usage to the right account. Thresholds live in a config file, which means they’re version-controllable — useful if you’re running this across a team and want a shared definition of what counts as too much.
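As an illustration of why file-based thresholds are convenient, here is a minimal sketch of loading and validating such a file. The JSON format, the key names (`daily_usd`, `weekly_usd`, `session_usd`), and the default values are assumptions made for this example; check the repo for Tokenyst's actual config schema.

```python
import json

# Assumed defaults and key names; Tokenyst's real schema may differ.
DEFAULTS = {"daily_usd": 20.0, "weekly_usd": 100.0, "session_usd": 5.0}


def load_thresholds(text):
    """Parse a JSON threshold config, fill in defaults, and validate values."""
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError("threshold config must be a JSON object")

    unknown = set(raw) - set(DEFAULTS)
    if unknown:
        # Reject typos early so a misspelled key never silently disables a limit.
        raise ValueError(f"unknown threshold keys: {sorted(unknown)}")

    merged = {**DEFAULTS, **raw}
    for key, value in merged.items():
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or value <= 0:
            raise ValueError(f"{key} must be a positive number, got {value!r}")
    return {key: float(value) for key, value in merged.items()}


if __name__ == "__main__":
    print(load_thresholds('{"daily_usd": 15}'))
```

Rejecting unknown keys is a deliberate choice here: in a shared, version-controlled file, a typo that quietly falls back to a default is worse than a loud failure.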

A few things worth knowing before you wire it in:

API key scope. Tokenyst needs read access to your usage. Use a scoped key, not your main one, and rotate it if anything looks off.
Local persistence. Token history is stored locally by default. If you want longitudinal data, plan where that lives — a hidden tokenyst directory will grow steadily if you’re a heavy user.
Notification channel. The defaults cover terminal and desktop notifications. Wiring it to Slack or email is a config addition, not a code change.
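If you want to reason about how local history grows and how to trim it, a small SQLite store is one plausible shape. The review does not document how Tokenyst persists data, so the table name, columns, and helper functions below are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a usage store. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS usage_events (
               ts TEXT NOT NULL,            -- ISO 8601 timestamp
               project TEXT NOT NULL,
               input_tokens INTEGER NOT NULL,
               output_tokens INTEGER NOT NULL
           )"""
    )
    return conn


def record_event(conn, ts, project, input_tokens, output_tokens):
    """Append one usage event; the with-block commits the transaction."""
    with conn:
        conn.execute(
            "INSERT INTO usage_events VALUES (?, ?, ?, ?)",
            (ts, project, input_tokens, output_tokens),
        )


def daily_totals(conn):
    """Return (day, input_tokens, output_tokens) rows, oldest day first."""
    rows = conn.execute(
        """SELECT substr(ts, 1, 10) AS day,
                  SUM(input_tokens), SUM(output_tokens)
           FROM usage_events
           GROUP BY day
           ORDER BY day"""
    )
    return rows.fetchall()


def prune_before(conn, cutoff_day):
    """Delete events older than cutoff_day (YYYY-MM-DD); return rows removed."""
    with conn:
        cursor = conn.execute(
            "DELETE FROM usage_events WHERE substr(ts, 1, 10) < ?",
            (cutoff_day,),
        )
    return cursor.rowcount
```

A periodic `prune_before` call, or rolling old rows up into daily totals before deleting them, keeps the store bounded while preserving the longitudinal view.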

Where Tokenyst fits next to other cost-tracking options

The cost-tracking space for LLM developers has three rough tiers:

The provider console. Anthropic’s own dashboard. Free, authoritative, but lagging — you see yesterday’s spend, not this hour’s.
Multi-provider gateways and observability platforms. Helicone, Langfuse, OpenLLMetry. Designed for production apps, not for the messy reality of an interactive coding agent. Overkill if Claude Code is your only Anthropic surface.
Local watchers like Tokenyst. Single-purpose, runs alongside your dev loop, optimized for the patterns Claude Code actually exhibits.

Tokenyst sits in the third tier. It won’t replace your provider invoice, and it won’t replace a production observability stack if you’re shipping an LLM product. What it will do is catch the “agent ran overnight and burned $40” failure mode before it becomes a $400 one.

The tradeoff: it’s young, it’s a single-maintainer project, and the polish gap versus a commercial tool is real. If you need SOC 2 reports and an SLA, this isn’t that. If you want a free, self-hosted check on your own Claude Code spend, it’s the kind of tool that pays for itself the first time it stops a runaway loop.

FAQ

Does Tokenyst work with Claude Code's caching and subagents?

Because it monitors at the API call level, cached reads and subagent calls show up as separate events with their own token counts. That makes it easier to spot when a subagent is bypassing your cache and inflating cost.
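A rough sketch of that check: given per-call usage records, flag the sources whose large requests are barely served from cache. The usage keys follow the field names the Anthropic Messages API reports (`input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`), on the understanding that `input_tokens` counts only uncached input. The `source` label, the event shape, and both cutoffs are assumptions for illustration, not Tokenyst's data model.

```python
def total_input(usage):
    """Sum fresh, cache-written, and cache-read input tokens for one call."""
    return (
        usage.get("input_tokens", 0)
        + usage.get("cache_creation_input_tokens", 0)
        + usage.get("cache_read_input_tokens", 0)
    )


def cache_read_share(usage):
    """Fraction of a call's input tokens that were served from cache."""
    total = total_input(usage)
    if total == 0:
        return 0.0
    return usage.get("cache_read_input_tokens", 0) / total


def flag_cache_misses(events, min_input=10_000, max_share=0.2):
    """Return sources of large calls whose cache-read share is below max_share.

    min_input and max_share are arbitrary illustrative cutoffs.
    """
    flagged = []
    for event in events:
        usage = event["usage"]
        if total_input(usage) >= min_input and cache_read_share(usage) < max_share:
            flagged.append(event["source"])
    return flagged


if __name__ == "__main__":
    events = [
        {"source": "main", "usage": {"input_tokens": 5_000, "cache_read_input_tokens": 45_000}},
        {"source": "subagent-1", "usage": {"input_tokens": 40_000}},
    ]
    print(flag_cache_misses(events))
```

The size cutoff matters: small calls with no cache reads are cheap and normal, so only large uncached prompts are worth an alert.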

Will Tokenyst slow down my Claude Code sessions?

By design, no — it reads usage data passively rather than sitting in the request path. The overhead is bounded by however often it polls or writes locally.

What if I'm already using Helicone or Langfuse?

Keep them for production observability. Use Tokenyst alongside as the always-on dev-time alert. They solve different problems: Helicone and Langfuse are for the apps you ship, Tokenyst is for the agent you're actively coding with.
