Your frontier model is doing intern work.
You’d think Claude Code already sends the routine work it spawns to cheaper models. It doesn’t — every sub-task runs on your most expensive model unless something tells it which one to use, and your weekly usage limit dies in hours, not days. ax measures the leak, warns as it happens, learns the fix from your own history, and verifies the savings — all on your laptop.
what the leak looks like on one real machine running ax · 14 days of receipts
$2,301 of sub-task spend on fable/opus vs $83 on sonnet, on this machine. Run ax cost split to see yours.
Measure. Nudge. Tune. Verify.
Four commands close the loop — see where the expensive defaults hide, get warned before the next one, learn the fix from what your agents actually do, then reprice against the tokens your sub-tasks actually burned.
ax cost split --days=7Breaks your spend into main session vs the sub-tasks your agent spawns, by model. The 28:1 above came out of this table — run it to see your ratio.
ax hooks install ~/.ax/hooks/route-dispatch.ts --providers=claudeThe route-dispatch hook warns in the moment, right as a routine sub-task is about to run on the expensive default.
ax routing tuneFinds the routine work you keep overpaying for, from what your agents actually do. Deterministic pattern-matching — no AI guessing.
ax dispatches --candidatesReprices every overbilled sub-task against what the cheaper model would have cost, from the actual tokens it burned. Your savings, not projections.
One machine, verbatim.
Over 30 days: 20 patterns of routine work found, $591.57 of addressable spend. Reprice the last 14 days against the cheaper model and $512.91 is recoverable — the same machine as the numbers above. ax routing tune --dry-run prints yours in one command.
$ ax routing tune --dry-run --days=30 20 proposals addressable spend: $591.57 (30 days) apply non-judgment: ax routing tune --days=30 brief: ax routing tune --emit-brief $ ax dispatches --candidates --days=14 total est savings: $512.91 top classes: well-specified-impl ($222.78), spec-review ($69.75), bug-fix ($62.08)
Honest numbers, on purpose: “addressable spend” is what the flagged sub-tasks actually cost over the window — yours included. ax reprices what already happened, from the actual tokens burned, and never reports fabricated projected savings.
Judgment work never tiers down.
Your obvious objection: won’t quality drop? No — ax refuses to auto-route anything that needs taste. Your reviews, your design calls, your plans stay on the frontier model.
tiers down automatically
- File search & repo recon (billed at opus rates today)
- Well-specified implementation (the spec did the thinking)
- Bug fixes with a repro (mechanical once located)
- Verification & test runs (pass/fail, not judgment)
stays on the frontier model
- Code review (quality gate, never auto-routed)
- Design & UX (taste does not tier down)
- Planning & architecture (the expensive mistakes)
- Audits (the thing checking the cheap work)
Route the expensive model where it earns its keep.
Everything runs local. Your transcripts never leave your machine.
$ curl -fsSL https://ax.necmttn.com/install | bash$ ax ingest # first run: build the graph from your history$ ax routing tune