Hello Again, Opus

By Eric Caskey · June 13, 2026 · 5 min read

Four days ago I said goodbye to Opus. Fable 5 was the new top of the lineup, a config swap away, and I pointed my whole fleet at it. On Friday a US export-control directive suspended Fable 5 and Mythos 5 globally. The fallback is the only model again. Hello again, Opus.

Three things: what I shipped, what it cost, and the plan I should have started with. Fable was fast and real, and running everything on it burned 90% of my usage in four days. Speed I can't sustain isn't a strategy. Next time: Fable thinks, Opus builds.

The four-day window

Fable went generally available Tuesday and reached my tier the same day; the directive hit Friday. Four days, one model doing nearly everything across four product lines. What landed:

A live insider-buying signal. An SEC EDGAR collector pulling real Form 4 insider-buying filings into the signal pipeline, the first of four crowd-signal sources.
A new research-synthesis workstream. Designed end to end and closed its first three items, including a methodology writeup now live on the public track record.
An MCP server for the finance engine. A single-user server that exposes the engine to Perplexity, behind an authenticated endpoint.
A deep-research pipeline. Routes hard questions through Perplexity's Sonar, synthesizes with Claude behind a hard spend cap, and runs on a schedule.
Upgraded public demos. Coach charts and chat, committee bars.
Account hardening, remediated the same day. CloudTrail, GuardDuty, Access Analyzer, a spend budget, repo rulesets across ten repos, Dependabot and secret scanning: the gaps found and fixed in one pass.
A finished tooling epic. Narration v2, portfolio briefs and narrative, automated board-brief generation, every item merged and flipped live.

That is a lot of ground, most of it long-horizon, multi-repo work that used to need me re-anchoring the agent around hour three. Fable held the plot. The goodbye post predicted it: when the model stops losing invariants over long runs, the bottleneck moves to the harness. For four days, it did.

What it cost

It was also the most expensive four days I have run. Fable was included in my subscription, but it eats the usage cap roughly twice as fast as Opus 4.8 (the community's number, consistent with mine). By Friday I was at 90% of my allotment, before the directive even landed. If Anthropic had not pulled Fable, my own cap would have.
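A back-of-envelope sketch of that arithmetic, assuming the 2x multiplier applies uniformly and that all of the 90% went through Fable. Both are simplifications, and the function names are illustrative rather than taken from any tool:

```python
def equivalent_baseline_usage(cap_used: float, burn_multiplier: float) -> float:
    """Cap fraction the same work would consume at the baseline (1x) burn rate."""
    return cap_used / burn_multiplier


def days_to_exhaust(cap_used: float, days_elapsed: float) -> float:
    """Total days until the cap is gone if the observed daily burn continues."""
    return days_elapsed / cap_used


fable_used = 0.90  # 90% of the allotment, from the paragraph above
days = 4           # the four-day window
multiplier = 2.0   # "roughly twice as fast as Opus 4.8"

print(f"same work on Opus: ~{equivalent_baseline_usage(fable_used, multiplier):.0%} of the cap")
print(f"cap exhausted after ~{days_to_exhaust(fable_used, days):.1f} days at the Fable pace")
```

Under those assumptions, the same four days would have used roughly 45% of the cap on Opus, and the Fable pace would have emptied the cap partway through day five.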

I ran everything on it, including work that did not need a frontier model. Much of the four days was autonomous backlog loops: /loop /backlog picking the next item and implementing it, hour after hour. Most of those items are routine: wire a collector, add a tab, flip a flag, write the tests. That is implementation, not reasoning, and a model burning the cap at double rate bought me little on the easy items.

The lesson is one I skipped in the rush to use the new model: match the model to the task, especially when the model is expensive and the loop is autonomous. An autonomous loop is a cost amplifier; it does not get tired and stop. Point a premium model at mostly-easy work and walk away, and its whole job becomes burning the cap fast. The cap resets next cycle; the habit that drained it does not. That is the part worth fixing.

The plan for next time

Fable will come back. There is a timing wrinkle: on June 23 it leaves the subscription plans and becomes API usage credits only ($10 per million input tokens, $50 per million output tokens). But that looks temporary: Anthropic has reportedly signaled it may extend the date and return Fable to the plans, and the suspension is a separate, hopefully short, story. So I won't over-engineer around a metered window that may not last. Subscription or meter, the lesson is the same, because the burn rate is the same: Fable is a turbo mode, not a daily driver.
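For a feel of what the metered window would mean, here is a rough cost sketch using the two prices quoted above. The token counts in the example are hypothetical placeholders, not measured figures, and `metered_cost_usd` is an illustrative name:

```python
INPUT_USD_PER_MTOK = 10.0   # $10 per million input tokens (quoted price)
OUTPUT_USD_PER_MTOK = 50.0  # $50 per million output tokens (quoted price)


def metered_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one run at the quoted per-million-token prices."""
    input_cost = (input_tokens / 1_000_000) * INPUT_USD_PER_MTOK
    output_cost = (output_tokens / 1_000_000) * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


# Hypothetical loop run: 2M tokens read, 500k tokens written.
print(metered_cost_usd(2_000_000, 500_000))  # 45.0
```

Output tokens cost five times what input tokens do, so a loop that writes a lot of code is where the meter would run fastest.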

I am not the only one doing this math. The loudest thread in the Claude Code community right now is not "how good is Fable," it is "will you actually pay for Fable 5 via API usage credits after June 23rd?" The answers circle one instinct: "Fable orchestrates" a cheaper model underneath, a "temporal turbo mode" to lean on while it lasts. Right, but half-formed. The missing half is a rule for when you hit the boost.

The principle holds regardless of platform. Fable thinks, Opus builds. Reserve the frontier model for work where its judgment changes the outcome; let the workhorse carry the volume. Three changes make that a rule, not a good intention.

1. Tier the work, not the fleet. Opus by default: routine implementation, refactors, tests, the long tail of items that are clear once the design is settled. Fable for the hard stuff: architecture, spec design, ambiguous debugging, the planning step of an epic, the PR that scares me. The goodbye post had this shape as human_gate placement, gates where the model lost the plot. The same idea works for cost, as a model gate: most items run on Opus; an item tagged hard, or one of the plan-and-review bookends of an epic, gets the boost.

2. Make the backlog loop a plan-with-Fable, build-with-Opus pipeline. The loop that ate my cap should not run end-to-end on the turbo model. Spend Fable once per epic on the expensive-but-rare parts: read the spec, sequence the items, flag the hard ones, write the design notes. Then let Opus grind the implementation. That is the community's "Fable orchestrates, cheaper model executes," wired into the orchestrator instead of left to my discipline.

3. Put the budget guard in the loop, not just my head. A 90%-in-four-days burn should trip something automatic. Make the loop cap-aware: past a threshold, finish on Opus and escalate to Fable only on an explicit hard tag. Same pattern as the hard dollar guard already on my Perplexity path, and it works whichever way Fable is billed. I trust that guard because it is in code. My restraint mid-loop is not, and four days proved it.
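The three changes above can be sketched as a single routing function. This is a minimal sketch, not the actual orchestrator: the `Item` fields, the `Phase` names, the 0.6 threshold, and `pick_model` itself are all hypothetical, and `cap_used` is assumed to come from wherever the loop can read current usage.

```python
from dataclasses import dataclass
from enum import Enum


class Model(Enum):
    OPUS = "opus"    # workhorse default
    FABLE = "fable"  # turbo mode


class Phase(Enum):
    PLAN = "plan"      # read the spec, sequence items, flag the hard ones
    BUILD = "build"    # routine implementation
    REVIEW = "review"  # closing bookend of an epic


@dataclass(frozen=True)
class Item:
    title: str
    phase: Phase = Phase.BUILD
    hard: bool = False  # explicit tag set by a human or by the planning step


def pick_model(item: Item, cap_used: float, threshold: float = 0.6) -> Model:
    """Route one backlog item. `threshold` is an assumed value, not a recommendation."""
    # Budget guard: past the threshold, only an explicit hard tag escalates.
    if cap_used >= threshold:
        return Model.FABLE if item.hard else Model.OPUS
    # Model gate: hard items and the plan/review bookends get the boost.
    if item.hard or item.phase in (Phase.PLAN, Phase.REVIEW):
        return Model.FABLE
    # Everything else is volume work.
    return Model.OPUS


backlog = [
    Item("sequence the epic", phase=Phase.PLAN),
    Item("wire a collector"),
    Item("ambiguous pipeline bug", hard=True),
    Item("review the epic", phase=Phase.REVIEW),
]

for cap_used in (0.30, 0.90):
    print(cap_used, [pick_model(item, cap_used).value for item in backlog])
```

At 30% of the cap the bookends and the hard item go to Fable and the collector goes to Opus; at 90% only the hard-tagged item still escalates. Keeping the guard as the first branch means no later rule can route around it.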

Goodbye again, for now

Opus is the default again, except now I have seen the ceiling and have a plan to reach it without burning the budget. The goodbye post ended with "Fable, start telling." It told: a fifty-million-line migration in a day for Stripe, and four days of my backlog in four days. Then the directive cut it short, which is its own kind of fitting: the model that moved the long-horizon ceiling is the one a government most wants on a short leash.

Whenever Fable returns, it returns as turbo mode, not the default my fleet runs on. Fable thinks, Opus builds, and a budget guard keeps either from telling a story I can't afford to finish. Welcome back, Opus. You were never really gone.


Written by Eric Caskey. I build AI tools you can actually use. Explore the Tools or see the case studies.