Model file · Fable 5 · API

claude-fable-5 — the API page.

Q: What is the Fable 5 model id?

claude-fable-5. It is live on the Claude API from June 9, 2026 — fully available, with no plan-window gate (the June 22 clock applies to subscription plans, not the API). 1M-token context window, 128K-token max output, $10 per million input tokens and $50 per million output.

Q: What context window does Fable 5 have?

A 1M-token context window with a 128K-token maximum output. The prompt-cache floor also drops: the minimum cacheable prefix is 2,048 tokens on Fable 5 versus 4,096 on Opus 4.8.

Q: Why does thinking: disabled return 400 on Fable 5?

Fable 5 accepts adaptive thinking only. An explicit thinking: {type: "disabled"} — which Opus 4.7 and 4.8 accept — is rejected with a 400. The fix: omit the thinking parameter entirely, or send thinking: {type: "adaptive"}. This is the single new breaking change in the migration.

Q: Does Fable 5 support temperature?

No. temperature, top_p, and top_k were removed in the Opus 4.7-era API surface and return a 400 on Fable 5 as well — same for budget_tokens and last-assistant-turn prefills. If your code last touched a 4.6-or-older model, strip those before swapping the model string.

Q: Is the Fable 5 API the same as Opus 4.8?

Nearly. The request surface is identical — adaptive thinking, effort levels low through max, the same removed sampling parameters — plus exactly one new 400: an explicit thinking: {type: "disabled"} is rejected. Two quieter differences: the minimum cacheable prefix halves to 2,048 tokens, and the price doubles to $10/$50 per million tokens.

Q: What happens when Fable 5 blocks a request on the API?

The request returns an error rather than a model response, and you aren't charged for it. You can opt in to fallback on the Messages API to continue blocked requests on Claude Opus 4.8 at Opus pricing, roll your own fallback with the Anthropic SDK, or use Managed Agents where fallback is built in. One admin gate first: updated terms must be accepted in the Claude Console before the model works.

The whole launch is one string. If your model id lives in a variable, moving from Opus 4.8 to Fable 5 is a one-line diff — same request surface, same thinking config, same effort ladder — plus exactly one new 400 you'll only hit if you were explicitly disabling thinking. This page is the spec, the diff, and that 400.

Sourced from Anthropic's announcement and the model's own API surface. The buyer's view lives on the Fable 5 hub; the cost math on the pricing page.


The 30-second answer

Model id: claude-fable-5. Fully available on the Claude API from June 9, 2026 — no June 22 clock on the API side. 1M-token context, 128K max output, $10/$50 per million tokens (2× Opus 4.8). The request surface is identical to Opus 4.8 plus one new 400: explicit thinking: disabled is rejected — omit the param instead.

The spec block

Field	Fable 5	Note
Model id	`claude-fable-5`	the entire migration
Context window	1M tokens
Max output	128K tokens
Pricing	$10 in / $50 out per Mtok	exactly 2× Opus 4.8 ($5/$25)
Thinking	`thinking: {type: "adaptive"}` only	explicit `disabled` → 400
Effort	low · medium · high · xhigh · max	via `output_config`
Min cacheable prefix	2,048 tokens	4,096 on Opus 4.8

 One availability note operators will skim past: the June 9–22 included-in-plan window is a subscription thing — Pro, Max, Team, Enterprise seats. On the API the model is simply on, and the meter runs from request one. Plan-window strategy lives on the pricing page.
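Since the meter runs from request one, the pricing row is worth turning into arithmetic. A minimal sketch using only the per-million-token prices quoted on this page; the `Price` class and `request_cost` helper are illustrative names, and caching discounts are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_mtok: float   # USD per million input tokens
    output_per_mtok: float  # USD per million output tokens


# Prices as quoted on this page; update them if the published rates change.
PRICES = {
    "claude-fable-5": Price(10.0, 50.0),
    "claude-opus-4-8": Price(5.0, 25.0),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Uncached cost in USD for a single request."""
    price = PRICES[model]
    total = input_tokens * price.input_per_mtok + output_tokens * price.output_per_mtok
    return total / 1_000_000


# Example: a 200K-token prompt with an 8K-token answer on both models.
fable = request_cost("claude-fable-5", 200_000, 8_000)
opus = request_cost("claude-opus-4-8", 200_000, 8_000)
print(f"Fable 5: ${fable:.2f}  Opus 4.8: ${opus:.2f}")
```

Because both rates double, the ratio between the two models stays at 2× for any mix of input and output tokens.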

  Migrating from Opus 4.8 — and from older
  From Opus 4.8 or 4.7: the request surface is identical. Swap the model string, ship. Adaptive thinking carries over, the effort ladder carries over, your tool schemas carry over. The one exception is the new 400 below — and it only bites code that was explicitly disabling thinking.
 From 4.6 or older: the 4.7-era removals apply to you all at once. Three things in your old request will now return a 400:
  budget_tokens — gone. Adaptive thinking replaced manual thinking budgets; send thinking: {type: "adaptive"} and pick an effort level instead.
 temperature, top_p, top_k — removed. There is no sampling knob to tune on this surface.
 Last-assistant-turn prefills — rejected. If your prompt scaffold ends with a primed assistant turn, restructure it into the system prompt or the last user turn.
 
 None of these are Fable-specific — they're the Opus 4.7/4.8 surface, and Fable 5 inherits it unchanged. The head-to-head on whether the swap is worth 2× lives at Fable 5 vs Opus 4.8.
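The removals listed above can be applied mechanically to a request's keyword arguments before the model string is swapped. A rough sketch, assuming requests are built as plain dicts; `migrate_request` is a hypothetical helper, not part of the Anthropic SDK, and it raises on a prefill rather than guessing how to restructure your prompt.

```python
REMOVED_SAMPLING = ("temperature", "top_p", "top_k")


def migrate_request(kwargs: dict, model: str = "claude-fable-5") -> dict:
    """Return a copy of kwargs with the removed parameters stripped."""
    # Drop the sampling knobs that now return a 400.
    out = {k: v for k, v in kwargs.items() if k not in REMOVED_SAMPLING}
    out["model"] = model

    # Manual budgets and explicit "disabled" both become adaptive thinking.
    thinking = out.get("thinking")
    if isinstance(thinking, dict) and (
        "budget_tokens" in thinking or thinking.get("type") == "disabled"
    ):
        out["thinking"] = {"type": "adaptive"}

    # A trailing assistant turn is a prefill; flag it instead of rewriting it.
    messages = out.get("messages") or []
    if messages and messages[-1].get("role") == "assistant":
        raise ValueError(
            "last message is an assistant prefill; move it into the "
            "system prompt or the last user turn"
        )
    return out
```

The input dict is left untouched, so the helper can run in a dry-run pass that only logs what would change.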
 
 
  The one new 400
  Here is the single place where Fable 5's API diverges from Opus 4.8: an explicit thinking: {type: "disabled"} returns a 400. Opus 4.7 and 4.8 accept it. Fable 5 rejects it. If you don't want to send a thinking config, omit the parameter entirely — don't send disabled.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# before: Opus 4.8 (the same thinking config 400s on Fable 5)
resp = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=8192,
    thinking={"type": "disabled"},   # accepted on 4.7 / 4.8
    messages=[...],                  # your conversation turns
)

# after: Fable 5
resp = client.messages.create(
    model="claude-fable-5",
    max_tokens=128000,                   # the Fable 5 output ceiling
    thinking={"type": "adaptive"},       # or omit the param entirely
    output_config={"effort": "xhigh"},   # for agentic work
    messages=[...],                      # your conversation turns
)
Warning: Grep before you swap. Searching for `thinking.*disabled` across your codebase takes ten seconds and is the difference between a one-line migration and a 400 in production. It hides in the boring places: eval harnesses, cheap classification paths, anywhere someone once turned thinking off to save tokens.
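If grep isn't handy, the same search fits in a few lines of Python. A minimal sketch, assuming source files are UTF-8 text; the function name and the suffix list are illustrative choices, not a convention from any tool.

```python
import re
from pathlib import Path

# Same pattern as the grep: "thinking" followed later on the line by "disabled".
PATTERN = re.compile(r"thinking.*disabled")


def find_disabled_thinking(root, suffixes=(".py", ".ts", ".js", ".json", ".yaml", ".yml")):
    """Return (path, line number, line) for every matching line under root."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_disabled_thinking(".")` from the repo root returns the hits as a list. Like the grep, this matches one line at a time, so a config split across lines (the key on one line, "disabled" on the next) will slip past it.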
 
 
 
 Effort is the knob that survives: low / medium / high / xhigh / max, all supported. For long-horizon agentic work, xhigh is where the launch table's headline number lives — FrontierCode Diamond was reported at xhigh. For bulk extraction, drop the effort, not the model's thinking.
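Dropping the effort rather than the thinking looks like this in request-building code. A small sketch: the five level names and the `output_config` key come from the spec block, while the `with_effort` helper is hypothetical and the choice of "low" for bulk extraction is an assumption, not a recommendation from the launch material.

```python
EFFORT_LEVELS = ("low", "medium", "high", "xhigh", "max")


def with_effort(request: dict, effort: str) -> dict:
    """Return a copy of request with adaptive thinking and the given effort."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown effort {effort!r}; expected one of {EFFORT_LEVELS}")
    return {
        **request,
        "thinking": {"type": "adaptive"},     # never "disabled" on Fable 5
        "output_config": {"effort": effort},
    }


base = {"model": "claude-fable-5", "max_tokens": 8192}
agentic = with_effort(base, "xhigh")  # long-horizon agentic work
bulk = with_effort(base, "low")       # assumed setting for bulk extraction
```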
 
 
  The 2,048-token cache floor
  The quietest spec change is the one that touches your bill. The minimum cacheable prefix on Fable 5 is 2,048 tokens — on Opus 4.8 it's 4,096. A 3K-token system prompt with a cache_control breakpoint caches on Fable 5 and silently doesn't on Opus 4.8. No error either way. The invoice is the only witness.
At $10 per million input tokens, this floor matters more than it did at $5. Cache reads run at ~0.1× the input price, so a cached prefix that re-fires across an agent loop pays full frontier price once and roughly a tenth of it on each later turn, instead of full price every turn. If your prompts hover in the 2K–4K band, Fable 5 is the first Opus-class surface where they cache at all. The full token math is Ch 29's territory.
Note: Audit your breakpoints when you swap. A prefix tuned to clear Opus 4.8's 4,096-token floor may be carrying padding it no longer needs.
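The floor and the read discount combine into a quick back-of-envelope check. A sketch under stated assumptions: the floors and the ~0.1× read multiplier are the figures quoted on this page, the helper names are made up, and any cache-write surcharge or cache expiry is ignored.

```python
# Figures quoted on this page; the multiplier is approximate.
MIN_CACHEABLE_PREFIX = {"claude-fable-5": 2048, "claude-opus-4-8": 4096}
CACHE_READ_MULTIPLIER = 0.1


def prefix_caches(model: str, prefix_tokens: int) -> bool:
    """True if a prefix of this length clears the model's cache floor."""
    return prefix_tokens >= MIN_CACHEABLE_PREFIX[model]


def prefix_cost(prefix_tokens: int, turns: int, input_price_per_mtok: float, cached: bool) -> float:
    """USD spent on re-sending the same prefix across a number of turns."""
    full = prefix_tokens * input_price_per_mtok / 1_000_000
    if not cached or turns == 0:
        return full * turns
    # Simplification: first turn at full input price, later turns as cache reads.
    return full * (1 + (turns - 1) * CACHE_READ_MULTIPLIER)


# A 3K-token system prompt re-sent over a 20-turn agent loop.
for model, price in (("claude-fable-5", 10.0), ("claude-opus-4-8", 5.0)):
    cached = prefix_caches(model, 3000)
    print(model, cached, round(prefix_cost(3000, 20, price, cached), 3))
```

For prompts in the 2K–4K band this is the case where the 2× sticker can flip: the prefix caches on one model and not on the other.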
 
 
 
 
 
  Blocked requests + fallback — the wire behavior
  Fable 5 ships with classifiers in front of three areas — offensive cyber, biology/chemistry, capability distillation. On the Messages API the default behavior is clean: a blocked request returns an error, not a model response — and you aren't charged for it. Three ways to handle it:
  Opt in to fallback on the Messages API — blocked requests continue on Claude Opus 4.8, billed at Opus pricing ($5/$25), not Fable pricing.
 Roll your own — catch the error and re-route with the Anthropic SDK's support for exactly this.
 Managed Agents — fallback is built in; nothing to wire.
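The roll-your-own option reduces to a catch-and-re-route wrapper. A hedged sketch: this page does not say how a blocked request is signaled (exception class, status code, or error type), so `is_blocked` is a hypothetical predicate you supply, and `create` stands in for whatever callable sends the request, for example `client.messages.create`.

```python
from typing import Any, Callable, Tuple


def create_with_fallback(
    create: Callable[..., Any],
    is_blocked: Callable[[Exception], bool],
    request: dict,
    fallback_model: str = "claude-opus-4-8",
) -> Tuple[Any, str]:
    """Send request; if it is blocked, re-send it to fallback_model.

    Returns the response together with the model id that produced it.
    """
    try:
        return create(**request), request["model"]
    except Exception as exc:
        if not is_blocked(exc):
            raise  # rate limits, bad params, network errors: not ours to mask
    # Same request, different model string. Per this page the Opus 4.8
    # surface accepts everything Fable 5 accepts.
    retry = {**request, "model": fallback_model}
    return create(**retry), fallback_model
```

Returning the model id alongside the response keeps the routing visible, so logs and billing can tell which requests actually ran at Opus pricing.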
 
 Per Anthropic, more than 95% of Fable sessions involve no fallback at all. The operator read for API builders: this is routing information, not fine print. If your product is security tooling or bio-adjacent, expect Opus-4.8-grade answers on those paths by design — the starred rows in the launch table already priced this in. Route those workloads to Opus 4.8 directly at $5/$25 instead of eating a classifier round-trip to end up there anyway. The decision table is on the vs-Opus page.
 Two more lines of fine print that gate real deployments: Fable 5 requires a limited 30-day retention period — retained data is used only to detect and prevent serious misuse, never to train Claude — and admins must accept updated terms in the Claude Console before the model works. Budget five minutes for the Console before you budget an afternoon for the migration.
 
 
  The advisor seat — the cheapest way to buy Fable 5's judgment
The launch detail most API builders will skim past is the one with the best economics: Fable 5 is available as an advisor model. Faster, lower-cost worker models call it mid-task to check their plan and evaluate their work. That framing is Anthropic's, as is the claim that it improves performance. You don't put the $10/$50 model in the driver's seat; you put it at the judgment gate.
 If you've read this book, you've seen this shape before. It's the conductor-and-judge split from Ch 6 and the second-opinion loop from the swarm patterns — workers fan out on Sonnet-tier, one expensive model verifies plans and grades output. The difference is that it's now a first-party API primitive instead of an orchestration pattern you build yourself. Same logic as the Amdahl gate in the research notes: the bottleneck is judgment, so spend the frontier tokens on judgment.
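For readers who haven't built that shape, here is what the hand-rolled version of the pattern looks like. This is a sketch of the orchestration pattern only, not the first-party advisor primitive, whose request shape this page does not document; `worker` and `advisor` are hypothetical callables that take a prompt and return text, and the APPROVE convention is invented for the example.

```python
from typing import Callable


def run_with_advisor(
    task: str,
    worker: Callable[[str], str],
    advisor: Callable[[str], str],
    max_revisions: int = 2,
) -> str:
    """Let a cheap worker draft a plan and an expensive advisor gate it."""
    plan = worker(f"Draft a plan for: {task}")
    for _ in range(max_revisions):
        verdict = advisor(
            f"Task: {task}\nPlan: {plan}\n"
            "Reply APPROVE or give one concrete fix."
        )
        if verdict.strip().upper().startswith("APPROVE"):
            break
        # Frontier tokens are spent only on the verdict; the rewrite is cheap.
        plan = worker(
            f"Revise the plan for: {task}\nFeedback: {verdict}\nPrevious plan: {plan}"
        )
    return plan
```

The cap on revisions bounds how many advisor calls a single task can trigger, which is what keeps the expensive model at the gate rather than in the loop.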
 And the pairing Anthropic built for the long-horizon story: Claude Managed Agents, their harness for long-running agentic work, is now in public beta — Fable 5 works with it out of the gate, no changes needed. Fallback included.
 
 
  The swappable-id discipline — this launch was the test
  Ch 30 wrote this exact scenario before the model existed: a 34-line SDK-direct file where the next frontier model is a one-line swap — the cache control still works, the tool schema still works, the retry config still works. That chapter used a hypothetical id. Today the hypothetical has a real name: claude-fable-5.
 The other half of the prediction held too. SDK-direct paths can test Fable 5 today. Framework-shaped paths — CrewAI, LangGraph, the agent runtimes — wait for the framework to publish support: provider config, adapter updates, retry semantics for the new 400. That's structural lag, not Fable-specific lag, and it's the same lag Ch 30 flagged at the Mythos disclosure. Ch 2 makes the stack fixed lanes; Ch 30 makes the model inside each lane a variable — and this morning is what that discipline was for.
    A frontier launch should cost you a diff, not a project. One string, one param to delete, one new 400 — that's the whole bill. 
  
 
 If the model id is hardcoded in nine places across your stack, today is the day you find out. Fix that first, swap second, and let the tier list decide which workloads earn the 2× rate. The 400 will find you if you earned it.
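Fixing that first can be as small as one module that owns the id. A minimal sketch: the `CLAUDE_MODEL` environment variable and both helper names are hypothetical, chosen only for the example.

```python
import os

DEFAULT_MODEL = "claude-fable-5"


def model_id(env=os.environ) -> str:
    """Single source of truth for the model string; overridable per deploy."""
    return env.get("CLAUDE_MODEL", DEFAULT_MODEL)


def base_request(**overrides) -> dict:
    """Shared request defaults; call sites pass only what differs."""
    request = {
        "model": model_id(),
        "max_tokens": 8192,
        "thinking": {"type": "adaptive"},
    }
    request.update(overrides)
    return request
```

With this in place, routing a workload back to Opus 4.8 is an environment change or a single `model=` override, not a search across nine files.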
 
 
 
 
 
  The Fable 5 files
Fable 5 vs Mythos 5: one model, two names; the safeguards, the fallback, the gated twin.
Benchmarks, read honestly: all thirteen benchmarks, the starred-row caveat, and the reward-hacking discount.
Fable 5 vs Opus 4.8: upgrade or wait, the 2× sticker against the turn-count collapse.
Fable 5 vs GPT 5.5 vs Gemini 3.1 Pro: the cross-vendor read, including where the rivals' CLIs hold up.
Pricing + cost per task: $10/$50, the plan window, and the Ch 29 math on 2× stickers.
Use cases: Stripe's 50M-line day, Cursor, GitHub, trading desks, drug design, and the operator's own.
Fable 5 in Claude Code: the banner, the June 22 clock, /model, and when to route to it.
The API page: claude-fable-5, the one new 400, and the one-line migration from Opus 4.8.
 
 
Related: The Fable 5 hub · Ch 2 — the five-tool stack · Ch 30 — SDK-direct · Ch 29 — cost economics · The live tier list

