Hey Snackers 👋
Anthropic dropped Claude Opus 4.7 last week — an upgrade to their flagship model — and this one’s worth a real look.
The headline: this is a coding model you can actually hand work off to. Not “hand it simple stuff and babysit it” — hand it hard, multi-step, hours-long tasks and walk away. At least, that’s the claim.
But there’s more going on under the hood than just better code. Better eyes, tighter instruction following, and a brand new approach to cybersecurity that’s quietly one of the most interesting things any AI lab has done this year.
Let’s break it down.
The Big Upgrade: Software Engineering
The main event with Opus 4.7 is coding. Anthropic says early testers were able to delegate their hardest, most supervision-heavy tasks to the new model with confidence — something they couldn’t reliably do with Opus 4.6.
What makes it different? Three things:
Self-verification. The model checks its own work before reporting back. It catches logical faults during planning and doesn’t just plow ahead hoping for the best.
Long-horizon consistency. Multi-step tasks that span many turns — the kind that used to degrade or lose the plot — now stay on track.
Benchmark results. State-of-the-art on SWE-bench (the standard software engineering benchmark), including multilingual and multimodal variants. On a 93-task coding benchmark, its task resolution rate improved 13% over Opus 4.6 — including 4 tasks that no previous Claude model could solve.
One early tester put it well: “Low-effort Opus 4.7 is roughly equivalent to medium-effort Opus 4.6.” Translation: it’s doing more with less thinking time.
4 More Things Worth Knowing
1. 3x Better Vision
Opus 4.7 accepts images up to 2,576 pixels on the long edge (~3.75 megapixels). That’s 3x the resolution of previous Claude models. If you’re using Claude for reading screenshots, extracting data from diagrams, or any computer-use agent work — this is a big deal. Heads up: higher-res images eat more tokens.
2. Instruction Following Got Too Good
This one’s interesting. Opus 4.7 takes instructions much more literally than previous models. Anthropic actually warns that prompts written for older versions might produce unexpected results — because older models would loosely interpret or skip steps. If you’re migrating from 4.6, re-tune your prompts.
3. Better Memory Across Sessions
The model got better at using file system-based memory, meaning it remembers context across multi-session projects. Less “let me re-explain everything” at the start of each chat.
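The file-system memory pattern is easy to picture: the agent writes notes to disk during one session and reads them back at the start of the next. This is a hypothetical sketch of that pattern (the file name and JSON schema are mine, not Claude's actual memory format):

```python
import json
from pathlib import Path

# Hypothetical sketch of file-system memory: persist key/value notes
# between sessions in a JSON file. Path and schema are illustrative,
# not Claude's real on-disk format.
MEMORY_FILE = Path("project_memory.json")

def save_note(key: str, value: str) -> None:
    """Record a fact so a later session can pick it up."""
    notes = json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else {}
    notes[key] = value
    MEMORY_FILE.write_text(json.dumps(notes, indent=2))

def recall(key: str, default: str = "") -> str:
    """Look up a previously saved fact, or fall back to a default."""
    if not MEMORY_FILE.exists():
        return default
    return json.loads(MEMORY_FILE.read_text()).get(key, default)
```

The point of the upgrade is that the model now uses this kind of scratch space more reliably, so session two starts where session one left off.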
4. Tastier Creative Output
Anthropic says the model produces higher-quality interfaces, slides, and docs. Less “AI-looking,” more polished. This matters if you’re using Claude for anything client-facing.
Pricing & Availability
Price: $5/M input tokens, $25/M output tokens (same as Opus 4.6)
API string: claude-opus-4-7
Available on: Claude.ai, API, Amazon Bedrock, Google Vertex AI
Tokenizer change: Updated — the same text may use 1.0–1.35x as many tokens as on 4.6
Migration note: The tokenizer change means your costs could creep up even at the same per-token price. Anthropic recommends measuring real-world token usage on your workloads before fully switching over.
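You can bound the cost impact with back-of-the-envelope math from the numbers above ($5/M input, $25/M output, 1.0–1.35x tokenizer inflation). The function is my own illustration; only the prices and the inflation range come from the release notes.

```python
def run_cost_usd(input_tokens: int, output_tokens: int,
                 inflation: float = 1.0) -> float:
    """Cost at Opus 4.7 list prices ($5/M input, $25/M output),
    with an optional tokenizer-inflation factor (1.0-1.35x per the
    release notes). Illustrative estimate, not an official calculator."""
    return (input_tokens * inflation * 5 +
            output_tokens * inflation * 25) / 1_000_000

# Same per-token price, but worst-case tokenizer inflation adds 35% to the bill:
base = run_cost_usd(10_000_000, 2_000_000)         # $100.00
worst = run_cost_usd(10_000_000, 2_000_000, 1.35)  # $135.00
print(base, worst)
```

In other words, a workload that costs $100 today could land anywhere up to $135 after the switch, which is exactly why Anthropic suggests measuring real token usage first.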
Also launched alongside Opus 4.7:
xhigh effort level — a new setting between “high” and “max” for finer control over reasoning depth vs. speed
Task budgets (public beta) — guide Claude’s token spending across longer runs
/ultrareview in Claude Code — a dedicated review session that audits your code for bugs and design issues
Auto mode for Claude Code Max — lets Claude make permission decisions on your behalf for longer, uninterrupted task runs
The Bigger Picture
Three trends converging here:
1. The “unsupervised coding” bar keeps rising. Six months ago, handing an AI a complex multi-file refactor and walking away was a fantasy. Now the top models are genuinely getting there. Opus 4.7’s self-verification isn’t a gimmick — it’s the feature that makes long-running agentic work practical.
2. Capability shaping is the new safety conversation. Forget content filters and guardrails. Anthropic is experimenting with making models structurally less capable in dangerous domains. That’s a fundamentally different approach than “add a warning label.” If it works, expect every lab to follow.
3. The model treadmill is exhausting — but the gaps matter. Yes, there’s a new model every week. But Opus 4.7 vs. 4.6 isn’t incremental — 13% better on hard coding tasks, 3x vision resolution, and a new safety architecture. If you’re building on Claude, this is a real upgrade.
Should You Care?
If you code with Claude daily: Yes. Full stop. The self-verification and long-horizon improvements are real. Start with your hardest tasks and see if the claims hold up. Re-tune your prompts — the instruction following is tighter.
If you’re building AI agents: The vision upgrade (3x resolution) and better memory across sessions make computer-use agents significantly more viable. Worth testing.
If you’re watching the AI safety space: The Project Glasswing / capability shaping approach is a genuinely new move. Agree or disagree with it, but pay attention — this is likely where the industry is heading.
🔗 Anthropic’s official announcement | Project Glasswing | Cyber Verification Program
💬 Hit reply and tell me: are you switching to 4.7 immediately or waiting? I read every response.
Until next time,
John