Aider vs Claude Code (May 2026): Token Cost vs Agent Loop
Aider wins on token efficiency (4.2x fewer tokens per task), git-first discipline, and OSS terminal-native simplicity. Claude Code wins on agent loop depth, subagents, and the skills/hooks ecosystem. Honest tradeoffs, with sources.
Who wins at what
| Dimension | Winner |
| --- | --- |
| Token cost per task | Aider |
| Subagent parallelization | Claude Code |
| Git-first commit discipline | Aider |
| Skills, hooks, and plugins ecosystem | Claude Code |
| Works with any LLM provider | Aider |
| Long-horizon autonomous runs | Claude Code |
| Pair-programming feel | Aider |
| Multi-step research and verification | Claude Code |
Aider is the terminal pair-programmer that takes git seriously and treats every change as a commit. Claude Code is the agent platform with subagents, hooks, skills, and a 1M-context Opus model. They share a terminal, an LLM call, and a love of CLI minimalism — and they're optimized for fundamentally different workflows.
The honest framing: aider is a sharp knife. Claude Code is a kitchen.
Who wins at what
Aider wins on token cost per task (the 4.2x figure from morphllm is reproducible), git-first discipline (every change is a commit, every commit has a message), provider flexibility (use Claude, GPT, DeepSeek, Gemini, or local models), and pair-programming feel. Claude Code wins on subagents and parallelization, the skills/hooks/plugins ecosystem, long autonomous runs, and multi-step verification work.
The table above isn't hedged — each row has a specific reason.
Where aider wins
Token cost. Aider's SEARCH/REPLACE edit format and tree-sitter repo map are dramatically more efficient than Claude Code's tool-call-based editing on equivalent tasks. The morphllm.com benchmark reports 4.2x fewer tokens per task. On a $40 Claude Code refactor, that's a $10 aider equivalent. Multiply by your team and the savings compound. The catch: this holds for edit-shaped work; long agentic runs with multi-step research narrow the gap.
Git-first discipline. Every aider change is a git commit with a message, a constraint Paul Gauthier's design intentionally enforces. This is a feature, not friction. Reverting an aider session is `git reset --hard HEAD~5`. Reviewing what changed is `git log --oneline`. The discipline keeps both you and the LLM honest. Claude Code can be configured this way via hooks, but aider is this way by default.
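Those two commands are the whole review/rollback story. A minimal sketch in a throwaway repo (the three commits below stand in for one aider session's auto-commits, not real aider output):

```shell
# Sketch: the review/rollback loop that aider's commit-per-change gives you.
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
git commit -q --allow-empty -m "baseline"
for i in 1 2 3; do
  echo "change $i" >> app.py
  git add app.py
  git commit -q -m "aider: change $i"   # aider writes these messages for you
done
git log --oneline        # review what the session did
git reset --hard HEAD~3  # undo the whole session in one command
```

Because every change is a commit, "undo the session" is ordinary git, not a tool-specific feature.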
Provider flexibility. Aider supports Claude (any model), GPT (any model), DeepSeek, Gemini, o3, Qwen, and local models via Ollama. Paul Gauthier maintains the polyglot leaderboard with weekly updates. As of May 2026 the top configs are o3-pro (85%), o3-high + gpt-4.1 (83%), and R1 + Sonnet architect-editor (64%). If you want to use whatever model is best this week, aider lets you. Claude Code is Anthropic-only by design.
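Switching providers is a flag, not a migration. A sketch; the model aliases below are illustrative and date quickly, so check aider's model docs and the leaderboard for current names (the `command -v` guard just makes the snippet a no-op where aider isn't installed):

```shell
# Illustrative only: the same workflow against different providers via --model.
command -v aider >/dev/null || exit 0           # no-op if aider isn't installed
aider --model claude-3-5-sonnet-latest app.py   # Anthropic
aider --model deepseek/deepseek-chat app.py     # DeepSeek
aider --model gemini/gemini-2.0-flash app.py    # Gemini
aider --model ollama_chat/qwen2.5-coder app.py  # local, via Ollama
```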
Pair-programming feel. Aider's UX is: you sit in a terminal, you describe a change in natural language, aider proposes a diff with a commit message, you accept or reject. The loop is tight, deterministic, and feels like working with a fast colleague. Claude Code's loop is more agentic — propose, plan, execute, verify — which is more powerful but less interactive.
The architect/editor pattern. Aider's architect mode runs two models: one plans the change (slower, smarter), another executes it (faster, cheaper). Paul Gauthier publishes SOTA configs for this — R1 + Sonnet, o3-high + gpt-4.1, o3-pro. The token-cost / quality tradeoff is uniquely aider's strength.
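In CLI terms the split is two flags. A sketch under assumptions: the `--architect` and `--editor-model` flags are aider's, but the model pairing and file path here are examples, not a recommendation:

```shell
# Illustrative: the planning model reasons about the change, the editor
# model writes the SEARCH/REPLACE blocks. Pairing and path are examples.
command -v aider >/dev/null || exit 0  # no-op if aider isn't installed
aider --architect \
      --model o3 \
      --editor-model gpt-4.1 \
      src/payments.py
```

The design choice: you pay reasoning-model prices only for the plan, and cheaper-model prices for the mechanical edits.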
Where Claude Code wins
Subagents and parallelization. Claude Code's Task tool spawns isolated-context subagents. This is the right primitive for "audit these five flows in parallel" or "research these three SDKs and report back." Aider doesn't have this — it's a single-session pair programmer. For multi-track work, Claude Code is structurally faster.
The skills, hooks, and plugins ecosystem. ~/.claude/skills and project-level .claude/skills, plugins with skill+agent+hook bundles, PreToolUse/PostToolUse/Stop hooks, settings.json permission rules — Claude Code has a real platform around it. Obra's Superpowers (120k+ stars), Boris Cherny's CLAUDE.md examples, the wshobson/agents and wshobson/commands repos, claudemarketplaces.com's 4,000+ skills — none of this exists for aider. Aider has CONVENTIONS.md, and that is most of its platform.
Long-horizon autonomous runs. Claude Code's Stop hook can refuse to end a session while verification fails. Subagents work in parallel. CLAUDE.md persists across sessions. Skills auto-trigger based on the task description. This is the right shape for "run unattended for 4-8 hours" — and aider, by design, doesn't try to be this.
Multi-step verification. Claude Code's hook lifecycle makes it natural to bolt deterministic checks onto a probabilistic agent. The "verification-first workflow" pattern (HN 46934254) is a Claude Code idiom. Aider's verification model is git: commit, test, revert. Both work; Claude Code's is more flexible for non-test verification.
Enterprise-grade settings. Managed settings via enterprise scope, deny rules, permission allowlists, plugin marketplaces — Claude Code has the primitives for org-wide deployment. Intercom's Brian Scanlan publicly described their internal Claude Code platform: "13 plugins, 100+ skills, hooks that turn Claude into a full-stack engineering platform." Aider is one developer's tool by design.
Where the comparison gets uncomfortable
Both have a benchmark problem. Aider has the polyglot leaderboard, which is authoritative but tests models, not the agent layer around them. Claude Code has no public benchmark of equivalent rigor. Comparing them on "which writes better code" is a model question (Opus 4.7 vs whatever you run aider on); comparing them on "which solves the task end-to-end" is what nobody has rigorously measured.
Both are fast-moving targets. Aider added CONVENTIONS.md support in 2024, the polyglot benchmark in 2024, architect mode in early 2025, and there's an open issue (#4363) to adopt AGENTS.md. Claude Code shipped skills in late 2025, plugins in October 2025, Opus 4.7 in late 2025, and 1M context in April 2026. Any specific feature gap you see today might close next month.
The 4.2x token figure is contested in long-horizon work. On edit-shaped tasks, aider's diff format wins decisively. On research tasks with many tool calls — web search, reading docs, multi-file synthesis — Claude Code's agent loop is structurally better suited, even if more expensive. Pick the right tool for the task shape, not for the headline number.
When to use both together
The pattern that holds up for sophisticated terminal users:
- Aider for committed-quality code edits — the 80% of work where the change is "modify these three files in this specific way" and you want it as a clean git commit.
- Claude Code for the 20% of work that's agentic — multi-file refactors, audits, overnight runs, research, hook-enforced verification.
- AGENTS.md as the shared config — aider reads it via `/read AGENTS.md`, Claude Code reads it via the symlink workaround. (Aider's issue #4363 tracks native support.)
- Same CONVENTIONS content in both — if you wrote rules for one, they probably work in the other.
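The pattern above takes a few lines to wire up. A minimal sketch, assuming AGENTS.md is your source of truth; the Stop-hook JSON follows Claude Code's documented hooks schema, and `npm test` is a placeholder for whatever verification command you actually run:

```shell
# Demo in a temp dir; in a real repo you'd run this from the repo root.
set -e
demo=$(mktemp -d); cd "$demo"

# One source of truth for both tools.
printf '# Shared rules for any coding agent\n' > AGENTS.md

# Claude Code reads CLAUDE.md at session start, so symlink it to the
# shared file. (aider side: /read AGENTS.md, until issue #4363 lands.)
ln -sf AGENTS.md CLAUDE.md

# Optional Stop hook: Claude Code refuses to end the session while the
# command fails. "npm test" is a placeholder verification command.
mkdir -p .claude
cat > .claude/settings.json <<'EOF'
{
  "hooks": {
    "Stop": [
      { "hooks": [ { "type": "command", "command": "npm test" } ] }
    ]
  }
}
EOF
```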
Where this comparison fails / what we don't know
We didn't benchmark token consumption on identical tasks across both tools. The 4.2x figure comes from morphllm's diff benchmark — a real measurement, but on a specific task class. A full end-to-end benchmark of "implement this feature, with tests, from a spec" hasn't been published for either tool at fair settings. Treat token-cost claims as directional, not exact.
We also don't know how aider evolves if/when AGENTS.md adoption lands (issue #4363 has been open since late 2025). If aider natively reads AGENTS.md, the cross-tool story gets cleaner and one of Claude Code's ecosystem advantages narrows.
What to read next
- /topic/aider-conventions — what goes in CONVENTIONS.md and the architect/editor presets
- /topic/claude-md — what goes in CLAUDE.md (and why HumanLayer keeps it under 60 lines)
- /topic/agents-md — the cross-tool config that's becoming the source of truth
- /for/aider — install aider + curated CONVENTIONS via RuleSell
- /for/claude-code — install Claude Code + curated skills via RuleSell
Sources
- Paul Gauthier. "aider polyglot leaderboard."
- Paul Gauthier. "Architect/editor SOTA configs."
- morphllm. "Aider vs Claude Code diff benchmark — 4.2x token efficiency."
- Aider AGENTS.md adoption — GitHub issue #4363
- Brian Scanlan (Intercom). "13 plugins, 100+ skills."
- Boris Cherny. "Opus 4.7 in Claude Code — better at long-running work."
- "Verification-first workflow plugin for Claude Code"
- "Claude Code vs aider HN thread."
Frequently asked
- Is Aider really 4.2x more token-efficient than Claude Code?
- The 4.2x figure comes from [morphllm.com's diff benchmark](https://www.morphllm.com/comparisons/morph-vs-aider-diff), comparing aider's tree-sitter-based repo map and SEARCH/REPLACE edit format against Claude Code's tool-call-based file editing on equivalent tasks. The number is reproducible on similar refactoring tasks but it's not universal — long agentic runs with multiple verification steps narrow the gap. The honest framing: aider is meaningfully cheaper for the same edit, but not 4.2x cheaper on every workflow.
- Does Aider work with Claude models?
- Yes. Aider supports Claude (Opus, Sonnet, Haiku), GPT-5.4, DeepSeek, Gemini, o3, and basically any LLM with a chat API. Paul Gauthier maintains the polyglot leaderboard at [aider.chat/docs/leaderboards](https://aider.chat/docs/leaderboards/) — the top configs as of May 2026 are o3-pro (85%), o3-high + gpt-4.1 (83%), and R1 + Sonnet architect-editor (64%). Claude Code is Anthropic-only by design.
- Can I use aider's CONVENTIONS.md with Claude Code?
- Not directly — aider reads files via /read or --read-only flags; Claude Code reads CLAUDE.md at session start. But the content is portable: the rules and conventions you write for one work in the other. Many teams keep AGENTS.md as the source of truth, symlink it to CLAUDE.md for Claude Code, and pass /read AGENTS.md to aider.
- Does Claude Code have an equivalent of aider's architect/editor mode?
- Partially. Claude Code's subagent pattern (one Opus-level agent dispatches to Sonnet-level workers via the Task tool) is conceptually similar to aider's architect/editor split (one model plans, another executes). The mechanics differ: aider runs two distinct API calls with explicit roles; Claude Code dispatches subagents inside one session with shared context. Both produce the same outcome — a planning model that doesn't waste tokens on execution detail.
- Which one should I use for autonomous overnight runs?
- Claude Code, by a clear margin. Its hooks system (PreToolUse, PostToolUse, Stop) makes it safe to leave unattended — Stop hooks can refuse session end until verification passes. Aider doesn't have an equivalent supervision primitive. For 8-hour autonomous research or audit runs, Claude Code is the only realistic choice between these two.
- Why is aider's polyglot benchmark so widely cited?
- Paul Gauthier updates it weekly with reproducible scores across 225+ coding tasks in multiple languages. The methodology is open. It's effectively the canonical model-comparison benchmark for coding work — every other 'X vs Y' coding benchmark either copies aider's setup or is dismissed. The benchmark is the source of authority that makes aider's vendor moat unusually strong on 'aider X' SERPs.