Models

Cursor launches Composer 2.5

A strong model: near Anthropic Opus-level performance at a lower cost per task. An important launch if Cursor is to remain one of the leading developer tools alongside Codex and Claude Code.

Published 18 May 2026 · 7 min read

On May 18, Cursor released Composer 2.5, its latest in-house coding model, built for long agent tasks in the editor. It is both a product upgrade and a response to competitive pressure that has been building as Codex and Claude Code have gained users.

Cursor's own numbers place Composer 2.5 close to leading models such as Anthropic Opus 4.7 on Terminal-Bench 2.0, ahead of GPT-5.5 on SWE-Bench Multilingual, and clearly above its predecessor, Composer 2. Cursor now has a near-frontier model in-house that it can serve directly to users.

Figure: Composer 2.5 benchmark results table for Terminal-Bench 2.0, SWE-Bench Multilingual, and CursorBench, from Cursor's launch post. Anthropic Opus 4.7 and GPT-5.5 use self-reported scores on some public evals. Source: Cursor blog, May 18, 2026.

On Cursor's price-versus-score chart, the picture is clear. Composer 2.5 lands at about 63% on CursorBench, Cursor's own benchmark for coding models, for roughly $0.50 per task. Quality sits close to frontier models, but a solved task costs less.

Figure: CursorBench score versus average cost per task for Composer 2.5, Composer 2, Anthropic Opus 4.7, and GPT-5.5. On this efficiency curve, Composer 2.5 scores about 63% on CursorBench at roughly $0.50 per task. Source: Cursor blog, May 18, 2026.
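One way to read the chart is to fold score and price into a single number: the expected cost of one solved task. The sketch below is a back-of-the-envelope calculation, not Cursor's methodology. It assumes the CursorBench score can be treated as the fraction of tasks solved and that the average cost applies to every attempted task, solved or not. The function name is made up for illustration, and only the approximate Composer 2.5 figures quoted above are used.

```python
def cost_per_solved_task(avg_cost_per_task: float, solve_rate: float) -> float:
    """Expected spend per solved task.

    Assumption: every attempted task costs avg_cost_per_task on average,
    and a fraction solve_rate of attempts ends in a solved task.
    """
    if not 0 < solve_rate <= 1:
        raise ValueError("solve_rate must be in the interval (0, 1]")
    return avg_cost_per_task / solve_rate


# Approximate figures from the article: about 63% at roughly $0.50 per task.
composer_25 = cost_per_solved_task(avg_cost_per_task=0.50, solve_rate=0.63)
print(f"Composer 2.5: about ${composer_25:.2f} per solved task")
```

With those inputs the estimate comes to roughly $0.79 per solved task. Running the same function on another model's score and cost would give a like-for-like comparison, provided both figures come from the same chart.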

From the start, the incentive for Cursor to build its own models was subscription economics. The editor sold agent access to developers but paid standard API prices whenever the model behind a run came from OpenAI or Anthropic. Codex and Claude Code come from OpenAI and Anthropic themselves, which can bundle far more API usage into their plans. On Anthropic's Max plan, which starts at $100, heavy Claude Code use can translate to around $5,000 in API spend. Pay Cursor around $100 a month and, without its own Composer models to choose from, that covers only about $100 worth of usage.
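The gap in that comparison is easier to see as a ratio of API-equivalent usage to subscription price. The snippet below is a rough illustration using only the approximate figures quoted above. The helper name is hypothetical, and the real amount of bundled usage varies with how heavily a plan is used.

```python
def usage_multiple(monthly_price: float, api_equivalent_usage: float) -> float:
    """Dollars of API-equivalent usage obtained per subscription dollar."""
    if monthly_price <= 0:
        raise ValueError("monthly_price must be positive")
    return api_equivalent_usage / monthly_price


# Approximate figures from the article, not official pricing data.
claude_code_max = usage_multiple(monthly_price=100, api_equivalent_usage=5_000)
cursor_third_party = usage_multiple(monthly_price=100, api_equivalent_usage=100)

print(f"Claude Code on Max: about {claude_code_max:.0f}x")
print(f"Cursor with third-party models: about {cursor_third_party:.0f}x")
```

On these numbers, a heavy Claude Code user gets about 50 times the plan price in API-equivalent usage, while the same spend on Cursor with third-party models returns about 1 times.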

Composer 1 was the first answer. Composer 2 raised the ceiling. Composer 2.5 is the launch that makes Cursor competitive again.

Under the hood, Composer 2.5 still builds on Moonshot's open-source Kimi K2.5 family, like Composer 2 before it. Cursor CEO Michael Truell also pointed to the next step: a partnership with SpaceX to train a much larger model from scratch on Colossus 2, Elon Musk's AI compute cluster. That is separate from the Composer 2.5 developers can use today, but it follows the same logic. Own the model, own the economics, own the performance curve.