Coding
How well models write, edit, and debug real code.
Last updated · May 08, 2026, 12:40 UTC
| Rank | Model | Provider | Composite | Note |
|---|---|---|---|---|
| 1 | Claude Opus 4.6 | Anthropic | 75.6 | limited |
| 2 | GLM-5 | Z.ai | 72.8 | limited |
| 3 | GPT-5 | OpenAI | 66.2 | |
| 4 | Qwen3-235B-A22B | Alibaba | 65.9 | limited |
| 5 | Claude Opus 4.5 | Anthropic | 64.9 | |
| 6 | Gemini 3 Pro | Google | 63.4 | |
| 7 | o4-mini | OpenAI | 63.2 | |
| 8 | o3 | OpenAI | 62.3 | |
| 9 | GPT-5.2 | OpenAI | 61.8 | |
| 10 | Gemini 3 Flash | Google | 60.7 | |
| 11 | Claude Sonnet 4.5 | Anthropic | 58.4 | |
| 12 | GPT-4.1 | OpenAI | 58.0 | |
| 13 | Grok 4 | xAI | 57.9 | limited |
| 14 | Doubao-Seed-Code | ByteDance | 57.2 | |
| 15 | Claude Sonnet 4 | Anthropic | 56.9 | |
| 16 | Claude Opus 4 | Anthropic | 56.6 | limited |
| 17 | GPT-5.1 | OpenAI | 56.3 | |
| 18 | Gemini 2.5 Pro | Google | 55.5 | |
| 19 | DeepSeek V3.2 | DeepSeek | 54.9 | |
| 20 | Kimi K2 | Moonshot AI | 53.5 | |
| 21 | Claude 3.7 Sonnet | Anthropic | 47.0 | |
| 22 | Claude 3.5 Sonnet | Anthropic | 44.7 | |
| 23 | DeepSeek R1 | DeepSeek | 40.6 | limited |
| 24 | GPT-5 Codex | OpenAI | 38.9 | limited |
| 25 | DeepSeek V3.2 Speciale | DeepSeek | 37.9 | limited |
| 26 | o3-mini | OpenAI | 36.8 | limited |
| 27 | DeepSeek V3 | DeepSeek | 31.3 | |
| 28 | Gemini 2.5 Flash | Google | 25.8 | |
| 29 | GPT-4o | OpenAI | 22.8 | |
| 30 | Grok 3 | xAI | 19.8 | limited |
| 31 | Llama 4 Maverick | Meta | 15.6 | limited |