Models grouped by family

| Family | Model (API ID) | Score (all-time) | Runs | Last run |
|---|---|---|---|---|
| claude | claude-haiku-4-5 | No runs | 0 | — |
| claude | claude-haiku-4-5-20251001 | 44.4 / 100 | 3 | 9d ago |
| claude | claude-opus-4-6 | 69.5 / 100 | 6 | 10d ago |
| claude | claude-opus-4-7 | 67.8 / 100 | 6 | 10d ago |
| claude | claude-sonnet-4-6 | 65.0 / 100 | 3 | 10d ago |
| deepseek | deepseek/deepseek-v4-pro | 22.6 / 100 | 6 | 9d ago |
| gemini | gemini-3.1-pro-preview | No runs | 0 | — |
| gemini | gemini-2.0-flash | No runs | 0 | — |
| gemini | gemini-2.5-pro | No runs | 0 | — |
| gpt | gpt-4o | No runs | 0 | — |
| gpt | gpt-5 | No runs | 0 | — |
| gpt | gpt-5.4 | 60.9 / 100 | 3 | 9d ago |
| gpt | gpt-5.5 | 57.0 / 100 | 6 | 10d ago |
| grok | x-ai/grok-4.3 | 59.1 / 100 | 3 | 9d ago |
| qwen3.5 | qwen/qwen3.5-plus-20260420 | 38.4 / 100 | 3 | 9d ago |
| qwen3.6 | qwen/qwen3.6-35b-a3b | No runs | 0 | — |
| qwen3.6 | qwen/qwen3.6-max-preview | 56.1 / 100 | 3 | 9d ago |