Models grouped by family

| Model | API ID | Score (all-time) | Runs | Last run |
|---|---|---|---|---|
| claude | | | | |
| | claude-fable-5 | 79.8 / 100 | 3 | 9d ago |
| | claude-haiku-4-5 | No runs | 0 | — |
| | claude-haiku-4-5-20251001 | 44.7 / 100 | 6 | 21d ago |
| | claude-opus-4-6 | 70.0 / 100 | 9 | 1mo ago |
| | claude-opus-4-7 | 69.2 / 100 | 12 | 20d ago |
| | claude-opus-4-8 | 67.8 / 100 | 6 | 20d ago |
| | claude-sonnet-4-6 | 64.3 / 100 | 9 | 21d ago |
| deepseek | | | | |
| | deepseek/deepseek-v4-pro | 22.6 / 100 | 6 | 1mo ago |
| gemini | | | | |
| | gemini-3.1-pro-preview | 73.9 / 100 | 3 | 20d ago |
| | gemini-3.5-flash | 57.5 / 100 | 6 | 20d ago |
| | gemini-2.0-flash | No runs | 0 | — |
| | gemini-2.5-pro | No runs | 0 | — |
| gpt | | | | |
| | gpt-4o | No runs | 0 | — |
| | gpt-5 | No runs | 0 | — |
| | gpt-5.4 | 60.9 / 100 | 3 | 1mo ago |
| | gpt-5.5 | 58.6 / 100 | 9 | 1mo ago |
| grok | | | | |
| | x-ai/grok-4.3 | 59.1 / 100 | 3 | 1mo ago |
| qwen3.5 | | | | |
| | qwen/qwen3.5-plus-20260420 | 38.4 / 100 | 3 | 1mo ago |
| qwen3.6 | | | | |
| | qwen/qwen3.6-35b-a3b | No runs | 0 | — |
| | qwen/qwen3.6-max-preview | 56.1 / 100 | 3 | 1mo ago |