Run 9a9a4d0e-ca6…
completed 2c09af0e-90e1-4d79-8f05-bb867284cf1e
Run success rate
Share of the tasks attempted in this run that the run solved on its last attempt.
Formula: COUNT(distinct tasks where last attempt passed) / COUNT(distinct tasks attempted in this run)
Per-run metric for the model's "final answer" on each task. It differs from the leaderboard pass_at_n in its denominator: this metric divides by the number of tasks the run actually attempted, not by the size of the task set, so partial runs are not penalised for unattempted tasks.
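The formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Attempt` record, its field names, and the task IDs `T1` to `T3` are hypothetical, and what counts as "passed" is left as a boolean because the page does not define it.

```python
from typing import Iterable, NamedTuple


class Attempt(NamedTuple):
    """Hypothetical shape of one results row (not the dashboard's schema)."""
    task_id: str
    attempt: int   # 1-based attempt number within the run
    passed: bool   # assumed pass flag; the pass criterion is not specified here


def run_success_rate(rows: Iterable[Attempt]) -> float:
    """Distinct tasks whose last attempt passed / distinct tasks attempted."""
    last_by_task: dict[str, Attempt] = {}
    for row in rows:
        prev = last_by_task.get(row.task_id)
        # Keep only the highest-numbered attempt for each task.
        if prev is None or row.attempt > prev.attempt:
            last_by_task[row.task_id] = row
    if not last_by_task:
        return 0.0  # assumption: an empty run reports 0 rather than raising
    solved = sum(1 for r in last_by_task.values() if r.passed)
    return solved / len(last_by_task)


# Illustrative data only: T1 passes first try, T2 passes on retry, T3 never passes.
rows = [
    Attempt("T1", 1, True),
    Attempt("T2", 1, False),
    Attempt("T2", 2, True),
    Attempt("T3", 1, False),
    Attempt("T3", 2, False),
]
rate = run_success_rate(rows)  # 2 of 3 attempted tasks solved on the last attempt
```

Tasks that are in the task set but absent from `rows` never enter the denominator, which is the difference from pass_at_n described above.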
Avg attempt score
Mean per-attempt score on a 0–100 point scale (partial credit). Drill-down only.
Formula: Mean of attempt scores across all results rows: SUM(score) / COUNT(*) over the results table. Each attempt earns 0–100 points based on compile + test outcomes.
Drill-down companion to pass_at_n. Because it rewards partial credit, it is not directly comparable to a pass rate; use it for within-model analysis.
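As a minimal sketch of the SUM(score) / COUNT(*) formula, assuming the scores are available as a plain list of floats: the function name and the sample values are illustrative, not taken from the dashboard's code. Every attempt row counts, not only the last attempt per task, and how each 0–100 score is derived from compile and test outcomes is not modelled here.

```python
from typing import Iterable, Optional


def avg_attempt_score(scores: Iterable[float]) -> Optional[float]:
    """Mean of per-attempt scores: SUM(score) / COUNT(*) over all rows."""
    values = list(scores)
    if not values:
        return None  # assumption: no rows means the metric is undefined
    return sum(values) / len(values)


# Illustrative values only: one full pass, one compile failure, one partial credit.
sample_scores = [100.0, 0.0, 62.5]
mean_score = avg_attempt_score(sample_scores)  # 162.5 / 3
```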
| Task | Difficulty | Attempt | Score | Tests | Compile | Duration | |
|---|---|---|---|---|---|---|---|
| CG-AL-E001 | easy | 1 | 100.0 / 100 | 7/7 | OK | 3m 8s | |
| CG-AL-E002 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 21m 32s | |
| CG-AL-E003 | easy | 1 | 100.0 / 100 | 5/5 | OK | 2m 50s | |
| CG-AL-E004 | easy | 2 | 100.0 / 100 | 6/6 | OK | 3m 48s | |
| CG-AL-E005 | easy | 2 | 100.0 / 100 | 13/13 | OK | 4m 10s | |
| CG-AL-E006 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 1m 45s | |
| CG-AL-E007 | easy | 1 | 100.0 / 100 | 7/7 | OK | 3m 55s | |
| CG-AL-E008 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 1m 47s | |
| CG-AL-E009 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 1m 40s | |
| CG-AL-E010 | easy | 2 | 100.0 / 100 | 5/5 | OK | 3m 7s | |
| CG-AL-E031 | easy | 1 | 100.0 / 100 | 3/3 | OK | 4m 57s | |
| CG-AL-E032 | easy | 1 | 100.0 / 100 | 1/1 | OK | 4m 36s | |
| CG-AL-E045 | easy | 1 | 100.0 / 100 | 4/4 | OK | 3m 13s | |
| CG-AL-E050 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 1m 21s | |
| CG-AL-E051 | easy | 2 | 62.5 / 100 | 3/15 | OK | 4m 47s | |
| CG-AL-E052 | easy | 2 | 100.0 / 100 | 16/16 | OK | 3m 47s | |
| CG-AL-E053 | easy | 1 | 100.0 / 100 | 3/3 | OK | 3m 15s | |
| CG-AL-E054 | easy | 2 | 100.0 / 100 | 9/9 | OK | 3m 21s | |
| CG-AL-E055 | easy | 1 | 100.0 / 100 | 8/8 | OK | 3m 7s | |
| CG-AL-H001 | easy | 1 | 100.0 / 100 | 25/25 | OK | 3m 22s | |
| CG-AL-H002 | easy | 1 | 100.0 / 100 | 4/4 | OK | 3m 36s | |
| CG-AL-H003 | easy | 2 | 100.0 / 100 | 5/5 | OK | 3m 47s | |
| CG-AL-H004 | easy | 2 | 100.0 / 100 | 14/14 | OK | 3m 20s | |
| CG-AL-H005 | easy | 2 | 62.5 / 100 | 2/5 | OK | 4m 51s | |
| CG-AL-H006 | easy | 2 | 100.0 / 100 | 6/6 | OK | 2m 58s | |
| CG-AL-H007 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 1m 56s | |
| CG-AL-H008 | easy | 1 | 100.0 / 100 | 10/10 | OK | 3m 29s | |
| CG-AL-H009 | easy | 1 | 100.0 / 100 | 11/11 | OK | 3m 46s | |
| CG-AL-H010 | easy | 1 | 100.0 / 100 | 8/8 | OK | 3m 29s | |
| CG-AL-H011 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 58.8s | |
| CG-AL-H013 | easy | 1 | 100.0 / 100 | 9/9 | OK | 3m 47s | |
| CG-AL-H014 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 56.4s | |
| CG-AL-H015 | easy | 2 | 100.0 / 100 | 4/4 | OK | 3m 12s | |
| CG-AL-H016 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 47.2s | |
| CG-AL-H017 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 1m 33s | |
| CG-AL-H018 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 1m 18s | |
| CG-AL-H019 | easy | 2 | 100.0 / 100 | 5/5 | OK | 3m 27s | |
| CG-AL-H020 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 1m 58s | |
| CG-AL-H021 | easy | 2 | 62.5 / 100 | 6/20 | OK | 3m 48s | |
| CG-AL-H022 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 2m 10s | |
| CG-AL-H023 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 4m 0s | |
| CG-AL-H024 | easy | 2 | 62.5 / 100 | 8/9 | OK | 4m 29s | |
| CG-AL-H025 | easy | 2 | 100.0 / 100 | 7/7 | OK | 4m 1s | |
| CG-AL-H026 | easy | 2 | 100.0 / 100 | 8/8 | OK | 3m 14s | |
| CG-AL-H205 | easy | 2 | 100.0 / 100 | 6/6 | OK | 3m 45s | |
| CG-AL-M001 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 1m 45s | |
| CG-AL-M002 | easy | 2 | 100.0 / 100 | 22/22 | OK | 3m 47s | |
| CG-AL-M003 | easy | 2 | 62.5 / 100 | 8/9 | OK | 3m 52s | |
| CG-AL-M004 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 1m 11s | |
| CG-AL-M005 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 2m 25s | |
| CG-AL-M006 | easy | 2 | 62.5 / 100 | 15/18 | OK | 4m 5s | |
| CG-AL-M007 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 2m 28s | |
| CG-AL-M008 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 2m 44s | |
| CG-AL-M009 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 2m 25s | |
| CG-AL-M010 | easy | 2 | 62.5 / 100 | 19/21 | OK | 6m 32s | |
| CG-AL-M020 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 2m 45s | |
| CG-AL-M021 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 3m 1s | |
| CG-AL-M022 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 1m 39s | |
| CG-AL-M023 | easy | 1 | 100.0 / 100 | 11/11 | OK | 4m 25s | |
| CG-AL-M024 | easy | 2 | 100.0 / 100 | 10/10 | OK | 3m 49s | |
| CG-AL-M025 | easy | 2 | 62.5 / 100 | 2/7 | OK | 4m 4s | |
| CG-AL-M026 | easy | 1 | 100.0 / 100 | 8/8 | OK | 4m 6s | |
| CG-AL-M088 | easy | 1 | 100.0 / 100 | 6/6 | OK | 3m 38s | |
| CG-AL-M112 | easy | 2 | 62.5 / 100 | 2/4 | OK | 4m 15s |