Run beb015b1-d92…
completed 2c09af0e-90e1-4d79-8f05-bb867284cf1e
Run success rate
Tasks the run solved on its last attempt / tasks attempted in this run.
Formula: COUNT(distinct tasks where last attempt passed) / COUNT(distinct tasks attempted in this run)
Per-run metric for the model's "final answer" on each task. Differs from leaderboard pass_at_n: this denominator is the run's own attempted-task count, not the task set size, so partial runs are not penalised for unattempted tasks.
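The formula above can be sketched in Python as follows. This is a minimal illustration, not the dashboard's actual implementation: the `AttemptResult` record, its field names, and the boolean `passed` flag are assumptions made for the example, and the task IDs in the sample are hypothetical.

```python
from typing import Iterable, NamedTuple


class AttemptResult(NamedTuple):
    """One row of attempt results (hypothetical shape, assumed for this sketch)."""
    task_id: str
    attempt: int   # 1-based attempt number within the run
    passed: bool   # assumed: True when the attempt counts as a pass


def run_success_rate(results: Iterable[AttemptResult]) -> float:
    """Distinct tasks whose last attempt passed / distinct tasks attempted."""
    last_attempt = {}
    for row in results:
        prev = last_attempt.get(row.task_id)
        # Keep only the highest-numbered attempt for each task.
        if prev is None or row.attempt > prev.attempt:
            last_attempt[row.task_id] = row
    if not last_attempt:
        return 0.0  # assumption: an empty run reports 0 rather than raising
    solved = sum(1 for row in last_attempt.values() if row.passed)
    return solved / len(last_attempt)


# Hypothetical sample: T1 passes first time, T2 passes on retry, T3 never passes.
sample = [
    AttemptResult("T1", 1, True),
    AttemptResult("T2", 1, False),
    AttemptResult("T2", 2, True),
    AttemptResult("T3", 1, False),
    AttemptResult("T3", 2, False),
]
rate = run_success_rate(sample)
```

Because the denominator is the number of distinct tasks seen in the results, a task that was never attempted simply does not appear and cannot lower the rate.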
Avg attempt score
Mean per-attempt score on a 0–100 point scale (partial credit). Drill-down only.
Formula: Mean of attempt scores across all results rows: SUM(score) / COUNT(*) over the results table. Each attempt earns 0–100 points based on compile + test outcomes.
Drill-down companion to pass_at_n. Rewards partial credit, but is not directly comparable to pass rate; use it for within-model analysis.
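As a rough sketch of the SUM(score) / COUNT(*) formula above, the mean can be computed over every attempt row rather than one row per task. The function name, the empty-input behaviour, and the sample scores are assumptions for illustration only; how each 0–100 score is derived from compile and test outcomes is not modelled here.

```python
from typing import Sequence


def avg_attempt_score(scores: Sequence[float]) -> float:
    """Mean of per-attempt scores, each expected on the 0-100 scale."""
    if not scores:
        return 0.0  # assumption: no rows yields 0 rather than an error
    for score in scores:
        if not 0.0 <= score <= 100.0:
            raise ValueError(f"score out of range: {score}")
    # Equivalent to SUM(score) / COUNT(*) over all attempt rows.
    return sum(scores) / len(scores)


# Hypothetical scores for four attempts (not taken from any real run).
sample_scores = [100.0, 0.0, 62.5, 100.0]
avg = avg_attempt_score(sample_scores)
```

Every attempt contributes one row, so a task that needed two attempts is counted twice here, which is one reason this figure is not comparable to a pass rate.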
| Task | Difficulty | Attempt | Score | Tests | Compile | Duration | |
|---|---|---|---|---|---|---|---|
| CG-AL-E001 | easy | 1 | 100.0 / 100 | 7/7 | OK | 15.5s | |
| CG-AL-E002 | easy | 1 | 100.0 / 100 | 6/6 | OK | 2m 35s | |
| CG-AL-E003 | easy | 1 | 100.0 / 100 | 5/5 | OK | 19.2s | |
| CG-AL-E004 | easy | 1 | 100.0 / 100 | 6/6 | OK | 24.3s | |
| CG-AL-E005 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 29.8s | |
| CG-AL-E006 | easy | 2 | 100.0 / 100 | 7/7 | OK | 2m 42s | |
| CG-AL-E007 | easy | 2 | 100.0 / 100 | 7/7 | OK | 32.4s | |
| CG-AL-E008 | easy | 1 | 100.0 / 100 | 6/6 | OK | 41.8s | |
| CG-AL-E009 | easy | 1 | 100.0 / 100 | 5/5 | OK | 43.7s | |
| CG-AL-E010 | easy | 1 | 100.0 / 100 | 5/5 | OK | 48.7s | |
| CG-AL-E031 | easy | 1 | 100.0 / 100 | 3/3 | OK | 50.9s | |
| CG-AL-E032 | easy | 1 | 100.0 / 100 | 1/1 | OK | 48.4s | |
| CG-AL-E045 | easy | 1 | 100.0 / 100 | 4/4 | OK | 14.9s | |
| CG-AL-E050 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 8.2s | |
| CG-AL-E051 | easy | 2 | 62.5 / 100 | 3/15 | OK | 19.1s | |
| CG-AL-E052 | easy | 2 | 100.0 / 100 | 16/16 | OK | 15.6s | |
| CG-AL-E053 | easy | 2 | 100.0 / 100 | 3/3 | OK | 2m 18s | |
| CG-AL-E054 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 6.3s | |
| CG-AL-E055 | easy | 1 | 100.0 / 100 | 8/8 | OK | 14.8s | |
| CG-AL-E056 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 7.5s | |
| CG-AL-E057 | easy | 2 | 0.0 / 100 | 0/0 | FAIL | 6.6s | |
| CG-AL-E058 | easy | 2 | 100.0 / 100 | 1/1 | OK | 14.4s | |
| CG-AL-H001 | hard | 1 | 100.0 / 100 | 25/25 | OK | 17.8s | |
| CG-AL-H002 | hard | 1 | 100.0 / 100 | 4/4 | OK | 15.7s | |
| CG-AL-H003 | hard | 1 | 100.0 / 100 | 5/5 | OK | 16.6s | |
| CG-AL-H004 | hard | 1 | 100.0 / 100 | 14/14 | OK | 16.3s | |
| CG-AL-H005 | hard | 2 | 100.0 / 100 | 6/6 | OK | 16.8s | |
| CG-AL-H006 | hard | 1 | 100.0 / 100 | 6/6 | OK | 14.7s | |
| CG-AL-H007 | hard | 2 | 100.0 / 100 | 10/10 | OK | 15.6s | |
| CG-AL-H008 | hard | 2 | 0.0 / 100 | 0/0 | FAIL | 9.0s | |
| CG-AL-H009 | hard | 1 | 100.0 / 100 | 11/11 | OK | 16.7s | |
| CG-AL-H010 | hard | 1 | 100.0 / 100 | 8/8 | OK | 15.2s | |
| CG-AL-H011 | hard | 2 | 0.0 / 100 | 0/0 | FAIL | 5.9s | |
| CG-AL-H013 | hard | 1 | 100.0 / 100 | 9/9 | OK | 14.9s | |
| CG-AL-H014 | hard | 2 | 0.0 / 100 | 0/0 | FAIL | 7.5s | |
| CG-AL-H015 | hard | 1 | 100.0 / 100 | 4/4 | OK | 14.6s | |
| CG-AL-H016 | hard | 2 | 0.0 / 100 | 0/0 | FAIL | 7.2s | |
| CG-AL-H017 | hard | 2 | 0.0 / 100 | 0/0 | FAIL | 6.2s | |
| CG-AL-H018 | hard | 2 | 100.0 / 100 | 6/6 | OK | 15.9s | |
| CG-AL-H019 | hard | 2 | 100.0 / 100 | 5/5 | OK | 14.2s | |
| CG-AL-H020 | hard | 2 | 100.0 / 100 | 10/10 | OK | 17.7s | |
| CG-AL-H021 | hard | 2 | 0.0 / 100 | 0/0 | FAIL | 13.1s | |
| CG-AL-H022 | hard | 2 | 62.5 / 100 | 17/21 | OK | 17.6s | |
| CG-AL-H023 | hard | 2 | 0.0 / 100 | 0/0 | FAIL | 1m 32s | |
| CG-AL-H024 | hard | 1 | 100.0 / 100 | 9/9 | OK | 18.6s | |
| CG-AL-H025 | hard | 1 | 100.0 / 100 | 7/7 | OK | 35.4s | |
| CG-AL-H026 | hard | 1 | 100.0 / 100 | 8/8 | OK | 14.7s | |
| CG-AL-H027 | hard | 2 | 62.5 / 100 | 2/4 | OK | 15.7s | |
| CG-AL-H028 | hard | 1 | 100.0 / 100 | 9/9 | OK | 16.6s | |
| CG-AL-H029 | hard | 2 | 62.5 / 100 | 16/18 | OK | 19.6s | |
| CG-AL-H030 | hard | 2 | 62.5 / 100 | 9/10 | OK | 16.4s | |
| CG-AL-H031 | hard | 2 | 0.0 / 100 | 0/0 | FAIL | 14.7s | |
| CG-AL-H032 | hard | 2 | 100.0 / 100 | 19/19 | OK | 19.8s | |
| CG-AL-H033 | hard | 1 | 100.0 / 100 | 5/5 | OK | 2m 25s | |
| CG-AL-H034 | hard | 2 | 100.0 / 100 | 3/3 | OK | 13.6s | |
| CG-AL-H035 | hard | 1 | 100.0 / 100 | 3/3 | OK | 15.3s | |
| CG-AL-H036 | hard | 2 | 0.0 / 100 | 0/0 | FAIL | 6.7s | |
| CG-AL-H037 | hard | 2 | 0.0 / 100 | 0/0 | FAIL | 6.5s | |
| CG-AL-H038 | hard | 2 | 0.0 / 100 | 0/0 | FAIL | 7.3s | |
| CG-AL-H039 | hard | 1 | 100.0 / 100 | 4/4 | OK | 14.7s | |
| CG-AL-H040 | hard | 1 | 100.0 / 100 | 2/2 | OK | 14.2s | |
| CG-AL-H041 | hard | 2 | 0.0 / 100 | 0/0 | FAIL | 9.6s | |
| CG-AL-H042 | hard | 2 | 0.0 / 100 | 0/0 | FAIL | 6.0s | |
| CG-AL-H043 | hard | 1 | 100.0 / 100 | 5/5 | OK | 15.6s | |
| CG-AL-H050 | hard | 2 | 62.5 / 100 | 2/3 | OK | 17.4s | |
| CG-AL-H051 | hard | 1 | 100.0 / 100 | 4/4 | OK | 14.5s | |
| CG-AL-H052 | hard | 1 | 100.0 / 100 | 5/5 | OK | 17.0s | |
| CG-AL-H053 | hard | 2 | 100.0 / 100 | 4/4 | OK | 17.6s | |
| CG-AL-H054 | hard | 2 | 100.0 / 100 | 5/5 | OK | 19.2s | |
| CG-AL-H056 | hard | 2 | 0.0 / 100 | 0/0 | FAIL | 6.1s | |
| CG-AL-H057 | hard | 1 | 100.0 / 100 | 4/4 | OK | 2m 26s | |
| CG-AL-H058 | hard | 1 | 100.0 / 100 | 5/5 | OK | 14.6s | |
| CG-AL-H205 | hard | 2 | 100.0 / 100 | 6/6 | OK | 1m 19s | |
| CG-AL-M001 | medium | 2 | 0.0 / 100 | 0/0 | FAIL | 7.4s | |
| CG-AL-M002 | medium | 2 | 62.5 / 100 | 21/22 | OK | 18.7s | |
| CG-AL-M003 | medium | 2 | 0.0 / 100 | 0/0 | FAIL | 8.9s | |
| CG-AL-M004 | medium | 1 | 100.0 / 100 | 12/12 | OK | 2m 38s | |
| CG-AL-M005 | medium | 2 | 0.0 / 100 | 0/0 | FAIL | 15.7s | |
| CG-AL-M006 | medium | 1 | 100.0 / 100 | 18/18 | OK | 20.4s | |
| CG-AL-M007 | medium | 2 | 0.0 / 100 | 0/0 | FAIL | 16.7s | |
| CG-AL-M008 | medium | 2 | 0.0 / 100 | 0/0 | FAIL | 1m 25s | |
| CG-AL-M009 | medium | 1 | 100.0 / 100 | 11/11 | OK | 31.0s | |
| CG-AL-M010 | medium | 2 | 100.0 / 100 | 21/21 | OK | 3m 15s | |
| CG-AL-M020 | medium | 2 | 0.0 / 100 | 0/0 | FAIL | 7.9s | |
| CG-AL-M021 | medium | 2 | 0.0 / 100 | 0/0 | FAIL | 10.8s | |
| CG-AL-M022 | medium | 2 | 100.0 / 100 | 9/9 | OK | 17.4s | |
| CG-AL-M023 | medium | 2 | 62.5 / 100 | 10/11 | OK | 16.3s | |
| CG-AL-M024 | medium | 2 | 100.0 / 100 | 10/10 | OK | 19.6s | |
| CG-AL-M025 | medium | 2 | 62.5 / 100 | 2/7 | OK | 15.1s | |
| CG-AL-M026 | medium | 1 | 100.0 / 100 | 8/8 | OK | 15.7s | |
| CG-AL-M027 | medium | 2 | 0.0 / 100 | 0/0 | FAIL | 21.7s | |
| CG-AL-M028 | medium | 2 | 100.0 / 100 | 3/3 | OK | 2m 25s | |
| CG-AL-M029 | medium | 2 | 100.0 / 100 | 3/3 | OK | 2m 18s | |
| CG-AL-M031 | medium | 2 | 0.0 / 100 | 0/0 | FAIL | 8.3s | |
| CG-AL-M032 | medium | 2 | 100.0 / 100 | 3/3 | OK | 15.5s | |
| CG-AL-M033 | medium | 2 | 100.0 / 100 | 2/2 | OK | 14.6s | |
| CG-AL-M034 | medium | 2 | 62.5 / 100 | 0/2 | OK | 13.7s | |
| CG-AL-M035 | medium | 1 | 100.0 / 100 | 2/2 | OK | 14.2s | |
| CG-AL-M036 | medium | 2 | 0.0 / 100 | 0/0 | FAIL | 10.8s | |
| CG-AL-M037 | medium | 2 | 0.0 / 100 | 0/0 | FAIL | 6.2s | |
| CG-AL-M038 | medium | 1 | 100.0 / 100 | 2/2 | OK | 16.5s | |
| CG-AL-M039 | medium | 1 | 100.0 / 100 | 2/2 | OK | 2m 50s | |
| CG-AL-M040 | medium | 2 | 0.0 / 100 | 0/0 | FAIL | 6.0s | |
| CG-AL-M041 | medium | 2 | 0.0 / 100 | 0/0 | FAIL | 5.7s | |
| CG-AL-M042 | medium | 2 | 100.0 / 100 | 8/8 | OK | 21.9s | |
| CG-AL-M043 | medium | 2 | 62.5 / 100 | 0/5 | OK | 14.9s | |
| CG-AL-M044 | medium | 1 | 100.0 / 100 | 6/6 | OK | 2m 29s | |
| CG-AL-M045 | medium | 2 | 100.0 / 100 | 6/6 | OK | 14.9s | |
| CG-AL-M088 | medium | 1 | 100.0 / 100 | 6/6 | OK | 18.5s | |
| CG-AL-M112 | medium | 2 | 100.0 / 100 | 4/4 | OK | 16.5s | |