GPT
Trajectory
Concept trajectory
No baseline to compare against
This is the family's first analyzed generation. Once a second member has been benchmarked and analyzed, this section will show the per-concept delta between the two (resolved / persisting / regressed / new).
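As a rough illustration of what such a per-concept delta could look like, the sketch below labels each concept by comparing failure counts between a baseline and a newer generation. The page does not define the four labels, so the rules here are assumptions: the function name `concept_delta`, the concept names, and the counts are all hypothetical.

```python
from typing import Dict


def concept_delta(baseline: Dict[str, int], current: Dict[str, int]) -> Dict[str, str]:
    """Label each concept by how its failure count changed between two generations.

    Assumed rules (not taken from the page):
      resolved   - failed in the baseline, no failures now
      new        - no failures in the baseline, fails now
      regressed  - failed in both, and fails more often now
      persisting - failed in both, at the same or a lower rate
    """
    labels: Dict[str, str] = {}
    for concept in sorted(set(baseline) | set(current)):
        before = baseline.get(concept, 0)
        after = current.get(concept, 0)
        if before == 0 and after == 0:
            continue  # never failed in either generation, nothing to report
        if after == 0:
            labels[concept] = "resolved"
        elif before == 0:
            labels[concept] = "new"
        elif after > before:
            labels[concept] = "regressed"
        else:
            labels[concept] = "persisting"
    return labels


# Hypothetical failure counts per concept for two generations.
baseline = {"off-by-one": 3, "date parsing": 2, "null handling": 1}
current = {"date parsing": 2, "null handling": 4, "unicode": 1}

for concept, label in concept_delta(baseline, current).items():
    print(f"{concept}: {label}")
```

With only one analyzed generation there is no `baseline` to pass in, which is why the section currently has nothing to show.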
Members
| Model | Generation | Pass@N | Avg cost / task | Runs | Last run |
|---|---|---|---|---|---|
| | 4 | — | — | 0 | — |
| | 5 | — | — | 0 | — |
| | 5 | 84.5% | $0.38 | 3 | 1mo ago |
| | 5 | 71.9% | $0.05 | 3 | 1mo ago |