Compare

Pick 2–4 models. Per-task scores below.

openai/gpt-5.5

Pick at least two models to compare

Add a model slug to the input above, or jump straight in by appending a query string to the URL, for example: ?models=sonnet-4-7,gpt-5
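
The URL shortcut can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual implementation: the `models` parameter name, the comma-separated slugs, and the 2 to 4 model limit come from the text above, while the function names `build_compare_query` and `parse_compare_query` are hypothetical, and how the site itself validates or encodes slugs is an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Limits taken from the page text ("Pick 2-4 models"); enforcing them
# client-side like this is an assumption about how the page behaves.
MIN_MODELS, MAX_MODELS = 2, 4


def build_compare_query(slugs):
    """Return a query string such as '?models=a,b' for 2 to 4 model slugs."""
    if not MIN_MODELS <= len(slugs) <= MAX_MODELS:
        raise ValueError(
            f"expected {MIN_MODELS} to {MAX_MODELS} models, got {len(slugs)}"
        )
    # safe="," keeps the separating commas literal, matching the example URL.
    return "?" + urlencode({"models": ",".join(slugs)}, safe=",")


def parse_compare_query(url):
    """Extract the list of model slugs from a URL or bare query string."""
    query = urlsplit(url).query
    raw = parse_qs(query).get("models", [""])[0]
    # Drop empty entries left by stray or trailing commas.
    return [slug for slug in raw.split(",") if slug]


if __name__ == "__main__":
    query = build_compare_query(["sonnet-4-7", "gpt-5"])
    print(query)
    print(parse_compare_query(query))
```

Keeping the builder and the parser as a pair makes it easy to check that a link produced for sharing round-trips to the same list of slugs.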