
Coding Arena: The Glass

task vglass-fill-1.0.0

One coding prompt, given to every model word for word. Each card runs that model's answer, live and sandboxed. No edits, no fixes. What it shipped is what you click.

The task

Build a 2D game in a single self-contained HTML file (HTML + CSS + JS, no external libraries, no network). On screen is an empty glass. Each click pours water in; the glass fills up over repeated clicks, with a visible animated water level. Include a way to empty/reset it. Keep it to one file that runs by just opening it in a browser.

15 models

Reference

house
time: n/a
code: 193 lines

The target, written as the spec for this task, not a model entry. It sets the bar the models are measured against.

gpt-5.4-nano

openai
tokens: 8,349 (157→8,192)
cost: $0.0033 ($0.4/M)
time: 52.1s
code: 823 lines

Auto-recorded. Grade chips by hand.

0 votes

gpt-5-nano

openai
tokens: 3,391 (157→3,234)
cost: $0.0014 ($0.4/M)
time: 25.4s
code: 218 lines

Auto-recorded. Grade chips by hand.

0 votes

gpt-4o

openai
tokens: 663 (170→493)
cost: $0.0033 ($5/M)
time: 3.7s
code: 79 lines

Auto-recorded. Grade chips by hand.

0 votes

claude-opus-4-7

anthropic
tokens: 6,779 (231→6,548)
cost: $0.1017 ($15/M)
time: 70.6s
code: 468 lines

Auto-recorded. Grade chips by hand.

0 votes

GLM-5.2

zai
tokens: 3,565 (160→3,405)
cost: $0.0036 ($1/M)
time: 50.3s
code: 386 lines
0 votes

gpt-5.5

openai
tokens: 4,334 (157→4,177)
cost: $0.0433 ($10/M)
time: 41.7s
code: 520 lines
0 votes

claude-opus-4-6

anthropic
tokens: 5,522 (165→5,357)
cost: $0.0055 ($1/M)
time: 60.5s
code: 568 lines
0 votes

GLM-4.6

zai
tokens: 4,362 (153→4,209)
cost: $0.0044 ($1/M)
time: 59.3s
code: 405 lines
0 votes

GLM-4.5

zai
tokens: 2,720 (154→2,566)
cost: $0.0027 ($1/M)
time: 46.8s
code: 163 lines
0 votes

gpt-5-mini

openai
tokens: 7,270 (157→7,113)
cost: $0.0073 ($1/M)
time: 108.2s
code: 665 lines
0 votes

gpt-5.2

openai
tokens: 5,591 (157→5,434)
cost: $0.0056 ($1/M)
time: 53.6s
code: 630 lines
0 votes

claude-opus-4-8

anthropic
tokens: 3,804 (226→3,578)
cost: $0.0571 ($15/M)
time: 34.1s
code: 230 lines
0 votes

claude-sonnet-4-6

anthropic
tokens: 5,470 (165→5,305)
cost: $0.0055 ($1/M)
time: 66.6s
code: 483 lines
0 votes

claude-haiku-4-5

anthropic
tokens: 2,497 (164→2,333)
cost: $0.0025 ($1/M)
time: 11.7s
code: 282 lines
0 votes

Each game runs in a sandboxed iframe (scripts only, no network, no access to this page). The numbers were measured when the game was recorded: tokens (input→output), cost (usage × the fixed price table), time (how long the model took), and code (raw line count).
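
The cost figure can be sanity-checked from the other numbers on a card. The sketch below assumes a single flat rate per million tokens applied to input plus output, which is what the figures on the cards appear to follow; the real price table may be structured differently, and the function name run_cost is made up for illustration.

```python
def run_cost(input_tokens: int, output_tokens: int, price_per_million: float) -> float:
    """Dollar cost of one run, ASSUMING one flat rate on input + output tokens."""
    total_tokens = input_tokens + output_tokens
    return total_tokens * price_per_million / 1_000_000


# gpt-4o card: 170 in, 493 out, $5/M
print(f"${run_cost(170, 493, 5.0):.4f}")      # $0.0033

# claude-opus-4-7 card: 231 in, 6,548 out, $15/M
print(f"${run_cost(231, 6548, 15.0):.4f}")    # $0.1017
```

Both results match the cost shown on the corresponding card, so the flat-rate reading is at least consistent with the data here.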