ImageBench
ImageBench V1 is live

Independent benchmarks for AI image generation

6 models evaluated on 192 prompts across 6 categories. Know which model is best — for your use case, your budget, your quality bar.

6 Models Evaluated
192 Evaluations per Model
6 Categories

V1 Leaderboard

192 prompts, 6 categories, graded pass/fail by VLM judges.

Full benchmark explorer
| # | Model | Pass Rate | Pass / Fail | Avg Latency |
|---|-------|-----------|-------------|-------------|
| 1 | fal-ai/nano-banana-2 | 94.8% | 182/10 | 28.1s |
| 2 | fal-ai/nano-banana-pro | 93.8% | 180/12 | 23.4s |
| 3 | bfl/flux-2-max | 91.7% | 176/16 | 26.7s |
| 4 | bfl/flux-2-pro | 82.3% | 158/34 | 11.8s |
| 5 | bfl/flux-2-klein-9b | 78.6% | 151/41 | 4.1s |
| 6 | bfl/flux-2-klein-4b | 74.0% | 142/50 | 3.8s |
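The leaderboard's pass rate is just the fraction of the 192 pass/fail verdicts a model passes. A minimal sketch of that math, using the V1 counts above (the `pass_rate` helper and the dictionary layout are illustrative assumptions, not ImageBench's actual code):

```python
def pass_rate(passes: int, fails: int) -> float:
    """Pass percentage over all graded prompts for one model."""
    return 100.0 * passes / (passes + fails)

# Pass/fail counts from the V1 leaderboard (192 evaluations per model).
results = {
    "fal-ai/nano-banana-2": (182, 10),
    "fal-ai/nano-banana-pro": (180, 12),
    "bfl/flux-2-max": (176, 16),
    "bfl/flux-2-pro": (158, 34),
    "bfl/flux-2-klein-9b": (151, 41),
    "bfl/flux-2-klein-4b": (142, 50),
}

# Rank models by pass rate, highest first.
for model, (p, f) in sorted(results.items(), key=lambda kv: -pass_rate(*kv[1])):
    print(f"{model}: {pass_rate(p, f):.1f}%")
```

Each pair of counts sums to 192, so the printed percentages match the leaderboard column.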

What we evaluate

Each model is tested across 6 categories with 192 prompts spanning easy to extreme difficulty.

Text Rendering
Typography accuracy and text correctness across difficulty levels
Spatial Reasoning
Compositionality, counting, relative position, scale & proportions
Human Realism
Faces, expressions, hands, full body, multi-subject coherence
Truthfulness
Physics, reflections, photorealism, world knowledge
Professional Studio
Camera & lighting, color precision, photorealistic quality
Graphical Design
Layout, data visualisation, style diversity

Start learning

Comprehensive guides on image generation evaluation — from metrics to methodology.

Browse guides

Frequently asked questions

See how every model performs

Compare models side-by-side with our interactive benchmark explorer.

Explore ImageBench V1