Crucible: compare LLMs by how they build
Illustrative sample data. Fictional model archetypes, hand-scored to demo the tool. Bring your own real outputs in the tab.

Blind Arena

Two anonymized answers to the same real task. Pick the one you'd ship — no logos, no hype. After enough votes, Crucible reveals the models and builds your ranking.

0 votes

Keyboard: 1 left wins · 2 right wins · T tie · S skip
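
How a ranking might be built from blind votes: the page does not say which method Crucible uses, so the sketch below assumes a simple Elo-style update purely for illustration. The function names (`expected`, `rank`), the constant `K`, the starting rating of 1000, and the model labels are all hypothetical. Each vote is a `(left_model, right_model, outcome)` tuple whose outcome mirrors the keyboard choices: left wins, right wins, tie, or skip.

```python
from collections import defaultdict

K = 32  # hypothetical step size: how far one vote can move a rating


def expected(ra, rb):
    """Elo-style probability that the model rated ra beats the model rated rb."""
    return 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))


def rank(votes, base=1000.0):
    """Turn blind pairwise votes into a ranking, best model first.

    votes: iterable of (left_model, right_model, outcome), where outcome
    is "left", "right", "tie", or "skip". Skipped votes change nothing.
    """
    ratings = defaultdict(lambda: base)
    for left, right, outcome in votes:
        if outcome == "skip":
            continue
        # Score from the left model's point of view.
        score = {"left": 1.0, "right": 0.0, "tie": 0.5}[outcome]
        ra, rb = ratings[left], ratings[right]
        e = expected(ra, rb)
        # Zero-sum update: whatever the left model gains, the right one loses.
        ratings[left] = ra + K * (score - e)
        ratings[right] = rb - K * (score - e)
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    demo_votes = [
        ("model-a", "model-b", "left"),
        ("model-a", "model-c", "left"),
        ("model-b", "model-c", "tie"),
        ("model-b", "model-c", "skip"),
    ]
    for name, rating in rank(demo_votes):
        print(f"{name}: {rating:.1f}")
```

Because ratings are only updated after the vote is recorded, the voter never needs to see model names while choosing, which fits the blind setup described above. A tie counts as half a win for each side, and a skip leaves both ratings untouched.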