Invent, benchmark, and analyze game-playing intelligence
We make it trivial to invent, iterate on, and analyze board-game rules and AI play across many board geometries, with fast, reproducible simulations. One engine and one pipeline take you from custom games to public benchmarks.
What you can do
- Create games: Define chess- or Go-like games with custom boards and pieces, or describe a game in natural language and get TOML rules back.
- Run simulations: Run large Monte Carlo batches to tune balance and discover interesting rule sets. The same seed always produces the same replay.
- Explore the space: AI agents generate rule variants, run simulations, and surface promising games for human review.
- Visualize and share: Preview legal moves and territory; share rules and replays.
- Benchmark models: See how code-generation models rank on our hidden game suite. Because the suite is hidden, models cannot overfit to it.
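The "same seed, same replay" property can be illustrated with a minimal sketch. The `simulate` function and its toy "game" (a list of random move indices) are hypothetical stand-ins, not the engine's actual API; the point is only that each run owns its random generator, so nothing outside the seed influences the result.

```python
import random

def simulate(seed, num_moves=20, branching=4):
    """Play a toy random game and return its move list (the 'replay').

    Hypothetical stand-in for a real simulation: each move is just an
    index in range(branching).
    """
    rng = random.Random(seed)  # per-run generator, no shared global state
    return [rng.randrange(branching) for _ in range(num_moves)]

# Two runs with the same seed yield identical replays.
replay_a = simulate(seed=42)
replay_b = simulate(seed=42)
assert replay_a == replay_b
```

Using a local `random.Random(seed)` instead of the module-level functions keeps parallel or interleaved runs from disturbing each other's random streams.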
Mission
Our foundations are a graph-first engine, deterministic replays, and confidence-aware scoring. We keep the core small and the periphery rich so that inventing, analyzing, and benchmarking game-playing intelligence stays reproducible and auditable.
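One way to read "graph-first" is that a board is just an adjacency map, so the same rule code works on any geometry. The sketch below assumes that reading; `square_grid`, `ring`, and `reachable` are hypothetical names chosen for illustration, not the engine's real interface.

```python
def square_grid(width, height):
    """Adjacency map for a width x height board with orthogonal neighbors."""
    graph = {}
    for x in range(width):
        for y in range(height):
            candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
            graph[(x, y)] = [
                (nx, ny) for nx, ny in candidates
                if 0 <= nx < width and 0 <= ny < height
            ]
    return graph

def ring(n):
    """Adjacency map for a circular track of n cells."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

def reachable(graph, start, max_steps):
    """Cells reachable from start in at most max_steps hops (breadth-first)."""
    seen = {start}
    frontier = {start}
    for _ in range(max_steps):
        frontier = {n for cell in frontier for n in graph[cell]} - seen
        seen |= frontier
    return seen

# The same movement rule runs unchanged on two different geometries.
print(sorted(reachable(square_grid(3, 3), (0, 0), 1)))
print(sorted(reachable(ring(6), 0, 1)))
```

Because `reachable` only looks at `graph[cell]`, adding a new board shape means writing a new graph builder, not new rule logic.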
Who it’s for
- Designers & researchers: Invent games, run batch simulations, and interpret the results of rule exploration.
- Model vendors: See how your models rank on our public code-generation benchmark.
- AI users: See which current models perform best on our benchmark suite; use dated snapshots for reproducibility.
Trust
Simulations are deterministic (same seed → same replay). The public benchmark uses a hidden game suite so the field can’t overfit, plus dated snapshots and coverage metrics so you know whether results are stable or provisional.
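To make the stable-versus-provisional distinction concrete, here is a minimal sketch that labels a win rate by the width of its confidence interval. The choice of the Wilson score interval, the 0.1 width threshold, and the function names are all assumptions for illustration; the benchmark's actual scoring method is not specified here.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win rate (z=1.96 is roughly 95%)."""
    if games == 0:
        return (0.0, 1.0)  # no data: the rate could be anything
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return (center - half, center + half)

def status(wins, games, max_width=0.1):
    """Label a result by interval width (threshold is an assumed value)."""
    low, high = wilson_interval(wins, games)
    return "stable" if high - low <= max_width else "provisional"

print(status(6, 10))       # few games, wide interval
print(status(600, 1000))   # many games, narrow interval
```

The same 60% win rate is reported differently depending on how many games back it, which is the behavior a confidence-aware leaderboard needs.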