AI Experiments
Riftrunner Cracks a Mate-in-2 Chess Puzzle: Outsmarts Claude 4.5
RiftRunner Team
5 min read · #riftrunner #LMArena #Gemini 3 #chess puzzle #Claude 4.5 #AI reasoning
This is an unusually hard task for an LLM: compose a mate-in-2 chess puzzle, reason like a chess player before writing any code, and avoid tactical traps along the way. Riftrunner on LMArena delivered; Claude 4.5 stalled.
LMArena run showing riftrunner’s reasoning path and final mate-in-2 construction.
Why this puzzle is hard for LLMs
- Needs full-board reasoning and pruning of non-forcing lines.
- Demands consistency between move generation and final FEN output.
- Requires verifying “only line” mates, not just flashy checks.
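The "only line" requirement can be written as a small search: a first move counts as a key only if every legal reply allows an immediate mate, and the puzzle is sound only if exactly one such key exists. The sketch below assumes the position has already been expanded into a hand-built move tree. The `TREE` data, the move labels, and the helper names are hypothetical and do not come from the LMArena run; a real checker would generate legal moves with a chess library or engine.

```python
from typing import Dict, List

# Hypothetical toy move tree: first move -> Black reply -> White moves that mate.
# An empty list means that reply escapes mate on move two.
MoveTree = Dict[str, Dict[str, List[str]]]

TREE: MoveTree = {
    "Qg6": {"Kh8": ["Qh7#"], "Kf8": ["Qf7#"]},  # every reply is met by mate
    "Qe6+": {"Kh8": [], "Kf8": ["Qf7#"]},       # flashy check, but Kh8 escapes
}

def is_key(replies: Dict[str, List[str]]) -> bool:
    """A first move is a key if Black has replies and each one allows mate.

    No replies at all means the first move already ended the game
    (mate in one or stalemate), so it is not a mate-in-2 key.
    """
    return bool(replies) and all(mates for mates in replies.values())

def find_keys(tree: MoveTree) -> List[str]:
    """Return every first move that forces mate on move two."""
    return [move for move, replies in tree.items() if is_key(replies)]

def is_sound_mate_in_two(tree: MoveTree) -> bool:
    """Sound means exactly one key: zero is no solution, two or more is ambiguous."""
    return len(find_keys(tree)) == 1
```

Here `find_keys(TREE)` keeps only `Qg6`, because the checking move `Qe6+` leaves one reply with no mating answer.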
What riftrunner did differently
- Plan-then-code: laid out threat + defensive resources before encoding moves.
- Constraint checks: verified no dual solutions and guarded against lines that end in stalemate instead of mate.
- Clean notation: SAN + FEN stayed aligned, simplifying validation.
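A cheap first step when validating a model's final position is a structural check on the FEN string, which catches inconsistent board states before any legality checking. This is a minimal sketch using only the standard library. The function name `check_fen_board` is hypothetical, and the check covers layout only: it does not confirm that the position is legal or that the SAN moves match it.

```python
from typing import List

def check_fen_board(fen: str) -> List[str]:
    """Return a list of structural problems in a FEN string (empty if none)."""
    fields = fen.split()
    if len(fields) != 6:
        return [f"expected 6 space-separated fields, got {len(fields)}"]

    problems: List[str] = []
    board, side = fields[0], fields[1]
    ranks = board.split("/")
    if len(ranks) != 8:
        problems.append(f"expected 8 ranks, got {len(ranks)}")

    # FEN lists ranks from 8 down to 1; each must describe exactly 8 squares.
    for i, rank in enumerate(ranks):
        width = 0
        for ch in rank:
            if ch in "12345678":
                width += int(ch)          # run of empty squares
            elif ch in "pnbrqkPNBRQK":
                width += 1                # one piece
            else:
                problems.append(f"rank {8 - i}: unexpected character {ch!r}")
        if width != 8:
            problems.append(f"rank {8 - i}: describes {width} squares, not 8")

    if board.count("K") != 1 or board.count("k") != 1:
        problems.append("each side needs exactly one king")
    if side not in ("w", "b"):
        problems.append(f"side to move must be 'w' or 'b', got {side!r}")
    return problems
```

An empty result only means the string is well-formed; legality and the SAN solution still need a separate check.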
Outcome vs Claude 4.5
- Riftrunner: produced a legal mate-in-2 with forcing lines explained.
- Claude 4.5: drifted into non-forcing checks and inconsistent board states.
- Takeaway: in this run, riftrunner looked better suited to structured reasoning on tightly constrained combinatorial tasks.
Tips to replicate
- Ask for the plan first: threat, key squares, and opponent replies.
- Request legality checks and a final FEN/SAN pair for validation.
- Keep the prompt short: clarity beats verbosity for chess reasoning.
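The tips above can be folded into a short, reusable prompt. The sketch below is one possible wording, not the prompt used in the LMArena run; the function name `build_puzzle_prompt` and its `side_to_move` parameter are hypothetical.

```python
def build_puzzle_prompt(side_to_move: str = "White") -> str:
    """Build a short prompt: plan first, then legality checks, then FEN + SAN."""
    steps = [
        "Plan first: state the threat, the key squares, and the opponent's replies.",
        "Check that every move in the solution is legal.",
        "Finish with the final position as FEN and the solution in SAN.",
    ]
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, 1))
    return f"Compose a mate-in-2 chess puzzle with {side_to_move} to move.\n{numbered}"

if __name__ == "__main__":
    print(build_puzzle_prompt())
```

Keeping the instructions as three numbered steps keeps the prompt short while still asking for an output that can be validated afterwards.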