AI Experiments

Riftrunner Cracks a Mate-in-2 Chess Puzzle: Outsmarts Claude 4.5

RiftRunner Team

November 18, 2025 • 5 min read

#riftrunner #LMArena #Gemini 3 #chess puzzle #Claude 4.5 #AI reasoning


This is a demanding task for an LLM: construct a mate-in-2 chess puzzle, reason like a chess player before writing any code, and avoid tactical traps. Riftrunner on LMArena delivered; Claude 4.5 stalled.

LMArena run showing Riftrunner’s reasoning path and final mate-in-2 construction.

Why this puzzle is hard for LLMs

Needs full-board reasoning and pruning of non-forcing lines.

Demands consistency between move generation and final FEN output.

Requires verifying “only line” mates, not just flashy checks.
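The "only line" requirement can be stated as a small search: a first move solves the puzzle only if every defence is answered by a mate, and the puzzle is sound only if exactly one such first move exists. The sketch below illustrates that logic on a hand-written move tree. The tree, the move names, and the helper functions are hypothetical illustrations, not the position or code from the LMArena run, and a real check would generate the tree with a legal-move generator.

```python
from typing import Dict, List

# Hypothetical move tree standing in for a real move generator.
# first move -> each defending reply -> the moves that mate in that line
# (an empty list means the defender has escaped).
Tree = Dict[str, Dict[str, List[str]]]


def forcing_keys(tree: Tree) -> List[str]:
    """Return the first moves after which every reply allows at least one mate."""
    keys = []
    for first_move, replies in tree.items():
        # No replies at all means stalemate or immediate mate, not a mate in 2.
        if replies and all(mates for mates in replies.values()):
            keys.append(first_move)
    return keys


def is_sound_mate_in_two(tree: Tree) -> bool:
    """A puzzle is sound when exactly one first move forces mate."""
    return len(forcing_keys(tree)) == 1


# Illustrative data only: one quiet key move and one tempting check that fails.
candidate = {
    "Qh6": {"Kg8": ["Qg7#"], "Rg8": ["Qxh7#"]},  # every defence is met by mate
    "Qf6+": {"Kg8": [], "Rg7": ["Qxg7#"]},       # a check, but Kg8 escapes
}

print(forcing_keys(candidate))          # ['Qh6']
print(is_sound_mate_in_two(candidate))  # True
```

The same structure explains why a flashy check is not enough: a single reply with no mating answer disqualifies the whole first move.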

What Riftrunner did differently

Plan-then-code: laid out the threat and the defensive resources before encoding moves.

Constraint checks: verified that there were no dual solutions and guarded against lines where the defender escapes through stalemate.

Clean notation: SAN and FEN stayed aligned, which simplified validation.
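Inconsistent board states are often visible in the FEN itself, before any move is checked. As a minimal sketch of that kind of validation, the code below expands the piece-placement field into an 8x8 grid and reports structural problems. The function names and the example positions are assumptions made for illustration, and the check covers structure only: it does not test whether a position is reachable or whether a move is legal.

```python
from typing import List


def parse_placement(fen: str) -> List[List[str]]:
    """Expand the piece-placement field of a FEN into an 8x8 grid ('.' = empty)."""
    ranks = fen.split()[0].split("/")
    if len(ranks) != 8:
        raise ValueError(f"expected 8 ranks, got {len(ranks)}")
    board = []
    for rank in ranks:
        row: List[str] = []
        for ch in rank:
            if ch in "12345678":
                row.extend("." * int(ch))  # a digit is a run of empty squares
            elif ch in "KQRBNPkqrbnp":
                row.append(ch)
            else:
                raise ValueError(f"unexpected character {ch!r}")
        if len(row) != 8:
            raise ValueError(f"rank {rank!r} does not describe 8 squares")
        board.append(row)
    return board


def fen_problems(fen: str) -> List[str]:
    """Return a list of structural problems; an empty list means none were found."""
    fields = fen.split()
    problems = []
    if len(fields) != 6:
        problems.append(f"expected 6 fields, got {len(fields)}")
    if len(fields) > 1 and fields[1] not in ("w", "b"):
        problems.append(f"side to move must be 'w' or 'b', got {fields[1]!r}")
    try:
        board = parse_placement(fen)
    except (ValueError, IndexError) as exc:
        return problems + [str(exc)]
    pieces = [piece for row in board for piece in row]
    for king in "Kk":
        if pieces.count(king) != 1:
            problems.append(f"expected exactly one {king!r}, found {pieces.count(king)}")
    return problems


# Illustrative positions only, not taken from the LMArena run.
good = "6k1/5ppp/8/8/8/8/5PPP/6K1 w - - 0 1"
bad = "6k1/5ppp/8/8/8/8/5PPP/7 w - - 0 1"  # last rank has 7 squares and no white king

print(fen_problems(good))  # []
print(fen_problems(bad))
```

A board that fails this check cannot be aligned with any SAN line, so it is a cheap first filter before a full legality check.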

Outcome vs Claude 4.5

Riftrunner: produced a legal mate-in-2 with forcing lines explained.

Claude 4.5: drifted into non-forcing checks and inconsistent board states.

Takeaway: in this run, Riftrunner looked better suited to structured reasoning on tight combinatorial tasks.

Tips to replicate

Ask for the plan first: threat, key squares, and opponent replies.

Request legality checks and a final FEN/SAN pair for validation.

Keep the prompt short: clarity beats verbosity for chess reasoning.
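Asking for a FEN/SAN pair pays off most when the reply is checked mechanically. The sketch below is one possible syntactic check, assuming the model returns its main line as a single SAN string such as "1. Qh6 Rg8 2. Qxh7#". The regular expression, the function name, and the example lines are illustrative assumptions. The check confirms only that each token looks like SAN, that the line has three plies, and that the last move is marked as mate; it does not confirm that the moves are legal in the position.

```python
import re
from typing import List

# Syntax-only SAN pattern: castling, or an optional piece letter, optional
# disambiguation, optional capture, target square, optional promotion,
# then an optional check (+) or mate (#) suffix.
SAN = re.compile(r"^(O-O(-O)?|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](=[QRBN])?)[+#]?$")


def check_solution_line(line: str) -> List[str]:
    """Return syntactic problems found in a mate-in-2 main line."""
    # Drop move numbers such as "1." or "1...".
    moves = [tok for tok in line.split() if not re.fullmatch(r"\d+\.+", tok)]
    problems = [f"not valid SAN syntax: {move}" for move in moves if not SAN.match(move)]
    if len(moves) != 3:
        problems.append(f"a mate-in-2 line has 3 plies, got {len(moves)}")
    if moves and not moves[-1].endswith("#"):
        problems.append("final move is not marked as mate (#)")
    return problems


# Illustrative lines only, not taken from the LMArena run.
print(check_solution_line("1. Qh6 Rg8 2. Qxh7#"))  # []
print(check_solution_line("1. Qh6+ Kg8 2. Qg7"))   # final move is not marked as mate
```

A reply that passes this filter still needs a legality check against the FEN, for example with a chess library or an engine.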