AI Experiments

Riftrunner Cracks a Mate-in-2 Chess Puzzle: Outsmarts Claude 4.5

RiftRunner Team
5 min read
#riftrunner #LMArena #Gemini 3 #chess puzzle #Claude 4.5 #AI reasoning




This was a demanding task for an LLM: construct a mate-in-2 chess puzzle, reason like a chess player before writing any code, and avoid tactical traps. Riftrunner on LMArena delivered; Claude 4.5 stalled.






LMArena run showing riftrunner’s reasoning path and final mate-in-2 construction.



Why this puzzle is hard for LLMs



  • Needs full-board reasoning and pruning of non-forcing lines.

  • Demands consistency between move generation and final FEN output.

  • Requires verifying that the mate is forced against every defence (an “only line” mate), not just finding flashy checks.
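
The forced-mate requirement can be checked mechanically. The sketch below is a minimal, engine-agnostic illustration in Python, not the method either model used. `mate_in_two_keys` and the three callables it takes are hypothetical names, and the toy position tree stands in for a real chess move generator. A first move counts as a key only if every legal reply still allows an immediate mate; if more than one move is returned, the puzzle does not have a unique solution.

```python
from typing import Callable, Iterable, List, TypeVar

P = TypeVar("P")  # position type (hypothetical; any representation works)
M = TypeVar("M")  # move type


def mate_in_two_keys(
    position: P,
    legal_moves: Callable[[P], Iterable[M]],
    apply_move: Callable[[P, M], P],
    is_checkmate: Callable[[P], bool],
) -> List[M]:
    """Return every first move that forces mate on move two against all replies."""

    def has_mating_move(pos: P) -> bool:
        # True if the side to move can deliver checkmate immediately.
        return any(is_checkmate(apply_move(pos, m)) for m in legal_moves(pos))

    keys: List[M] = []
    for key in legal_moves(position):
        after_key = apply_move(position, key)
        if is_checkmate(after_key):
            continue  # mate in 1, so not a mate-in-2 key
        replies = list(legal_moves(after_key))
        if not replies:
            continue  # no legal reply and no checkmate: stalemate, key fails
        # The key works only if EVERY defence still runs into a mating move.
        if all(has_mating_move(apply_move(after_key, r)) for r in replies):
            keys.append(key)
    return keys


# Toy game tree (assumption: stands in for real chess positions and moves).
TOY_TREE = {
    "start": {"quiet_key": "a", "flashy_check": "b"},
    "a": {"d1": "a1", "d2": "a2"},
    "a1": {"m1": "mate1"},
    "a2": {"m2": "mate2"},
    "b": {"e1": "b1", "e2": "b2"},
    "b1": {"m3": "mate3"},
    "b2": {"x": "quiet"},  # this defence leaves no mating move
}
TOY_MATES = {"mate1", "mate2", "mate3"}


def toy_moves(pos: str) -> List[str]:
    return list(TOY_TREE.get(pos, {}))


def toy_apply(pos: str, move: str) -> str:
    return TOY_TREE[pos][move]


def toy_is_mate(pos: str) -> bool:
    return pos in TOY_MATES


keys = mate_in_two_keys("start", toy_moves, toy_apply, toy_is_mate)
print(keys)
```

In the toy tree, `flashy_check` is rejected because one defence leaves no mating move, so only `quiet_key` is returned. With a real move generator supplying the three callables, the same loop would apply to an actual board position.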



What riftrunner did differently



  • Plan-then-code: laid out the threat and the defensive resources before encoding any moves.

  • Constraint checks: verified that there were no dual solutions and that the defender could not escape through stalemate.

  • Clean notation: the SAN moves and the FEN position stayed consistent, which simplified validation.



Outcome vs Claude 4.5



  • Riftrunner: produced a legal mate-in-2 with forcing lines explained.

  • Claude 4.5: drifted into non-forcing checks and inconsistent board states.

  • Takeaway: riftrunner feels tuned for structured reasoning on tight combinatorial tasks.



Tips to replicate



  • Ask for the plan first: threat, key squares, and opponent replies.

  • Request legality checks and a final FEN/SAN pair for validation.

  • Keep the prompt short: clarity beats verbosity for chess reasoning.
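
When a model returns a final FEN, a cheap structural check can catch inconsistent board states before any deeper validation. The following is a minimal sketch using only the Python standard library; `fen_problems` is a hypothetical helper name. It checks the shape of the string only (six fields, eight ranks of eight squares, one king per side, a valid side to move) and does not test move legality, which would need a full move generator.

```python
from typing import List


def fen_problems(fen: str) -> List[str]:
    """Return structural problems found in a FEN string (empty list if none)."""
    fields = fen.split()
    if len(fields) != 6:
        return [f"expected 6 space-separated fields, got {len(fields)}"]

    problems: List[str] = []
    placement, side = fields[0], fields[1]

    ranks = placement.split("/")
    if len(ranks) != 8:
        problems.append(f"expected 8 ranks, got {len(ranks)}")

    for i, rank in enumerate(ranks, start=1):
        width = 0
        for ch in rank:
            if ch in "12345678":
                width += int(ch)  # a digit encodes that many empty squares
            elif ch in "pnbrqkPNBRQK":
                width += 1  # a letter encodes one piece
            else:
                problems.append(f"rank field {i}: unexpected character {ch!r}")
        if width != 8:
            problems.append(f"rank field {i}: covers {width} squares, expected 8")

    for king, colour in (("K", "white"), ("k", "black")):
        count = placement.count(king)
        if count != 1:
            problems.append(f"expected exactly 1 {colour} king, found {count}")

    if side not in ("w", "b"):
        problems.append(f"side to move must be 'w' or 'b', got {side!r}")

    return problems


START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
BROKEN_FEN = "8/8/8/8/8/8/8/7 w - - 0 1"  # short last rank, no kings

print(fen_problems(START_FEN))
print(fen_problems(BROKEN_FEN))
```

An empty list means only that the string is well-formed; the position can still be illegal or fail to be a mate-in-2, so this is a first filter rather than a full validator.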