Explainable Reinforcement Learning in Board Games
AI-powered coaching for Chess, Checkers, Go, Othello, and Sudoku, with natural language explanations generated through DSPy.
GameSage is a system in which an AI plays board games at a human-competitive level, explains its strategic reasoning in natural language, and gives human players real-time coaching and feedback. In doing so, it aims to make reinforcement learning interpretable and informative in the context of board games. Our goal is to bridge the gap between powerful game-playing AI systems and human understanding by making AI decisions transparent and educational rather than opaque.
Built on Stanford's DSPy framework, GameSage uses structured prompt optimization to deliver skill-level-appropriate coaching across five board games. The system features a modular adapter architecture that separates game logic from AI reasoning: the game engine always validates moves, while the LLM focuses purely on strategic analysis and natural language explanation. This separation ensures correctness while maximizing explainability.
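The separation between game logic and AI reasoning can be pictured with a minimal sketch. The source names a GameAdapter abstract base class in core/adapter.py; the method names below (legal_moves, apply_move, serialize, is_legal) are assumptions made for illustration, and the tic-tac-toe engine is a toy stand-in rather than one of the five supported games.

```python
from abc import ABC, abstractmethod


class GameAdapter(ABC):
    """Hypothetical adapter interface; method names are assumptions."""

    @abstractmethod
    def legal_moves(self) -> list[str]:
        """Return every move the engine currently accepts."""

    @abstractmethod
    def apply_move(self, move: str) -> None:
        """Apply a move, raising ValueError if it is illegal."""

    @abstractmethod
    def serialize(self) -> str:
        """Render the board as text for the LLM prompt."""

    def is_legal(self, move: str) -> bool:
        # The engine, not the LLM, has the final say on legality.
        return move in self.legal_moves()


class TicTacToeAdapter(GameAdapter):
    """Toy engine used only to keep the example self-contained."""

    def __init__(self) -> None:
        self.cells = [" "] * 9
        self.player = "X"

    def legal_moves(self) -> list[str]:
        return [str(i) for i, cell in enumerate(self.cells) if cell == " "]

    def apply_move(self, move: str) -> None:
        if not self.is_legal(move):
            raise ValueError(f"illegal move: {move!r}")
        self.cells[int(move)] = self.player
        self.player = "O" if self.player == "X" else "X"

    def serialize(self) -> str:
        rows = [
            "".join(self.cells[r * 3:r * 3 + 3]).replace(" ", ".")
            for r in range(3)
        ]
        return "\n".join(rows) + f"\nto move: {self.player}"
```

With an interface along these lines, the reasoning layer only ever sees the output of serialize() and legal_moves(), and any move it proposes passes through is_legal() before it reaches the board.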
GameSage supports multiple LLM backends (Ollama, OpenAI, Anthropic, Google Gemini) and includes a full training pipeline that optimizes prompts using curated game positions. Every session is logged to a research database, capturing board states, LLM reasoning, and whether players followed AI advice — enabling systematic study of how humans interact with explainable AI coaching systems.
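As a rough illustration of the research logging described above, the sketch below records one row per move and derives whether the player followed the advice. The table name, column names, and helper functions are all hypothetical; the source states only that sessions, board states, LLM reasoning, and advice-following are logged.

```python
import sqlite3


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a log database. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS moves (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            session_id TEXT NOT NULL,
            game TEXT NOT NULL,
            board_state TEXT NOT NULL,
            suggested_move TEXT NOT NULL,
            explanation TEXT NOT NULL,
            played_move TEXT NOT NULL,
            followed_advice INTEGER NOT NULL
        )
        """
    )
    return conn


def log_move(conn, session_id, game, board_state, suggested, explanation, played):
    """Store one move and return whether the player followed the advice."""
    followed = suggested == played
    conn.execute(
        "INSERT INTO moves (session_id, game, board_state, suggested_move,"
        " explanation, played_move, followed_advice)"
        " VALUES (?, ?, ?, ?, ?, ?, ?)",
        (session_id, game, board_state, suggested, explanation, played, int(followed)),
    )
    conn.commit()
    return followed


def follow_rate(conn, session_id):
    """Fraction of logged moves that matched the advice, or None if no moves."""
    row = conn.execute(
        "SELECT AVG(followed_advice) FROM moves WHERE session_id = ?",
        (session_id,),
    ).fetchone()
    return row[0]
```

Storing followed_advice as its own column, rather than recomputing it at query time, keeps later analysis queries simple at the cost of a little redundancy.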
Chess, Checkers, Go, Othello, and Sudoku, each with a dedicated engine and tailored AI explanations.
Play against the AI, get coaching commentary, analyze positions, or solve puzzles with guided hints.
Every move recommendation comes with plain-English reasoning tailored to your skill level.
Prompt optimization with DSPy's BootstrapFewShot and MIPROv2 optimizers uses curated training data to improve advice quality.
Each game has a dedicated engine that manages board state, validates moves, and enforces rules. The LLM never touches game logic directly.
The MoveAdvisor signature feeds serialized board state, legal moves, and player skill level into a ChainOfThought module for step-by-step strategic reasoning.
If the LLM suggests an illegal move, the system retries with negative feedback up to 3 times, then falls back to a random legal move.
Every session, move, board state, and LLM explanation is recorded to SQLite for analysis of human-AI coaching interactions.
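The retry-and-fallback step described above can be sketched in plain Python. Here advise is a hypothetical stand-in for the LLM-backed advisor: any callable that takes the legal moves and optional feedback text and returns a move string. The sketch reads "up to 3 times" as one initial attempt plus three retries, which is an assumption about the intended behavior.

```python
import random


def choose_move(advise, legal_moves, max_retries=3, rng=None):
    """Ask the advisor for a move, retrying with feedback on illegal replies.

    Returns (move, source), where source is "llm" or "fallback".
    `advise` is a hypothetical callable: advise(legal_moves, feedback) -> str.
    """
    rng = rng or random.Random()
    feedback = None
    # One initial attempt plus up to `max_retries` retries (assumed reading).
    for _ in range(1 + max_retries):
        move = advise(legal_moves, feedback)
        if move in legal_moves:
            return move, "llm"
        # Negative feedback passed to the next attempt.
        feedback = (
            f"'{move}' is not a legal move. "
            f"Choose one of: {', '.join(legal_moves)}"
        )
    # Every attempt was illegal: fall back to a random legal move.
    return rng.choice(list(legal_moves)), "fallback"
```

Returning the source alongside the move lets a caller record how often the fallback fires, which may be worth tracking when judging advice quality.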
Senior, Computer Science. Co-developer of the GameSage system, focusing on game engine implementations, the DSPy optimization pipeline, and explainability research.
Senior, Computer Science. Co-developer of the GameSage system, working on the multi-backend LLM architecture, training data curation, and the interactive coaching interface.
GameSage is open source and available on GitHub. The codebase is approximately 2,800 lines of Python, organized into modular packages.
dspy-game-box/
├── gamesage/
│   ├── main.py                # Entry point and CLI
│   ├── config.py              # LLM backend configuration
│   ├── core/
│   │   ├── adapter.py         # GameAdapter abstract base class
│   │   ├── explainer.py       # DSPy signatures and modules
│   │   └── serializer.py      # Board-to-text utilities
│   ├── games/
│   │   ├── chess/             # python-chess wrapper
│   │   ├── checkers/          # Custom 8x8 engine
│   │   ├── go/                # Custom numpy-based engine
│   │   ├── othello/           # Custom implementation
│   │   └── sudoku/            # Generator + solver
│   ├── data/                  # Training pipeline
│   ├── ui/cli.py              # Rich terminal interface
│   └── research/logger.py     # SQLite research logger
└── gamesage_data/             # Training data and compiled models
Current version of the CS 396 project proposal, covering motivation, related work, system design, and research methodology.
Download Report (PDF)
Slides covering project vision, last semester's Othello prototype, the GameSage architecture, DSPy pipeline, game modes, and related work.
View Slides (PDF)