Reinforcement Learning Algorithms on Tic-Tac-Toe

As part of a research project for CS 6314: Dynamic Programming and Reinforcement Learning, we aimed to develop an agent capable of playing different versions of Tic-Tac-Toe.

We addressed this challenge with the following goals:

  1. Train an agent to play 2D Tic-Tac-Toe (3x3).
  2. Extend the agent’s capabilities to 2D Tic-Tac-Toe (4x4).
  3. Train the agent further to play 3D Tic-Tac-Toe (4x4x4).
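
To give a rough sense of how these three boards differ in scale, the sketch below computes a simple upper bound on the number of board configurations, assuming each cell is empty, X, or O. The bound overcounts, since it includes positions that can never arise in legal play, and the helper name `state_upper_bound` is illustrative rather than taken from the project code.

```python
# Upper bound on board configurations: each cell is empty, X, or O.
# This is a ceiling, not an exact count: it includes unreachable positions.
BOARDS = {"3x3": 3 * 3, "4x4": 4 * 4, "4x4x4": 4 * 4 * 4}


def state_upper_bound(num_cells: int) -> int:
    """Return 3 ** num_cells, the number of ways to fill num_cells cells."""
    return 3 ** num_cells


if __name__ == "__main__":
    for name, cells in BOARDS.items():
        bound = state_upper_bound(cells)
        print(f"{name}: {cells} cells, at most {bound:.3e} configurations")
```

The 3x3 bound is small enough to enumerate in a table, while the 4x4x4 bound is far beyond what any table can hold.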

To this end, we implemented several algorithms, including Value Iteration, Temporal Difference Learning, and Deep Q-Networks, to tackle the problems that arise from the vast state spaces of the larger boards. (Project Report), (GitHub Source Code).
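
As an illustration of the temporal-difference family on the smallest board, here is a minimal sketch of tabular Q-learning for 3x3 Tic-Tac-Toe. It is not the project's implementation: the random opponent, the reward scheme (+1 win, -1 loss, 0 draw), the hyperparameters, and all function names are assumptions chosen for brevity.

```python
import random
from collections import defaultdict

# All eight winning lines on a 3x3 board, indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' for a completed line, 'draw' if full, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return "draw" if " " not in board else None


def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == " "]


def place(board, move, mark):
    return board[:move] + mark + board[move + 1:]


def train(episodes=20000, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    """Learn Q(board, move) for X against a uniformly random O (assumed)."""
    rng = random.Random(seed)
    q = defaultdict(float)
    rewards = {"X": 1.0, "O": -1.0, "draw": 0.0}
    for _ in range(episodes):
        board = " " * 9
        while True:
            moves = legal_moves(board)
            if rng.random() < epsilon:  # explore
                move = rng.choice(moves)
            else:                       # exploit current estimates
                move = max(moves, key=lambda m: q[(board, m)])
            nxt = place(board, move, "X")
            result = winner(nxt)
            if result is None:          # opponent replies at random
                nxt = place(nxt, rng.choice(legal_moves(nxt)), "O")
                result = winner(nxt)
            if result is None:
                reward = 0.0
                future = gamma * max(q[(nxt, m)] for m in legal_moves(nxt))
            else:
                reward = rewards[result]
                future = 0.0
            # One-step TD update toward reward plus best next-state value.
            q[(board, move)] += alpha * (reward + future - q[(board, move)])
            if result is not None:
                break
            board = nxt
    return q
```

Calling `train()` returns a table keyed by `(board, move)` pairs. A table like this is workable for 3x3, but it does not scale to 4x4x4, which is where function approximation such as a Deep Q-Network becomes relevant.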