Tic-Tac-Toe with Javascript ES2015

Part 3: AI Player with Minimax Algorithm

This is part of the series: Tic-Tac-Toe with Javascript ES2015

In this part, you will learn how to implement the minimax algorithm in JavaScript to get the best move for the computer player. We will calculate each move’s heuristic value given some assumptions and a depth (the number of turns to calculate).

View a demo or visit the project’s GitHub page.

To create an AI player, we need to mimic how a human would think when playing a tic-tac-toe game. In real life, a human would think of all the possible consequences of each move. This is where the minimax algorithm comes in handy. The minimax algorithm is a decision rule used to determine the best possible move in games where all possible moves can be foreseen, like tic-tac-toe or chess.


Minimax Algorithm in Tic-Tac-Toe

To apply the minimax algorithm to two-player games, we are going to assume that X is a maximizing player and O is a minimizing player. The maximizing player tries to maximize its score, in other words, to choose the move with the highest value. The minimizing player tries to minimize the value for the maximizing player, thus choosing the move with the minimum value.

In order to calculate the values mentioned above, we need to decide on some assumptions. We call these values heuristic values. In tic-tac-toe, there are three possibilities (a small code sketch of this scoring follows the list):

  • The state of the board is a draw: we will give this board a value of 0.
  • X wins in a board state: we will give this board a value of 100.
  • O wins in a board state: we will give this board a value of -100.
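
To make this concrete, here is a minimal sketch of such a scoring function. It assumes the Board class from the previous part, whose isTerminal() method returns an object with a winner property when the game has finished; the helper name evaluateTerminal is just illustrative and is not part of the final class.

function evaluateTerminal(board) {
    const result = board.isTerminal(); // an object describing the finished game
    if (result.winner === 'x') return 100;  // X wins
    if (result.winner === 'o') return -100; // O wins
    return 0;                               // draw
}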


Minimax Example with Game Tree

To illustrate the minimax algorithm further, let’s take a look at a visual example. In the diagram below, consider a situation where it’s X’s turn given the current state. X has three possible moves:


tic tac toe game tree

  • Level 1: X has three possible moves and tries to find the maximum node.
  • Level 2: The first move leads to a direct win for X, so it is given a value of 100.
  • Level 2: The second and third moves each lead to two more possible moves where it’s O’s turn.
  • Level 3: O is trying to minimize the score, so it chooses the nodes with the minimum value.
  • Level 3: The first move for O leads to a win for O and the second to a draw, so we assume O will choose the first move and the parent node gets a value of -100. O applies the same logic to the third and fourth moves.
  • Back at Level 1, X now has to choose between 100, -100 and 0. Since X is the maximizer, it will definitely choose 100, which leads to a win.

As you can see, we recursively traverse the tree of possibilities, calculate the score for each terminal state, and then work our way back up to decide which move to take.


Adding Depth to the Calculations

Now imagine a situation where X can win in two possible ways, one of which takes more moves than the other. With our current implementation, both moves would return a score of 100, and we would then pick one of them at random. However, it would be better to directly choose the shortest way to win.

To tackle that, we will subtract the depth (the current level) from the score of the board for a maximizing player, or add it for a minimizing player. That way, for the maximizing player, the shorter path gets a higher score because a smaller depth is subtracted from it, and vice versa for the minimizing player.
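
To make the arithmetic concrete, here is a tiny sketch using a hypothetical score() helper (not part of the final class):

function score(winner, depth) {
    if (winner === 'x') return 100 - depth;  // maximizer: quicker wins score higher
    if (winner === 'o') return -100 + depth; // minimizer: quicker wins score lower
    return 0;                                // draw
}

console.log(score('x', 1)); // 99  - a win for X one move away
console.log(score('x', 3)); // 97  - a win for X three moves away
console.log(score('o', 1)); // -99 - a win for O one move away, preferred by the minimizer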

This also means that if a loss is unavoidable, the player will choose the longest path to it rather than losing quickly. Here is a visual example after adding depth:

tic tac toe game tree depth example

In the case above, X will obviously choose 99 over 97 since it represents a quicker win.


JavaScript Implementation

It’s time now to translate this theory into code. In the src/classes directory, create a new file called Player.js. The class will be constructed with a maximum depth argument. This argument will be used to limit how deep the computer searches the tree; the lower the maximum depth we choose, the easier the game will be.

In addition to that, we will define a new map. Each key in this map will be a heuristic value, and the corresponding value will be a comma-separated string of all the moves that resulted in that heuristic value. That way, for a maximizing player, we can choose the highest key; the move will be the stored value, or a random pick if multiple moves share that value.

import Board from './Board';

class Player {
    constructor(max_depth = -1) {
        this.max_depth = max_depth;
        this.nodes_map = new Map();
    }
}

export default Player;
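
As a quick usage note, the max_depth argument is what will later control the difficulty. For example, from index.js (the variable names here are just illustrative):

import Player from './classes/Player';

const unbeatable = new Player();   // default -1: search the whole game tree
const easier = new Player(2);      // only look two moves ahead, which should be easier to beat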

Now let’s attach a function to this class called getBestMove(). As mentioned, this is going to be a recursive function. This function will receive a board, a boolean to decide if the player is maximizing or minimizing, a callback function to run after calculation (this will be used when we build our UI) and the depth of the current node.

Inside our function, each call will have a different depth depending on the level we are currently at. In order to do some operations only in the main function call, i.e. at the very top level and not in a recursive call, we will check whether the depth equals zero.

The first thing we will do inside the function is clear the nodes_map map of previous calculations if we are calculating the value of the very top node.

Then we will add the base case of our recursion. Every recursive function must have a base case, a point where the recursion stops, otherwise we might end up with infinite recursion. In our case, the recursion stops when a terminal state is reached or when the depth reaches the maximum depth. In either case, we return the heuristic value of the state:

getBestMove(board, maximizing = true, callback = () => {}, depth = 0) {
    
     //clear nodes_map if the function is called for a new move
    if(depth == 0) this.nodes_map.clear();
    
     //If the board state is a terminal one, return the heuristic value
    if(board.isTerminal() || depth == this.max_depth ) {
        if(board.isTerminal().winner == 'x') {
            return 100 - depth;
        } else if (board.isTerminal().winner == 'o') {
            return -100 + depth;
        } 
        return 0;
    }
}

Now we will check if it is the maximizing player’s turn and then loop over all the empty cells using the getAvailableMoves() method we created in the previous tutorial. Inside the loop, we create a new board, insert the symbol into the current empty cell, and then call getBestMove() recursively, this time with the new board, the minimizing player’s turn and the depth incremented by one. Afterwards, we compare the output of the call with the current best value and update it if needed. Still inside the loop, we check if we are at the top level and, if so, store our values in nodes_map.

Outside the loop, we check if we are at the top level and return the index of the cell that corresponds to the best value or a random index if multiple indices have the best value.

if(maximizing) {
	//Initialize best to the lowest possible value
	let best = -100;
	//Loop through all empty cells
	board.getAvailableMoves().forEach(index => {
		//Initialize a new board with the current state (slice() is used to create a new array and not modify the original)
		let child = new Board(board.state.slice());
		//Create a child node by inserting the maximizing symbol x into the current empty cell
		child.insert('x', index);

		//Recursively calling getBestMove this time with the new board and minimizing turn and incrementing the depth
		let node_value = this.getBestMove(child, false, callback, depth + 1);
		//Updating best value
		best = Math.max(best, node_value);

		
		//If it's the main function call, not a recursive one, map each heuristic value to its move indices
		if(depth == 0) {
			//Comma-separated indices if multiple moves have the same heuristic value
			var moves = this.nodes_map.has(node_value) ? `${this.nodes_map.get(node_value)},${index}` : index;
			this.nodes_map.set(node_value, moves);
		}
	});
	//If it's the main call, return the index of the best move or a random index if multiple indices have the same value
	if(depth == 0) {
		if(typeof this.nodes_map.get(best) == 'string') {
			var arr = this.nodes_map.get(best).split(',');
			var rand = Math.floor(Math.random() * arr.length);
			var ret = arr[rand];
		} else {
			ret = this.nodes_map.get(best);
		}
		//run a callback after calculation and return the index
		callback(ret);
		return ret;
	}
	//If not main call (recursive) return the heuristic value for next calculation
	return best;
}

Similarly, we will check if the player is minimizing. The code will be very similar, except for minor changes like inserting o instead of x and using Math.min instead of Math.max. Here is our final class:

import Board from './Board';

class Player {
	constructor(max_depth = -1) {
        this.max_depth = max_depth;
        this.nodes_map = new Map();
    }

	getBestMove(board, maximizing = true, callback = () => {}, depth = 0) {
		if(board.constructor.name !== "Board") throw('The first argument to the getBestMove method should be an instance of Board class.');

		if(depth == 0) this.nodes_map.clear();

		if(board.isTerminal() || depth == this.max_depth ) {
			if(board.isTerminal().winner == 'x') {
				return 100 - depth;
			} else if (board.isTerminal().winner == 'o') {
				return -100 + depth;
			} 
			return 0;
		}

		//Current player is maximizing
		if(maximizing) {
			//Initialize best to the lowest possible value
			let best = -100;
			//Loop through all empty cells
			board.getAvailableMoves().forEach(index => {
				//Initialize a new board with the current state (slice() is used to create a new array and not modify the original)
				let child = new Board(board.state.slice());
				//Create a child node by inserting the maximizing symbol x into the current empty cell
				child.insert('x', index);

				//Recursively calling getBestMove this time with the new board and minimizing turn and incrementing the depth
				let node_value = this.getBestMove(child, false, callback, depth + 1);
				//Updating best value
				best = Math.max(best, node_value);

				
				//If it's the main function call, not a recursive one, map each heuristic value to its move indices
				if(depth == 0) {
					//Comma-separated indices if multiple moves have the same heuristic value
					var moves = this.nodes_map.has(node_value) ? `${this.nodes_map.get(node_value)},${index}` : index;
					this.nodes_map.set(node_value, moves);
				}
			});
			//If it's the main call, return the index of the best move or a random index if multiple indices have the same value
			if(depth == 0) {
				if(typeof this.nodes_map.get(best) == 'string') {
					var arr = this.nodes_map.get(best).split(',');
					var rand = Math.floor(Math.random() * arr.length);
					var ret = arr[rand];
				} else {
					ret = this.nodes_map.get(best);
				}
				//run a callback after calculation and return the index
				callback(ret);
				return ret;
			}
			//If not main call (recursive) return the heuristic value for next calculation
			return best;
		}

		if(!maximizing) {
			//Initialize best to the highest possible value
			let best = 100;
			//Loop through all empty cells
			board.getAvailableMoves().forEach(index => {
				//Initialize a new board with the current state (slice() is used to create a new array and not modify the original)
				let child = new Board(board.state.slice());
				//Create a child node by inserting the minimizing symbol o into the current empty cell
				child.insert('o', index);

			
				//Recursively calling getBestMove this time with the new board and maximizing turn and incrementing the depth
				let node_value = this.getBestMove(child, true, callback, depth + 1);
				//Updating best value
				best = Math.min(best, node_value);
				
				//If it's the main function call, not a recursive one, map each heuristic value to its move indices
				if(depth == 0) {
					//Comma-separated indices if multiple moves have the same heuristic value
					var moves = this.nodes_map.has(node_value) ? this.nodes_map.get(node_value) + ',' + index : index;
					this.nodes_map.set(node_value, moves);
				}
			});
			//If it's the main call, return the index of the best move or a random index if multiple indices have the same value
			if(depth == 0) {
				if(typeof this.nodes_map.get(best) == 'string') {
					var arr = this.nodes_map.get(best).split(',');
					var rand = Math.floor(Math.random() * arr.length);
					var ret = arr[rand];
				} else {
					ret = this.nodes_map.get(best);
				}
				//run a callback after calculation and return the index
				callback(ret);
				return ret;
			}
			//If not main call (recursive) return the heuristic value for next calculation
			return best;
		}

	}
}

export default Player;

Let’s test this function now and also take a look at what the nodes_map map looks like. In index.js type:

import Board from './classes/Board';
import Player from './classes/Player';

let board = new Board(['x','o','','','','','o','','x']);
board.printFormattedBoard();

let p = new Player();
console.log(p.getBestMove(board));
console.log(p.nodes_map);

Player class console test

As you can see, cell 4 is chosen as the best move for X, since placing x there completes the 0-4-8 diagonal for an immediate win. Let’s take a look at the same board, but this time assuming it was O’s turn:

import Board from './classes/Board';
import Player from './classes/Player'

let board = new Board(['x','o','','','','','o','','x']);
board.printFormattedBoard()

let p = new Player();
console.log(p.getBestMove(board, false));
console.log(p.nodes_map);

Player class console test


Finally, it’s worth mentioning that this algorithm’s performance can be improved with alpha-beta pruning, but I will leave that to you.
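
If you want to experiment with that, here is a simplified, value-only sketch of minimax with alpha-beta pruning using the same Board API. It skips the nodes_map, callback and max_depth handling of the real getBestMove, and the minimaxValue name is just illustrative; wiring the pruning into the full class is the exercise.

import Board from './Board';

function minimaxValue(board, maximizing, depth = 0, alpha = -Infinity, beta = Infinity) {
    //Return the depth-adjusted heuristic value for terminal states
    const terminal = board.isTerminal();
    if (terminal) {
        if (terminal.winner === 'x') return 100 - depth;
        if (terminal.winner === 'o') return -100 + depth;
        return 0;
    }
    let best = maximizing ? -Infinity : Infinity;
    for (const index of board.getAvailableMoves()) {
        const child = new Board(board.state.slice());
        child.insert(maximizing ? 'x' : 'o', index);
        const value = minimaxValue(child, !maximizing, depth + 1, alpha, beta);
        if (maximizing) {
            best = Math.max(best, value);
            alpha = Math.max(alpha, best); //best score the maximizer can already guarantee
        } else {
            best = Math.min(best, value);
            beta = Math.min(beta, best);   //best score the minimizer can already guarantee
        }
        //Stop exploring siblings the other player would never allow us to reach
        if (beta <= alpha) break;
    }
    return best;
}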

In the next and final part, we will build the UI and interactions for our board.

Ali Alaa

Front-end developer from Egypt. Telecommunications Engineering graduate. Been working in web development since graduation.