Since graduating last year, my CS has got a bit rusty – working full-time, there's not much time for dense theory. But there's always time to play a quick board game. Mancala is an ancient game with really simple rules, and I've been playing it since I was a kid.

A Mancala board before the first move

The board looks like this: twelve small pots, plus one big pot – a store – on each side. To start with, each small pot is filled with three* marbles.

*Although the rules are simple, there are hundreds of variations – from the number of marbles each player starts with to the rules for capturing – and it seems like nobody can agree on the “real” rules. Anyway, I've always found this variant to be fun and challenging!

Each player controls the six small pots closest to them. On their turn, a player can choose one of their (non-empty) pots, and distribute the marbles in the pot anti-clockwise, one at a time.

For example, moving pot #5:

If your last marble falls into an empty pot on your side, and the opposite pot has marbles in it, you capture those marbles – clear out both pots and put them in your store to the right.

And one last rule – if your last marble falls into your store, you get another turn!
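To make those rules concrete, here's a rough sketch of what a single move might look like in code. The 14-slot array layout and the makeMove name are just one way to set things up, not necessarily how the real implementation (linked further down) does it:

```javascript
// One possible board layout: indices 0-5 are player 0's pots, 6 is player 0's
// store, indices 7-12 are player 1's pots, and 13 is player 1's store.
const STORE = { 0: 6, 1: 13 };

function makeMove(board, player, pot) {
  const next = [...board]; // work on a copy rather than mutating the board
  let marbles = next[pot];
  next[pot] = 0;

  // Distribute the marbles anti-clockwise, one per pot,
  // skipping the opponent's store.
  let index = pot;
  while (marbles > 0) {
    index = (index + 1) % 14;
    if (index === STORE[1 - player]) continue;
    next[index] += 1;
    marbles -= 1;
  }

  // Last marble in your own store: you get another turn.
  if (index === STORE[player]) {
    return { board: next, nextPlayer: player };
  }

  // Last marble in an empty pot on your own side: capture it,
  // along with everything in the opposite pot.
  const ownSide = player === 0 ? index <= 5 : index >= 7 && index <= 12;
  const opposite = 12 - index;
  if (ownSide && next[index] === 1 && next[opposite] > 0) {
    next[STORE[player]] += next[index] + next[opposite];
    next[index] = 0;
    next[opposite] = 0;
  }

  return { board: next, nextPlayer: 1 - player };
}
```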

A Quick Primer on Game AI

While playing Mancala with myself in lockdown is fun for a little while, it's much more fun to have an opponent.

In university we learned about a game AI technique called MiniMax (and its less fashionable younger sibling, MaxiMin). Today, I'll be using MaxiMin to pick the move that maximises the minimum score I can be sure of ending up with.

For example, let's imagine a simple two-player game where we both have two choices – A or B – and the game has a running score every turn.

Let's say that we're playing this game, and the current score is zero. By choosing A, I know the score will become +1 in my favour; by choosing B I know it'll be +4 in my favour. Both options are great, but option B seems like the smartest one!

Greedy game-playing

That's looking only one move ahead – it's what we might call a "greedy" strategy: taking the highest total score we possibly can each turn.

This is a half-decent strategy, and it's how most people start learning to play games like Mancala! In a lot of situations, though, humans can beat it by thinking a couple of moves ahead.

For example, let's look a couple of moves into the future.

Game tree showing the four possible outcomes of two opposing moves

If we're playing Green, it's the "greedy" choice to pick option B straight away. But then, regrettably... it's Blue's turn. Blue is trying to minimise our score. So Blue could choose B, but why would they do us any favours? If they're playing well, they'll choose A, and we'll end up on a score of -3 after two moves.

This is the core idea of MaxiMin – assume your opponent is playing perfectly, and maximise the score you're guaranteed to get. In this case, you can pick option A and guarantee that you'll have a score of at least -1. Not great, but not as bad as if you'd picked B!
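To make that concrete, here's a tiny version of the same idea in code. The -1 and -3 outcomes are taken from the game tree above; the other two leaf scores are made up just to fill out the example:

```javascript
// Scores from Green's point of view after Green moves, then Blue moves.
const gameTree = {
  A: { A: -1, B: +3 }, // -1 comes from the diagram; +3 is made up
  B: { A: -3, B: +6 }, // -3 comes from the diagram; +6 is made up
};

// Blue minimises, so each of Green's options is only worth its worst case.
const guaranteedScore = (option) =>
  Math.min(...Object.values(gameTree[option]));

// Green maximises over those guaranteed scores.
const bestMove = ['A', 'B'].reduce((a, b) =>
  guaranteedScore(a) >= guaranteedScore(b) ? a : b
);
// bestMove === 'A': a guaranteed -1 beats a possible -3.
```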

Applying AI to Mancala

The great thing about Mancala is that you have at most six choices at any time. So you can relatively easily look 7 or 8 moves into the future without much computation time at all – eight moves ahead is at most 6⁸ = 1,679,616 configurations, which V8 crunches through in no time!

The only minor complication to just plugging a Mancala game into MaxiMin is when you can repeat a turn – but that's not too hard to account for in code.

The algorithm evaluates each move recursively. On the first player's turns, it picks the best moves for the first player; on the second player's turns, it picks the best moves for the second player. By doing this, it works out the optimal strategy to play against a completely rational opponent.

The base case of this recursive evaluation is to calculate the "final score" of the board – counting the marbles in both players' pots, combined with their stores, and taking the difference between the two. This base case is reached either when the depth limit is exceeded, or when no more moves are possible.
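As a rough sketch, using the same 14-slot layout as the move function above (again, just one way of laying things out), that scoring might look something like this:

```javascript
// The "final score" from one player's point of view:
// (their pots + their store) minus (the opponent's pots + the opponent's store).
function finalScore(board, player) {
  const sideTotal = (firstPot) =>
    board.slice(firstPot, firstPot + 6).reduce((a, b) => a + b, 0) +
    board[firstPot + 6]; // the store sits just after that side's six pots
  const mine = sideTotal(player === 0 ? 0 : 7);
  const theirs = sideTotal(player === 0 ? 7 : 0);
  return mine - theirs;
}
```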

I've included this code below. The updateBoard function, which immutably creates a new board from the current board and a player/move, also returns a nextPlayer value. By passing this down into the recursive call, along with the player whose score we're maximising, we don't need to worry if the moves don't strictly alternate. At every level, we know whether to pick the move with the highest or lowest score!
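Putting those pieces together, here's roughly the shape of the recursion – legalMoves is a hypothetical helper, and this is a sketch rather than a copy of the real code in the gist:

```javascript
// player is whoever moves next (which, thanks to the extra-turn rule, might be
// the same player twice in a row); maximisingPlayer is whose score we want high.
function evaluate(board, player, maximisingPlayer, depth) {
  const moves = legalMoves(board, player);
  if (depth === 0 || moves.length === 0) {
    return finalScore(board, maximisingPlayer);
  }

  const scores = moves.map((pot) => {
    const { board: nextBoard, nextPlayer } = makeMove(board, player, pot);
    return evaluate(nextBoard, nextPlayer, maximisingPlayer, depth - 1);
  });

  // At every level we know whose turn it is, so we know whether to take the
  // best or the worst of the child scores – even if turns don't alternate.
  return player === maximisingPlayer ? Math.max(...scores) : Math.min(...scores);
}

// Hypothetical helper: the non-empty pots the current player can choose from.
function legalMoves(board, player) {
  const firstPot = player === 0 ? 0 : 7;
  return [0, 1, 2, 3, 4, 5]
    .map((i) => firstPot + i)
    .filter((pot) => board[pot] > 0);
}
```

The top-level move picker then just runs something like this once per legal move and takes whichever scores highest.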

Here's the full code for the game, including logic for calculating the next board states, captures and so on – it's a bit rough, but it gets the job done. I even wrote a few unit tests – which is honestly more than I expected of myself for a fun little side-project...

https://gist.github.com/bedekelly/5f1d566df8368a8f08fec4acaec13aab

What we've all been waiting for...

So, we've built this AI – how does it do in real games? Well, anecdotally, it's scary good: I'm not too bad at the game, but it kicks my butt almost every time. Playing the best game I can, and using the opening I'm most familiar with, it beat me 23-13!

If you're interested to see how my Mancala-bot plays when it can see 8 moves in the future, try dragging the slider below to see me get owned in real-time:

Thanks for getting this far – you're a trooper! If you've enjoyed this, or you think you can beat my Mancala AI, come follow me on Twitter (@bedekelly) – I generally post small coding projects I've been working on like this or this, and always include a link to the source code. Hopefully you'll see something you like!