In 2016, researchers at DeepMind announced a breakthrough: AlphaGo, a new engine for the game of Go. The AI was able to defeat the professional player Lee Sedol. The breakthrough was significant because Go is far more complex than chess. The number of possible games is so high that a professional-level Go engine was believed to be out of reach at that point, and human intuition was believed to be a key component of professional play. Still, AlphaGo's performance depended on expert input during the training step, so the algorithm could not be easily transferred to other domains.

This changed in 2017, when the team at DeepMind updated their algorithm and developed a new engine called AlphaGo Zero. This time, instead of depending on expert gameplay for training, AlphaGo Zero learned by playing against itself, knowing only the rules of the game. More impressively, the algorithm was generic enough to be adapted to chess and shogi, also known as Japanese chess. This led to an entirely new way of developing AI engines, and the researchers called their algorithm AlphaZero.

The best part of the AlphaZero algorithm is its simplicity. It consists of a Monte Carlo tree search guided by a deep neural network. This is analogous to the way humans think about board games, where professional players employ hard calculations guided by intuition. In the following lesson, we will cover how the AlphaZero algorithm works and implement it to play an advanced version of tic-tac-toe. So let’s get started.