5 – Forget Gate

Now we go to the Forget Gate, which works as follows: it takes the long-term memory and decides which parts to keep and which to forget. In this case, the long-term memory says the show is about nature and science, and the Forget Gate decides to forget that the show is about science and keep the fact that it’s about nature. How does the Forget Gate work mathematically? Very simple. The long-term memory (LTM) from time t-1 comes in and gets multiplied by a forget factor f_t. And how does the forget factor f_t get calculated? Well, simple. We use the short-term memory (STM) and the event information to calculate f_t. So, just as before, we run a small one-layer neural network, a linear function followed by the sigmoid function, to calculate this forget factor, and that’s how the Forget Gate works.
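To make the arithmetic concrete, here is a minimal sketch of the Forget Gate in plain Python. It is an illustration under assumptions rather than the exact network from the lesson: the names `forget_gate`, `W_f` and `b_f` are hypothetical, the STM and the event are assumed to be joined into one input vector, and the weights are hand-picked so that the first memory slot ("nature") is kept and the second ("science") is forgotten.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forget_gate(ltm_prev, stm, event, W_f, b_f):
    """Return (kept_memory, f_t) for one time step.

    ltm_prev : long-term memory from time t-1, one number per slot
    stm      : short-term memory
    event    : current event information
    W_f, b_f : hypothetical weights and biases of the one-layer network
    """
    # Assumption: the STM and the event are joined into a single input vector.
    joined = stm + event

    # Linear function followed by a sigmoid, one output per memory slot.
    f_t = [
        sigmoid(sum(w * x for w, x in zip(row, joined)) + b)
        for row, b in zip(W_f, b_f)
    ]

    # Multiply each slot of the long-term memory by its forget factor.
    kept = [m * f for m, f in zip(ltm_prev, f_t)]
    return kept, f_t


# Two memory slots: index 0 stands for "nature", index 1 for "science".
ltm_prev = [1.0, 1.0]
stm = [0.5]
event = [1.0]

# Hand-picked weights for illustration only (not learned).
W_f = [[2.0, 4.0], [-2.0, -4.0]]
b_f = [1.0, -1.0]

kept, f_t = forget_gate(ltm_prev, stm, event, W_f, b_f)
print("forget factors:", f_t)  # first is close to 1, second is close to 0
print("kept memory:", kept)    # "nature" survives, "science" is mostly forgotten
```

Because the sigmoid always returns a value between 0 and 1, each forget factor acts like a dial: values near 1 keep that part of the long-term memory, and values near 0 erase it.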

