9 – Character-Wise RNN

Coming up in this lesson, you’ll implement a character-wise RNN. That is, the network will learn about some text one character at a time, and then generate new text one character at a time. Let’s say we want to generate new Shakespeare plays. As an example, “To be or not to be.” We’d pass … Read more
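
To make "one character at a time" concrete, here is a minimal sketch of how the example sentence could be turned into character-level training data. The integer encoding and the shift-by-one input/target split are assumptions for illustration (a common setup for character-wise models), not necessarily the exact steps the lesson uses; the names `char_to_int`, `inputs`, and `targets` are hypothetical.

```python
text = "To be or not to be."

# Build a vocabulary of the distinct characters and map each one to an integer.
chars = sorted(set(text))
char_to_int = {ch: i for i, ch in enumerate(chars)}
int_to_char = {i: ch for ch, i in char_to_int.items()}

# Encode the whole text as a sequence of integers, one per character.
encoded = [char_to_int[ch] for ch in text]

# Assumed training setup: at each position the network sees one character
# and is asked to predict the character that follows it.
inputs = encoded[:-1]
targets = encoded[1:]

# Decoding the integers recovers the original text.
decoded = "".join(int_to_char[i] for i in encoded)
```

With this split, the first input is "T" and its target is "o", the second input is "o" and its target is the space, and so on through the sentence.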

8 – Putting It All Together

So here we go. As we’ve seen before, here is the architecture for an LSTM with the four gates. There is the forget gate, which takes the long-term memory and forgets part of it. The learn gate puts the short-term memory together with the event as the information we’ve recently learned. The remember gate joins … Read more

7 – Use Gate

And finally, we come to the use gate, or output gate. This is the one that uses the long-term memory that just came out of the forget gate and the short-term memory that just came out of the learn gate to come up with a new short-term memory and an output. These … Read more

6 – Remember Gate

And now we’re going to learn the Remember Gate. This one is the simplest. It takes the long-term memory coming out of the Forget Gate and the short-term memory coming out of the Learn Gate and simply combines them. And how does this work mathematically? Again, very simple. We just take the outputs coming … Read more

5 – Forget Gate

Now we go to the Forget Gate. This one works as follows: it takes the long-term memory and decides which parts to keep and which to forget. In this case, the show is about nature and science, and the forget gate decides to forget that the show is about science and keep the fact … Read more

4 – Learn Gate

So, let’s keep this as our base case. We have a long-term memory, which is that the show we’re watching is about nature and science. We also have a short-term memory, which is what we’ve recently seen: a squirrel and a tree. And finally, we have our current event, which is a picture we … Read more

3 – LSTM Architecture

So in order to study the architecture of an LSTM, let’s quickly recall the architecture of an RNN. Basically what we do is we take our event E_t and our memory M_t-1, coming from the previous point in time, and we apply a simple tanh or sigmoid activation function to obtain the output and then … Read more
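
The RNN step recalled above can be sketched in a few lines of plain Python. This is a rough illustration under stated assumptions: the memory and the event are joined into one vector, multiplied by a weight matrix, shifted by a bias, and passed through tanh. The function name `rnn_step` and all sizes and weight values are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
import math

def rnn_step(event, memory, weights, bias):
    """One step of a simple RNN: new_memory = tanh(W . [memory, event] + b)."""
    joined = memory + event  # concatenate M_t-1 and E_t into a single vector
    new_memory = []
    for row, b in zip(weights, bias):
        # Weighted sum of the joined vector for this output unit, plus bias.
        total = sum(w * x for w, x in zip(row, joined)) + b
        new_memory.append(math.tanh(total))
    return new_memory

# Hypothetical sizes and values: a 2-dimensional memory and a 1-dimensional event.
memory_prev = [0.0, 0.0]
event = [1.0]
weights = [[0.5, -0.5, 1.0],
           [0.1, 0.2, 0.0]]
bias = [0.0, 0.0]

# The result serves both as the output and as the memory for the next time step.
memory_next = rnn_step(event, memory_prev, weights, bias)
```

Swapping `math.tanh` for a sigmoid gives the other activation mentioned in the excerpt; the structure of the step stays the same.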

2 – LSTM Basics

So let’s recap. We have the following problem: we are watching a TV show and we have a long-term memory, which is that the show is about nature and science and lots of forest animals have appeared. We also have a short-term memory, which is what we have recently seen, which is squirrels … Read more

11 – Other Architectures

In this video, I will show you a pair of similar architectures that also work well, but there are many variations to LSTMs and we encourage you to study them further. Here’s a simple architecture which also works well. It’s called the gated recurrent unit, or GRU for short. It combines the forget and the … Read more

10 – Sequence Batching

One of the most difficult parts of building networks for me is getting the batches right. It’s more of a programming challenge than anything deep learning specific. So here, I’m going to walk you through how batching works for RNNs. With RNNs, we’re training on sequences of data like text, stock values, audio, etc. By … Read more


Okay, so let’s say we have a regular neural network which recognizes images and we feed it this image. And the neural network guesses that the image is most likely a dog, with a small chance of being a wolf and an even smaller chance of being a goldfish. But what if this image is … Read more