In this lesson, we will focus on recurrent neural networks, or RNNs for short. Many applications involve temporal dependencies, or dependencies over time. What does that mean? It means our current output depends not only on the current input, but also on past inputs. So, if I want to make dinner tonight, let's see: I had pizza yesterday, so maybe I should consider a salad.

Essentially, what we will have is a network that is similar to the feedforward networks we've seen before, but with the addition of memory. You may have noticed that in the applications you've seen so far, only the current input mattered. For example, classifying a picture: is this a cat? But perhaps this is not a static cat. Maybe when the picture was shot, the cat was moving. From a single image, it may be difficult to determine whether it is walking or running. Maybe it's just a very talented cat standing in a funny pose with two feet up in the air. When we observe the cat frame by frame, we remember what we saw before, so we know whether the cat is still or walking. We can also distinguish walking from running. But can the machine do the same? This is where recurrent neural networks, or RNNs, come in. RNNs are artificial neural networks that can capture temporal dependencies, which are dependencies over time.

If you look up the definition of the word "recurrent," you will find that it simply means occurring often or repeatedly. So why are these networks called recurrent neural networks? It's simply because with RNNs we perform the same task for each element in the input sequence, while carrying information forward from one element to the next.

In this lesson, we will start with a bit of history. It's really interesting to see how this field has evolved. We will then remind ourselves what feedforward networks are. You will see that RNNs are based on very similar ideas to those behind feedforward networks. Once we have a clear understanding of the fundamentals, we can easily understand the next steps.
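The idea of "performing the same task for each element" while keeping memory can be sketched as a loop that applies one step function repeatedly, threading a state from step to step. This is only a toy illustration of the recurrence pattern (the step function here is a made-up running sum, not a trained network):

```python
# A recurrent computation: the SAME step function is applied to every
# element of the sequence, and a "state" carries memory of the past.
def step(state, x):
    # Toy step for illustration: keep a running total of what was seen.
    return state + x

def run_sequence(xs, initial_state=0):
    state = initial_state
    outputs = []
    for x in xs:                 # same task for each element...
        state = step(state, x)   # ...but the result depends on the past
        outputs.append(state)
    return outputs

print(run_sequence([1, 2, 3, 4]))  # [1, 3, 6, 10]
```

Notice that the output at each position depends on everything that came before it, which is exactly the kind of temporal dependency a feedforward network cannot capture on its own.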
While focusing on feedforward networks, you will learn about non-linear function approximation, training using backpropagation with stochastic gradient descent, and evaluation. All of these should be familiar to you. Our main focus, of course, will be RNNs. I have given you a good example of why we need RNNs (remember the cat), but there are many other applications, and we will look at those as well. We will look at the simple RNN, also known as the Elman network, and learn how to train it. We will also understand the limitations of simple RNNs and how they can be overcome by using what we call LSTMs, or long short-term memory networks. Don't be alarmed; you don't need to remember all these names right now. We will progress slowly into each concept and talk about all of them in much more detail later.
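As a small preview of the simple (Elman) RNN mentioned above: its forward step combines the current input with the previous hidden state through a single nonlinearity, h_t = tanh(W_xh x_t + W_hh h_{t-1} + b). A minimal NumPy sketch, with made-up sizes and random weights just to show the shape of the computation (not a trained model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy sizes, chosen only for illustration.
input_size, hidden_size = 3, 4

# Elman RNN parameters: input-to-hidden, hidden-to-hidden, and bias.
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

def elman_step(h_prev, x):
    # h_t = tanh(W_xh @ x_t + W_hh @ h_{t-1} + b)
    return np.tanh(W_xh @ x + W_hh @ h_prev + b_h)

# Apply the same step to every element, threading the hidden state.
h = np.zeros(hidden_size)
sequence = [rng.normal(size=input_size) for _ in range(5)]
for x in sequence:
    h = elman_step(h, x)

print(h.shape)  # the hidden state keeps its size: (4,)
```

The hidden state h is the network's memory: the same weights are reused at every time step, and training (which we cover later) adjusts W_xh, W_hh, and b_h with backpropagation through time.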