9 – 13 Overfitting Intro V4 Final

When we minimize the network error using backpropagation, we may either properly fit the model to the data or overfit. Generally speaking, when we have a finite training set, there’s a risk of overfitting. Overfitting means that our model will fit the training data too closely. In other words, we overtrained the model or …
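
To make the idea concrete, here is a small sketch (my own illustration, not from the lesson) that fits polynomials of increasing degree to a handful of noisy points. The flexible model drives the training error toward zero but does much worse on held-out points, which is the signature of overfitting.

```python
# Hypothetical illustration of overfitting, not from the lesson.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(-1, 1, 12)
y_train = np.sin(np.pi * x_train) + 0.3 * rng.standard_normal(x_train.shape)  # noisy targets
x_val = np.linspace(-1, 1, 100)
y_val = np.sin(np.pi * x_val)                      # the underlying signal, noise-free

for degree in (1, 3, 11):
    coeffs = np.polyfit(x_train, y_train, degree)  # fit a polynomial model
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    val_mse = np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.4f}, validation MSE {val_mse:.4f}")
```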

8 – 08 Backpropagation Theory V6 Final

Now that we have completed a feedforward pass, received an output, and calculated the error, we are ready to go backwards in order to change our weights, with the goal of decreasing the network error. Going backwards from the output to the input while changing the weights is a process we call backpropagation, which is …
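
As a minimal, hedged sketch of that idea (a single weight vector, a sigmoid output, and squared error are assumed here purely for illustration; the lesson’s network is larger): the backward pass applies the chain rule from the error to the weights, and gradient descent then moves the weights a small step in the direction that decreases the error.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0])           # example input
target = 1.0
W = np.array([0.1, 0.3])            # illustrative weights
learning_rate = 0.1

y = sigmoid(x @ W)                  # feedforward pass
error = 0.5 * (target - y) ** 2     # squared error

# Backpropagation: chain rule from the error back to the weights.
dE_dy = -(target - y)
dy_dz = y * (1.0 - y)               # derivative of the sigmoid
grad_W = dE_dy * dy_dz * x          # dE/dW

W -= learning_rate * grad_W         # gradient-descent weight update
```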

5 – 05 RNN FFNN Reminder B V6 Final

Let’s look at a basic model of an artificial neural network, where we have only a single hidden layer. The inputs are each connected to the neurons in the hidden layer, and the neurons in the hidden layer are each connected to the neurons in the output layer, where each neuron there represents a single …
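
A quick sketch of this architecture with made-up sizes (3 inputs, 4 hidden neurons, 2 output neurons): every input feeds every hidden neuron through one weight matrix, and every hidden neuron feeds every output neuron through a second one.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(3)                # input vector
W_hidden = rng.standard_normal((3, 4))    # input -> hidden weights
W_output = rng.standard_normal((4, 2))    # hidden -> output weights

h = np.tanh(x @ W_hidden)                 # hidden-layer activations (tanh assumed)
y = h @ W_output                          # one value per output neuron
print(y.shape)                            # (2,)
```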

4 – 04 RNN FFNN Reminder A V7 Final

Before we dive into RNNs, let’s remember the process we use in feedforward neural networks. We can have many hidden layers between the inputs and the outputs, but for simplicity, we will start with a single hidden layer. We will remind ourselves why, when, and how it is used. After we have a clear understanding …

3 – 03 RNN Applications V3 Final

To give you an idea of how useful RNNs and LSTMs are, let’s take a sneak peek. The world’s leading tech companies are all using RNNs and LSTMs in their applications. Let’s take a look at some of those. Speech recognition, where a sequence of data samples extracted from an audio signal is continuously mapped …

21 – 23 From RNNs To LSTMs V4 Final

The Long Short-Term Memory cells, or LSTM cells, were proposed in 1997 by Sepp Hochreiter and Jürgen Schmidhuber. The goal of the cell is to overcome the vanishing gradient problem. You will see that it allows certain inputs to be latched, or stored, for long periods of time without forgetting them, as would be the …
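
As a rough preview of the mechanics (the gate equations come later in the lesson; the weight names and sizes below are illustrative only, and biases are omitted): the cell carries a separate memory vector that the forget gate can pass through almost unchanged, which is what lets an input stay latched for many time steps.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, Wf, Wi, Wo, Wc):
    """One LSTM time step (biases omitted for brevity)."""
    z = np.concatenate([x, h_prev])
    f = sigmoid(Wf @ z)              # forget gate: how much old memory to keep
    i = sigmoid(Wi @ z)              # input gate: how much new information to write
    o = sigmoid(Wo @ z)              # output gate
    c_tilde = np.tanh(Wc @ z)        # candidate values to store
    c = f * c_prev + i * c_tilde     # cell state: the "latched" memory
    h = o * np.tanh(c)               # hidden state / output
    return h, c

# Tiny demo with random weights: 3 inputs, 4 hidden units.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
Wf, Wi, Wo, Wc = (rng.standard_normal((n_hid, n_in + n_hid)) for _ in range(4))
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(10):                  # feed a short random sequence
    h, c = lstm_step(rng.standard_normal(n_in), h, c, Wf, Wi, Wo, Wc)
```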

20 – RNN Summary

To summarize what we’ve discussed, we now understand that in RNNs, the current state depends on the inputs as well as on the previous states, with the use of an activation function such as the hyperbolic tangent, the sigmoid, or the ReLU. The current output is a simple linear combination of the current …
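
Written out explicitly (illustrative notation; the lesson’s symbols may differ slightly), with Φ the activation function and W_x, W_s, W_y the input, state, and output weight matrices:

```latex
\bar{s}_t = \Phi\!\left(\bar{x}_t W_x + \bar{s}_{t-1} W_s\right)
\qquad\qquad
\bar{y}_t = \bar{s}_t W_y
```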

2 – 02 RNN History V4 Final

After the first wave of artificial neural networks in the mid-80s, it became clear that feedforward networks are limited, since they are unable to capture temporal dependencies, which, as we said before, are dependencies that change over time. Modeling temporal data is critical in most real-world applications, since natural signals like speech and video …

17 – 19 RNN BPTT A V6 Final

Hopefully, you are now feeling more confident and have a deeper conceptual understanding of RNNs. But how do we train such networks? How can we find a good set of weights that would minimize the error? You will see that our training framework will be similar to what we’ve seen before, with a slight change …
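
The “slight change” is Backpropagation Through Time: because the current state depends on all earlier states, the chain rule accumulates a contribution from every previous time step. The gradient of the error at step t with respect to the state weight matrix can be written as follows (a standard form, not necessarily the lesson’s exact derivation; notation as in the summary equations):

```latex
\frac{\partial E_t}{\partial W_s}
  = \sum_{k=1}^{t}
    \frac{\partial E_t}{\partial \bar{y}_t}\,
    \frac{\partial \bar{y}_t}{\partial \bar{s}_t}\,
    \frac{\partial \bar{s}_t}{\partial \bar{s}_k}\,
    \frac{\partial \bar{s}_k}{\partial W_s}
```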

16 – 18 RNN Example V5 Final

Let’s continue with a conceptual RNN example. Assume that we want to build a sequence detector, and let’s decide that our sequence detector will track letters. So we will actually build a word detector. And more specifically, we want our network to detect the word Udacity. Just the word. So before we start, we need …
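
One simple way to feed letters to such a network (my own sketch of the setup, not the lesson’s exact encoding) is to give each letter a one-hot vector, so the word is presented one letter per time step.

```python
import numpy as np

letters = sorted(set("UDACITY"))              # ['A', 'C', 'D', 'I', 'T', 'U', 'Y']
index = {ch: i for i, ch in enumerate(letters)}

def one_hot(ch):
    v = np.zeros(len(letters))
    v[index[ch]] = 1.0
    return v

sequence = [one_hot(ch) for ch in "UDACITY"]  # one input vector per time step
print(len(sequence), sequence[0])
```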

15 – 17 RNN Unfolded V3 Final

The unfolding-in-time scheme can be confusing, so let’s go back for a bit, look at it closely, and see what’s actually going on there. First, we will take the Elman network and tilt it by 90 degrees counterclockwise, since in RNNs we usually display the flow of information from the bottom to the top. …
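
In code, “unfolding” the network simply means repeating the same state update once per time step, reusing the same two weight matrices at every copy (sizes here are made up for illustration).

```python
import numpy as np

rng = np.random.default_rng(2)
T, n_in, n_state = 5, 3, 4
xs = rng.standard_normal((T, n_in))          # one input vector per time step
W_x = rng.standard_normal((n_in, n_state))   # input -> state weights
W_s = rng.standard_normal((n_state, n_state))  # state -> state weights

s = np.zeros(n_state)                        # initial state
states = []
for t in range(T):                           # each iteration is one unfolded copy
    s = np.tanh(xs[t] @ W_x + s @ W_s)
    states.append(s)
```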

14 – 16 RNN B V4 Final

In feedforward neural networks, the output at any time is a function of the current input and the weights alone. We assume that the inputs are independent of each other; therefore, there is no significance to the sequence, and we actually train the system by randomly drawing input and target pairs. In RNNs, our output …
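
The difference can be written compactly (illustrative notation): a feedforward network maps the current input directly to an output, while an RNN also consumes the state left over from the previous step.

```latex
\text{FFNN:}\quad \bar{y}_t = F\!\left(\bar{x}_t;\, W\right)
\qquad\qquad
\text{RNN:}\quad \bar{y}_t = F\!\left(\bar{x}_t,\ \bar{s}_{t-1};\, W\right)
```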

13 – 14 RNN A V4 Final

We are finally ready to talk about Recurrent Neural Networks, or RNNs for short. Everything we’ve seen so far has prepared us for this moment. We went over the feedforward process, as well as the backpropagation process, in much detail. This will all help you understand the next set of videos. As I mentioned before, …

12 – 12 Backpropagation Example B V6 Final

We now need to calculate the gradient. We will do that one step at a time. In our example, we only have one hidden layer, so the backpropagation process will have two steps. Let’s be more precise now and decide that the gradient calculated for each element ij in the matrix is called delta …
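
A minimal sketch of those two steps, assuming a squared-error loss and a tanh hidden layer (the lesson’s exact activation and sizes may differ): step one computes the gradient for the hidden-to-output weights, and step two propagates the error back through them to get the gradient for the input-to-hidden weights. Each entry of these gradient matrices is the delta for the corresponding weight element.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(2)                   # two inputs
W1 = rng.standard_normal((2, 3))             # input -> hidden weights
W2 = rng.standard_normal((3, 1))             # hidden -> output weights
target = np.array([0.5])

h = np.tanh(x @ W1)                          # hidden activations
y = h @ W2                                   # network output
error = 0.5 * np.sum((target - y) ** 2)      # squared error

# Step 1: gradient with respect to W2 (hidden -> output weights).
dE_dy = -(target - y)
grad_W2 = np.outer(h, dE_dy)

# Step 2: gradient with respect to W1 (input -> hidden weights),
# propagating the error back through W2 and the tanh nonlinearity.
dE_dh = (W2 @ dE_dy) * (1.0 - h ** 2)
grad_W1 = np.outer(x, dE_dh)
```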

10 – 10 Backpropagation Example A V3 Final

Remember our feedforward illustration? We had n inputs, three neurons in the hidden layer, and two outputs. For this example, we will need to simplify things even more and look at a model with two inputs, x_1 and x_2, and a single output, y. We will have a weight matrix, W_1, from the input to …
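
A brief sketch of this simplified setup (the excerpt cuts off before the hidden-layer size and the second weight matrix are named, so three hidden neurons, a matrix called W_2, and a tanh activation are assumed here purely for illustration):

```python
# Hypothetical forward pass for the simplified two-input, one-output model.
import numpy as np

rng = np.random.default_rng(4)
x = np.array([0.2, -0.6])             # the two inputs x_1 and x_2
W_1 = rng.standard_normal((2, 3))     # weight matrix from the inputs to the hidden layer
W_2 = rng.standard_normal((3, 1))     # assumed second matrix, hidden layer to the output

h = np.tanh(x @ W_1)                  # hidden-layer activations
y = (h @ W_2).item()                  # the single output y
print(y)
```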