The last model we’ll introduce here is the recurrent neural network, which is used in natural language processing as well as time series. Recurrent neural networks are, as the name implies, neural networks. Neural networks are models that can be thought of as many regressions stacked in series and in parallel. When I say the regressions are in series, I mean that the output of one regression is fed in as the input to another regression in a chain. When I say parallel, I mean that there are multiple regressions side by side whose outputs are all fed into another layer of regressions.
A recurrent neural network is called recurrent because it takes some of its intermediate output and uses this as part of its input as it processes incoming data. The recurrent neural network takes an input and outputs a prediction. It also outputs an additional signal, an intermediate output, which it feeds back into itself. At the next time step, when the RNN receives another input from the data source, it uses both that input and its previous intermediate output to help it calculate its next prediction. You can think of this signal as a way for the RNN to remember relevant information from the past.
To train a recurrent neural network, you feed it lots of data from the past, then you see how well it predicts the data that came afterward, and you use a process called gradient descent to adjust the coefficients in the network so that its prediction errors shrink.
One commonly used version of the recurrent neural network is made up of one or more long short-term memory cells. The long short-term memory cell, or LSTM cell, consists of several neural networks, each of which performs a specific task. The LSTM cell can be thought of as an assembly line, with specific tasks performed by different assembly line workers. The LSTM cell takes data as input and also takes its own signals that it generated in the previous period. The signal that it takes from the previous period can be thought of as its memory from the past.
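Before looking at what happens inside the LSTM cell, it may help to see the basic recurrence described above written out. The following is a minimal sketch in NumPy, not a production implementation: the names `rnn_step` and `run_rnn`, the tanh activation, the hidden size, and the random untrained weights are all assumptions made for illustration.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b_h, W_y, b_y):
    """One time step of a plain recurrent network (illustrative sketch)."""
    # Intermediate output: combines the current input with the previous
    # intermediate output, which acts as the network's memory.
    h_t = np.tanh(W_x @ x_t + W_h @ h_prev + b_h)
    # Prediction for this time step, computed from the memory.
    y_t = W_y @ h_t + b_y
    return y_t, h_t

def run_rnn(xs, hidden_size=4, seed=0):
    """Run the step function over a sequence with random, untrained weights."""
    rng = np.random.default_rng(seed)
    input_size = xs.shape[1]
    W_x = rng.normal(scale=0.5, size=(hidden_size, input_size))
    W_h = rng.normal(scale=0.5, size=(hidden_size, hidden_size))
    b_h = np.zeros(hidden_size)
    W_y = rng.normal(scale=0.5, size=(1, hidden_size))
    b_y = np.zeros(1)

    h = np.zeros(hidden_size)  # no memory before the first input arrives
    preds = []
    for x_t in xs:
        # The intermediate output h is fed back in at the next time step.
        y_t, h = rnn_step(x_t, h, W_x, W_h, b_h, W_y, b_y)
        preds.append(y_t[0])
    return np.array(preds), h

# Five identical inputs: the predictions can still differ from step to step,
# because the memory carried forward changes.
preds, final_memory = run_rnn(np.ones((5, 1)))
```

Training would adjust the weight matrices with gradient descent; that part is left out here so the feedback loop stays easy to see.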
Inside the LSTM cell, some of the assembly line workers remove some memories. Other assembly line workers add more memories based on incoming data. Still other assembly line workers decide what to output. Remember that the recurrent neural network has two kinds of outputs. One is its prediction for the variable in question. The other is an intermediate output for its future self. You can think of this intermediate output as the memory that it wishes to pass on to the next time period.
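The assembly line picture can be sketched as a single LSTM step in NumPy. This is a hedged illustration using the commonly taught LSTM equations, not necessarily the exact variant meant here: the names `lstm_step` and `init_params`, the parameter dictionary layout, and the random untrained weights are assumptions. In this standard formulation the cell carries two signals forward, a cell state `c` and a hidden state `h`; a prediction would typically be computed from `h` by an extra output layer, which is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(input_size, hidden_size, seed=0):
    """Random, untrained weights for the four 'workers' (hypothetical layout)."""
    rng = np.random.default_rng(seed)
    params = {}
    for name in ("f", "i", "g", "o"):
        params["W_" + name] = rng.normal(
            scale=0.5, size=(hidden_size, hidden_size + input_size)
        )
        params["b_" + name] = np.zeros(hidden_size)
    return params

def lstm_step(x_t, h_prev, c_prev, params):
    """One time step of a standard LSTM cell (illustrative sketch)."""
    # Every worker sees the same thing: the previous output joined with
    # the current input.
    z = np.concatenate([h_prev, x_t])

    # Worker that removes memories (forget gate, values between 0 and 1).
    f = sigmoid(params["W_f"] @ z + params["b_f"])
    # Workers that add memories: how much to admit, and what to admit.
    i = sigmoid(params["W_i"] @ z + params["b_i"])
    g = np.tanh(params["W_g"] @ z + params["b_g"])
    # Worker that decides what to output (output gate).
    o = sigmoid(params["W_o"] @ z + params["b_o"])

    c_t = f * c_prev + i * g   # updated memory passed to the next period
    h_t = o * np.tanh(c_t)     # output passed on and used for prediction
    return h_t, c_t

params = init_params(input_size=2, hidden_size=3)
h, c = lstm_step(np.array([1.0, -1.0]), np.zeros(3), np.zeros(3), params)
```

Each gate here is a single layer rather than a deep network, which is the usual textbook choice; the sigmoid gates act as dials between "keep nothing" and "keep everything".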