9 – Character-Wise RNN

Coming up in this lesson, you’ll implement a character-wise RNN. That is, the network will learn about some text one character at a time, and then generate new text one character at a time. Let’s say we want to generate new Shakespeare plays, for example, “To be or not to be.” We’d pass the sequence into our RNN one character at a time. Once trained, the network will generate new text by predicting the next character based on the characters it’s already seen. So then, to train this network, we want it to predict the next character in the input sequence. In this way, the network will learn to produce a sequence of characters that looks like the original text.

Let’s consider what the architecture of this network will look like. First, let’s unroll the RNN so we can see how this all works as a sequence. Here, we have our input layer, where we’ll pass in characters as one-hot encoded vectors. These vectors go to the hidden layer. The hidden layer is built with LSTM cells, where the hidden state and cell state pass from one cell to the next in the sequence. In practice, we’ll actually use multiple layers of LSTM cells, stacked on top of one another. The output of these cells goes to the output layer. The output layer is used to predict the next character. We want probabilities for each character, the same way you did image classification with the convnet, which means we want a softmax activation on the output. Our targets here will be the input sequence, shifted over by one, so that each character is predicting the next character in the sequence. Again, we’ll use cross-entropy loss for training with gradient descent. You’ll find sketches of this architecture and the training step below.

When this network is trained up, we can pass in one character and get out a probability distribution for the likely next character. Then we can sample from that distribution to get the next character. Then we can take that character, pass it in, and get another one. We keep doing this, and eventually we’ll build up some completely new text. There’s a sketch of this sampling loop below as well.

We’ll be training this network on the text of Anna Karenina, one of my favorite books. It’s in the public domain, so it’s free to use however you want. Also, it’s an amazing novel.
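Here’s a minimal sketch of the architecture described above, assuming PyTorch (the transcript doesn’t commit to a framework); the class name CharRNN and the layer sizes are illustrative, not from the lesson. Note that no softmax is applied inside the model: PyTorch’s cross-entropy loss expects raw scores and applies the softmax internally, so we only apply it explicitly when sampling.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CharRNN(nn.Module):
    """Stacked LSTM layers over one-hot character vectors (illustrative sizes)."""
    def __init__(self, n_chars, n_hidden=256, n_layers=2):
        super().__init__()
        self.n_chars = n_chars
        # Each time step takes a one-hot vector with one slot per character.
        self.lstm = nn.LSTM(n_chars, n_hidden, n_layers, batch_first=True)
        # The output layer maps the hidden state to one score per character.
        self.fc = nn.Linear(n_hidden, n_chars)

    def forward(self, x, hidden=None):
        # Hidden state and cell state pass from one step to the next.
        out, hidden = self.lstm(x, hidden)
        logits = self.fc(out)  # shape: (batch, seq_len, n_chars)
        return logits, hidden
```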
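And a sketch of the training step, under the same assumptions: the targets are just the inputs shifted over by one character, and `batch_ints` is an assumed batch of integer-encoded characters (83 distinct characters is an illustrative count, not a fact from the lesson).

```python
import torch
import torch.nn.functional as F

model = CharRNN(n_chars=83)              # CharRNN from the sketch above
criterion = torch.nn.CrossEntropyLoss()  # softmax + cross entropy in one op
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

def train_step(batch_ints):
    # batch_ints: (batch, seq_len + 1) tensor of integer-encoded characters
    inputs  = batch_ints[:, :-1]         # characters 0 .. n-1
    targets = batch_ints[:, 1:]          # characters 1 .. n, shifted over by one
    x = F.one_hot(inputs, num_classes=model.n_chars).float()
    logits, _ = model(x)
    # Flatten so every time step is scored against its next-character target.
    loss = criterion(logits.reshape(-1, model.n_chars), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```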
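Finally, a sketch of the sampling loop described above: feed in a character, softmax the output into a probability distribution, sample the next character from it, then feed that character back in. The lookup tables `char2int` and `int2char`, mapping between characters and integer IDs, are assumed helpers.

```python
def sample(model, char2int, int2char, prime='The ', length=200):
    """Generate new text one character at a time (char2int/int2char assumed)."""
    model.eval()
    chars = list(prime)
    hidden = None
    with torch.no_grad():
        # Prime the hidden and cell states on the starting characters.
        for ch in prime:
            x = F.one_hot(torch.tensor([[char2int[ch]]]), model.n_chars).float()
            logits, hidden = model(x, hidden)
        for _ in range(length):
            # Softmax turns the last output into a probability distribution...
            probs = F.softmax(logits[0, -1], dim=0)
            # ...and we sample the next character from that distribution.
            idx = torch.multinomial(probs, 1).item()
            chars.append(int2char[idx])
            x = F.one_hot(torch.tensor([[idx]]), model.n_chars).float()
            logits, hidden = model(x, hidden)
    return ''.join(chars)
```

Sampling rather than always taking the most likely character keeps the generated text varied; picking the argmax at every step tends to get stuck in repetitive loops.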
