1 – Introduction

Hello, I’m Jay. In this lesson, we’ll talk about a number of hyperparameters that you may have already come across and that you’ll need to know in your work with deep learning, identifying possible starting values and intuitions for each. A hyperparameter is a variable that we need to set before applying a learning algorithm to a dataset. The challenge with hyperparameters is that there are no magic numbers that work everywhere. The best numbers depend on each task and each dataset. So in addition to talking about starting values, we’ll try to touch on the intuition of why we’d want to nudge a hyperparameter one way or another.

Generally speaking, we can break hyperparameters down into two categories. The first category is optimizer hyperparameters. These are the variables related more to the optimization and training process than to the model itself. These include the learning rate, the minibatch size, and the number of training iterations or epochs. The second category is model hyperparameters. These are the variables that are more involved in the structure of the model. These include the number of layers and hidden units, and model-specific hyperparameters for architectures like RNNs.

In the next video, we’ll start with the single most important hyperparameter of all, the learning rate.
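To make the two categories concrete, here is a minimal Python sketch that groups them into separate containers. The class names, field names, and the helper `iterations_per_epoch` are hypothetical and chosen only for illustration. The default values are placeholders, not recommended starting points, since the best numbers depend on the task and the dataset.

```python
from dataclasses import dataclass, asdict


@dataclass
class OptimizerHyperparameters:
    """Settings tied to the optimization and training process."""
    learning_rate: float = 0.01  # placeholder value, not a recommendation
    minibatch_size: int = 32     # placeholder value
    epochs: int = 10             # placeholder: full passes over the training set


@dataclass
class ModelHyperparameters:
    """Settings tied to the structure of the model."""
    num_layers: int = 2          # placeholder value
    hidden_units: int = 128      # placeholder value


def iterations_per_epoch(num_examples: int, minibatch_size: int) -> int:
    """Minibatch updates needed to see every example once (ceiling division)."""
    return -(-num_examples // minibatch_size)


opt = OptimizerHyperparameters()
model = ModelHyperparameters()

# Assumed dataset size of 1000 examples, purely for illustration.
total_updates = opt.epochs * iterations_per_epoch(1000, opt.minibatch_size)

print(asdict(opt))
print(asdict(model))
print(total_updates)
```

Keeping the two groups apart mirrors the distinction in the lesson: changing a value in the first group alters how training proceeds, while changing a value in the second alters the model being trained.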
