4-6-1-2. Single layer neural networks

Hello everyone and welcome to this lesson on deep learning with PyTorch.

So, in this lesson I’m going to be showing you how we can

build neural networks with PyTorch and train them.

By working through all these notebooks I built,

you’ll be writing the actual code yourself for building these networks.

By the end of the lesson,

you will have built your own state-of-the-art image classifier.

But first we’re going to start with basics,

so how do you build just a simple neural network in PyTorch?

So, as a reminder of how neural networks work,

in general we have some input values, so here x1

and x2, and we multiply them by some weights w and add a bias.

So, this b is the bias; we just multiply it by one, then you sum all these up

and you get some value h. Then we have what's called an activation function.

So, here f of h: passing

this value h through the activation function gets you the output y.

This is the basis of neural networks.

You have these inputs,

you multiply them by some weights, take the sum,

pass it through some activation function and you get an output.

You can stack these up so that the output of these units,

these neurons, goes to another layer, like another set of weights.

So, mathematically, this is what it looks like: y,

our output, is equal to this linear combination of the weights and

the input values, the w's and x's, plus your bias value

b, passed through your activation function f, and you get y.

You could also write it with this sum.

So, the sum of wi times xi, plus b, your bias term.

That gives you y.

So, what’s nice about this is that you can actually think of the x’s,

your input features, your values,

as a vector, and your weights as another vector.

So, your multiplication and sum is the same as a dot or inner product of two vectors.

So, if you consider your input as a vector and your weights as a vector,

if you take the dot product of these two,

then you get your value h and then you pass

h through your activation function and that gets you your output y.

So, now we can start thinking of our weights and our input values as vectors,

and vectors are an instance of a tensor.

So, a tensor is just a generalization of vectors and matrices.

So, you have these regular, structured arrangements of

values, and a tensor with only one dimension is a vector.

So, we just have this single one-dimensional array of values.

So, in this case, the characters T-E-N-S-O-R. A matrix like this is

a two-dimensional tensor, so we have values going in

two directions, from left to right and from top to

bottom, so that we have individual rows and columns.

So, you can do operations across the columns like

along a row or you can do it across the rows like going down a column.

You also have three-dimensional tensors so you can think of

an image like an RGB color image as a three-dimensional tensor.

So, for every pixel,

there's some value for each of the red, green,

and blue channels, so for every individual pixel

in a two-dimensional image,

you have three values.

So, that is a three-dimensional tensor.

Like I said before, tensors are a generalization of

this, so you can actually have four-dimensional,

five-dimensional, six-dimensional tensors, and so on.

It's just that the ones we normally work with are

one-, two-, and three-dimensional tensors.

So, these tensors are the base data structure that you use

in PyTorch and other neural network frameworks.

So, TensorFlow is named after tensors.

So, these are the base data structures that

you'll be using, so you need to understand

them really well to be able to use

pretty much any framework for deep learning.

So, let’s get started. I’m going to show you how to actually create

some tensors and use them to build a simple neural network.

So, first we're going to import PyTorch, so just import torch here.

Here I am creating an activation function,

so this is the sigmoid activation function.

It’s the nice s shape that kind of squeezes the input values between zero and one.

It’s really useful for providing a probability.

So, probabilities are these values that can only be between zero and one.

So, if you want

the output of your neural network to be a probability,

then the sigmoid activation is what you want to use.

So, here I'm going to create some fake data:

I'm generating some input data, weights, and biases, and with these you're actually going to

do the computations to get the output of a simple neural network.

So, here I'm just setting a manual seed.

So, I'm setting the seed for the random number generation that I'll

be using, and here I'm creating features.

So, features are the input features, the input data for your network.

Here we see torch.randn.

So, randn is going to create a tensor of random normal variables,

that is, samples from a normal distribution.

You give it a tuple of the size that you want.

So, in this case I want the features to be a matrix,

a 2-dimensional tensor of one row and five columns.

So, you can think of this as a row vector that has five elements.

For the weights, we’re going to create another matrix

of random normal variables and this time I’m using randn_like.

So, what this does is it takes

another tensor, looks at the shape of that tensor,

and then creates another tensor with the same shape.

So, that's what the like in randn_like means.

So, I’m going to create a tensor of

random normal variables with the same shape as features. So, it gives me my weights.

Then I’m going to create a bias term.

So, this is again just a random normal variable.

Now I’m just creating one value.

So, this is one row and one column.

Here I’m going to leave this exercise up to you.

So, what you’re going to be doing is taking the features, weights,

and the bias tensors and you’re going to

calculate the output of this simple neural network.

So, remember, with the features and weights you want to take

the inner product, or multiply the features by

the weights and sum them up, then add the bias, and then pass that through

the activation function, and from that you should get the output of your network.

So, if you want to see how I did this,

check out my solution notebook or watch

the next video, in which I'll show you my solution for this exercise.

Introduction to Deep Learning with PyTorch

In this notebook, you’ll get introduced to PyTorch, a framework for building and training neural networks. PyTorch in a lot of ways behaves like the arrays you love from Numpy. These Numpy arrays, after all, are just tensors. PyTorch takes these tensors and makes it simple to move them to GPUs for the faster processing needed when training neural networks. It also provides a module that automatically calculates gradients (for backpropagation!) and another module specifically for building neural networks. All together, PyTorch ends up being more coherent with Python and the Numpy/Scipy stack compared to TensorFlow and other frameworks.

Neural Networks

Deep Learning is based on artificial neural networks which have been around in some form since the late 1950s. The networks are built from individual parts approximating neurons, typically called units or simply “neurons.” Each unit has some number of weighted inputs. These weighted inputs are summed together (a linear combination) then passed through an activation function to get the unit’s output.

Mathematically this looks like:

\begin{align} y &= f(w_1 x_1 + w_2 x_2 + b) \\ y &= f\left(\sum_i w_i x_i +b \right) \end{align}

With vectors this is the dot/inner product of two vectors:

h = \begin{bmatrix} x_1 \, x_2 \cdots x_n \end{bmatrix} \cdot \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix}

Tensors

It turns out neural network computations are just a bunch of linear algebra operations on tensors, a generalization of matrices. A vector is a 1-dimensional tensor, a matrix is a 2-dimensional tensor, an array with three indices is a 3-dimensional tensor (RGB color images for example). The fundamental data structure for neural networks is the tensor, and PyTorch (as well as pretty much every other deep learning framework) is built around tensors.
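To make this concrete, here's a quick sketch (shapes chosen purely for illustration) of tensors with one, two, and three dimensions in PyTorch:

import torch

# A 1-dimensional tensor: a vector with 6 elements
vector = torch.randn(6)

# A 2-dimensional tensor: a matrix with 3 rows and 4 columns
matrix = torch.randn(3, 4)

# A 3-dimensional tensor: for example, an RGB image with 3 color channels
# and 28 x 28 pixels
image = torch.randn(3, 28, 28)

print(vector.shape)   # torch.Size([6])
print(matrix.shape)   # torch.Size([3, 4])
print(image.shape)    # torch.Size([3, 28, 28])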

With the basics covered, it’s time to explore how we can use PyTorch to build a simple neural network.

# First, import PyTorch
import torch

def activation(x):
    """ Sigmoid activation function 
    
        Arguments
        ---------
        x: torch.Tensor
    """
    return 1/(1+torch.exp(-x))
  
### Generate some data
torch.manual_seed(7) # Set the random seed so things are predictable

# Features are 5 random normal variables
features = torch.randn((1, 5))
# True weights for our data, random normal variables again
weights = torch.randn_like(features)
# and a true bias term
bias = torch.randn((1, 1))


## Calculate the output of this network using the weights and bias tensors



## Calculate the output of this network using matrix multiplication



### Generate some data
torch.manual_seed(7) # Set the random seed so things are predictable

# Features are 3 random normal variables
features = torch.randn((1, 3))

# Define the size of each layer in our network
n_input = features.shape[1]     # Number of input units, must match number of input features
n_hidden = 2                    # Number of hidden units 
n_output = 1                    # Number of output units

# Weights for inputs to hidden layer
W1 = torch.randn(n_input, n_hidden)
# Weights for hidden layer to output layer
W2 = torch.randn(n_hidden, n_output)

# and bias terms for hidden and output layers
B1 = torch.randn((1, n_hidden))
B2 = torch.randn((1, n_output))




# Exercise: Calculate the output for this multi-layer network using the weights W1 & W2, and the biases, B1 & B2.
## Your solution here




Above I generated data we can use to get the output of our simple network. This is all just random for now; going forward we'll start using normal data. Going through each relevant line:

features = torch.randn((1, 5)) creates a tensor with shape (1, 5), one row and five columns, that contains values randomly distributed according to the normal distribution with a mean of zero and standard deviation of one.

weights = torch.randn_like(features) creates another tensor with the same shape as features, again containing values from a normal distribution.

Finally, bias = torch.randn((1, 1)) creates a single value from a normal distribution.

PyTorch tensors can be added, multiplied, subtracted, etc, just like Numpy arrays. In general, you’ll use PyTorch tensors pretty much the same way you’d use Numpy arrays. They come with some nice benefits though such as GPU acceleration which we’ll get to later. For now, use the generated data to calculate the output of this simple single layer network.

Exercise: Calculate the output of the network with input features features, weights weights, and bias bias. Similar to Numpy, PyTorch has a torch.sum() function, as well as a .sum() method on tensors, for taking sums. Use the function activation defined above as the activation function.

Calculate the output of this network using the weights and bias tensors
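This cell is left as an exercise; a minimal sketch of one possible solution, using the activation function and the features, weights, and bias tensors defined above, could look like:

# Element-wise multiply the features by the weights, sum the result,
# add the bias, and pass everything through the sigmoid activation
y = activation(torch.sum(features * weights) + bias)

# Equivalently, using the .sum() tensor method
y = activation((features * weights).sum() + bias)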

You can do the multiplication and sum in the same operation using a matrix multiplication. In general, you’ll want to use matrix multiplications since they are more efficient and accelerated using modern libraries and high-performance computing on GPUs.

Here, we want to do a matrix multiplication of the features and the weights. For this we can use torch.mm() or torch.matmul(), which is somewhat more complicated and supports broadcasting. If we try to do it with features and weights as they are, we'll get an error.
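For example, a sketch of the failing call, reusing the features and weights tensors from above:

# Both features and weights have shape (1, 5), so this matrix
# multiplication raises a RuntimeError about mismatched sizes
torch.mm(features, weights)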

As you're building neural networks in any framework, you'll see this often. Really often. What's happening here is our tensors aren't the correct shapes to perform a matrix multiplication. Remember that for matrix multiplications, the number of columns in the first tensor must equal the number of rows in the second tensor. Both features and weights have the same shape, (1, 5). This means we need to change the shape of weights to get the matrix multiplication to work.

Note: To see the shape of a tensor called tensor, use tensor.shape. If you're building neural networks, you'll be using this a lot.

There are a few options here: weights.reshape(), weights.resize_(), and weights.view().

  • weights.reshape(a, b) will return a new tensor with the same data as weights and size (a, b); sometimes it returns a view of the original tensor, and sometimes a clone, as in it copies the data to another part of memory.
  • weights.resize_(a, b) returns the same tensor with a different shape. However, if the new shape results in fewer elements than the original tensor, some elements will be removed from the tensor (but not from memory). If the new shape results in more elements than the original tensor, new elements will be uninitialized in memory. Here I should note that the underscore at the end of the method denotes that this method is performed in-place. Here is a great forum thread to read more about in-place operations in PyTorch.
  • weights.view(a, b) will return a new tensor with the same data as weights with size (a, b).

I usually use .view(), but any of the three methods will work for this. So, now we can reshape weights to have five rows and one column with something like weights.view(5, 1).

Exercise: Calculate the output of our little network using matrix multiplication.

Calculate the output of this network using matrix multiplication
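Again left as an exercise; one possible sketch, reshaping weights with .view() as suggested above:

# Reshape weights from (1, 5) to (5, 1) so the matrix multiplication works,
# then add the bias and pass the result through the activation function
y = activation(torch.mm(features, weights.view(5, 1)) + bias)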

Stack them up!

That’s how you can calculate the output for a single neuron. The real power of this algorithm happens when you start stacking these individual units into layers and stacks of layers, into a network of neurons. The output of one layer of neurons becomes the input for the next layer. With multiple input units and output units, we now need to express the weights as a matrix.

The first layer shown on the bottom here is the inputs, understandably called the input layer. The middle layer is called the hidden layer, and the final layer (on the right) is the output layer. We can express this network mathematically with matrices again and use matrix multiplication to get linear combinations for each unit in one operation. For example, the hidden layer ($h_1$ and $h_2$ here) can be calculated as

$\vec{h} = [h_1 \, h_2] = \begin{bmatrix} x_1 \, x_2 \cdots \, x_n \end{bmatrix} \cdot \begin{bmatrix} w_{11} & w_{12} \\ w_{21} &w_{22} \\ \vdots &\vdots \\ w_{n1} &w_{n2} \end{bmatrix}$

The output for this small network is found by treating the hidden layer as inputs for the output unit. The network output is expressed simply

$y = f_2 \! \left(\, f_1 \! \left(\vec{x} \, \mathbf{W_1}\right) \mathbf{W_2} \right)$

Exercise: Calculate the output for this multi-layer network using the weights W1 & W2, and the biases, B1 & B2.

If you did this correctly, you should see the output tensor([[ 0.3171]]).
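A sketch of one possible solution, using the features, W1, W2, B1, and B2 tensors defined in the cell above:

# Output of the hidden layer: linear combination of the inputs plus bias,
# passed through the activation function
h = activation(torch.mm(features, W1) + B1)

# Output of the network: linear combination of the hidden layer plus bias,
# passed through the activation function again
output = activation(torch.mm(h, W2) + B2)
print(output)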

The number of hidden units is a parameter of the network, often called a hyperparameter to differentiate it from the weights and biases parameters. As you'll see later when we discuss training a neural network, the more hidden units a network has, and the more layers, the better able it is to learn from data and make accurate predictions.

Numpy to Torch and back

Special bonus section! PyTorch has a great feature for converting between Numpy arrays and Torch tensors. To create a tensor from a Numpy array, use torch.from_numpy(). To convert a tensor to a Numpy array, use the .numpy() method.

import numpy as np
np.set_printoptions(precision=8)
a = np.random.rand(4,3)
a

torch.set_printoptions(precision=8)
b = torch.from_numpy(a)
b

b.numpy()

# The memory is shared between the Numpy array and Torch tensor, 
# so if you change the values in-place of one object, the other will change as well.

# Multiply PyTorch Tensor by 2, in place
b.mul_(2)

# Numpy array matches new values from Tensor
a