Linear Functions in TensorFlow
The most common operation in neural networks is calculating the linear combination of inputs, weights, and biases. As a reminder, we can write the output of the linear operation as
$$\mathbf{y} = \mathbf{xW} + \mathbf{b}$$
Here, $\mathbf{W}$ is a matrix of the weights connecting two layers. The output $\mathbf{y}$, the input $\mathbf{x}$, and the biases $\mathbf{b}$ are all vectors.
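To make the shapes concrete, here is a minimal sketch of the same operation in TensorFlow, using made-up sizes (3 input features, 2 output labels) and hand-picked values:
x = tf.constant([[1.0, 2.0, 3.0]])  # input, shape (1, 3)
W = tf.constant([[0.1, 0.2],
                 [0.3, 0.4],
                 [0.5, 0.6]])       # weights, shape (3, 2)
b = tf.constant([0.01, 0.02])       # biases, shape (2,)

y = tf.add(tf.matmul(x, W), b)      # output, shape (1, 2)

with tf.Session() as sess:
    print(sess.run(y))              # [[2.21 2.82]]
Note that $\mathbf{x}$ is written as a 1×3 matrix rather than a plain vector, so it can be matrix-multiplied with the 3×2 weight matrix.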
Weights and Bias in TensorFlow
The goal of training a neural network is to modify weights and biases to best predict the labels. In order to use weights and biases, you'll need a Tensor that can be modified. This leaves out tf.placeholder() and tf.constant(), since those Tensors can't be modified. This is where the tf.Variable class comes in.
tf.Variable()
x = tf.Variable(5)
The tf.Variable class creates a tensor with an initial value that can be modified, much like a normal Python variable. This tensor stores its state in the session, so you must initialize the state of the tensor manually. You'll use the tf.global_variables_initializer() function to initialize the state of all the Variable tensors.
Initialization
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
The tf.global_variables_initializer() call returns an operation that will initialize all TensorFlow variables from the graph. You call the operation using a session to initialize all the variables, as shown above. Using the tf.Variable class allows us to change the weights and bias, but an initial value needs to be chosen.
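As a quick sanity check (a minimal sketch, not part of the quiz), you can watch a variable change value with the tf.assign() operation once it's been initialized:
import tensorflow as tf

x = tf.Variable(5)
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    print(sess.run(x))          # 5
    sess.run(tf.assign(x, 10))  # write a new value into the variable
    print(sess.run(x))          # 10
In practice you won't assign weight values by hand; training updates them for you, so the choice you actually make is the initial value.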
Initializing the weights with random numbers from a normal distribution is good practice. Randomizing the weights keeps the model from getting stuck in the same place every time you train it. You'll learn more about this in the next lesson, when you study gradient descent.
Similarly, choosing weights from a normal distribution prevents any one weight from overwhelming other weights. You'll use the tf.truncated_normal() function to generate random numbers from a normal distribution.
tf.truncated_normal()
n_features = 120
n_labels = 5
weights = tf.Variable(tf.truncated_normal((n_features, n_labels)))
The tf.truncated_normal() function returns a tensor with random values from a normal distribution whose magnitude is no more than 2 standard deviations from the mean.
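If you want to convince yourself of that cutoff, here's a small check (a sketch; the exact values vary from run to run) that draws samples and prints their range:
import tensorflow as tf

with tf.Session() as sess:
    samples = sess.run(tf.truncated_normal((10000,)))  # default mean 0.0, stddev 1.0
    print(samples.min(), samples.max())                # both fall within [-2, 2]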
Since the weights are already helping prevent the model from getting stuck, you don’t need to randomize the bias. Let’s use the simplest solution, setting the bias to 0.
tf.zeros()
n_labels = 5
bias = tf.Variable(tf.zeros(n_labels))
The tf.zeros() function returns a tensor with all zeros.
Linear Classifier Quiz
- Open quiz.py.
    - Implement get_weights to return a tf.Variable of weights
    - Implement get_biases to return a tf.Variable of biases
    - Implement xW + b in the linear function
- Open sandbox.py.
    - Initialize all weights
Since xW in xW + b is matrix multiplication, you have to use the tf.matmul() function instead of tf.multiply(). Don't forget that order matters in matrix multiplication, so tf.matmul(a, b) is not the same as tf.matmul(b, a).
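For example, with made-up values (a quick sketch, separate from the quiz code):
import tensorflow as tf

a = tf.constant([[1.0, 2.0, 3.0]])      # shape (1, 3)
b = tf.constant([[1.0], [2.0], [3.0]])  # shape (3, 1)

with tf.Session() as sess:
    print(sess.run(tf.matmul(a, b)))    # shape (1, 1): [[14.]]
    print(sess.run(tf.matmul(b, a)))    # shape (3, 3): a completely different result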
sandbox.py
# Solution is available in the other "sandbox_solution.py" tab
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
from quiz import get_weights, get_biases, linear


def mnist_features_labels(n_labels):
    """
    Gets the first <n> labels from the MNIST dataset
    :param n_labels: Number of labels to use
    :return: Tuple of feature list and label list
    """
    mnist_features = []
    mnist_labels = []

    mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)

    # In order to make quizzes run faster, we're only looking at 10000 images
    for mnist_feature, mnist_label in zip(*mnist.train.next_batch(10000)):
        # Add features and labels if it's for the first <n>th labels
        if mnist_label[:n_labels].any():
            mnist_features.append(mnist_feature)
            mnist_labels.append(mnist_label[:n_labels])

    return mnist_features, mnist_labels


# Number of features (28*28 image is 784 features)
n_features = 784
# Number of labels
n_labels = 3

# Features and Labels
features = tf.placeholder(tf.float32)
labels = tf.placeholder(tf.float32)

# Weights and Biases
w = get_weights(n_features, n_labels)
b = get_biases(n_labels)

# Linear Function xW + b
logits = linear(features, w, b)

# Training data
train_features, train_labels = mnist_features_labels(n_labels)

with tf.Session() as session:
    # TODO: Initialize session variables

    # Softmax
    prediction = tf.nn.softmax(logits)

    # Cross entropy
    # This quantifies how far off the predictions were.
    # You'll learn more about this in future lessons.
    cross_entropy = -tf.reduce_sum(labels * tf.log(prediction), reduction_indices=1)

    # Training loss
    # You'll learn more about this in future lessons.
    loss = tf.reduce_mean(cross_entropy)

    # Rate at which the weights are changed
    # You'll learn more about this in future lessons.
    learning_rate = 0.08

    # Gradient Descent
    # This is the method used to train the model
    # You'll learn more about this in future lessons.
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

    # Run optimizer and get loss
    _, l = session.run(
        [optimizer, loss],
        feed_dict={features: train_features, labels: train_labels})

    # Print loss
    print('Loss: {}'.format(l))
quiz.py
# Solution is available in the other "quiz_solution.py" tab
import tensorflow as tf


def get_weights(n_features, n_labels):
    """
    Return TensorFlow weights
    :param n_features: Number of features
    :param n_labels: Number of labels
    :return: TensorFlow weights
    """
    # TODO: Return weights
    pass


def get_biases(n_labels):
    """
    Return TensorFlow bias
    :param n_labels: Number of labels
    :return: TensorFlow bias
    """
    # TODO: Return biases
    pass


def linear(input, w, b):
    """
    Return linear function in TensorFlow
    :param input: TensorFlow input
    :param w: TensorFlow weights
    :param b: TensorFlow biases
    :return: TensorFlow linear function
    """
    # TODO: Linear Function (xW + b)
    pass
sandbox_solution.py
import tensorflow as tf
# Sandbox Solution
# Note: You can't run code in this tab
from tensorflow.examples.tutorials.mnist import input_data
from quiz import get_weights, get_biases, linear


def mnist_features_labels(n_labels):
    """
    Gets the first <n> labels from the MNIST dataset
    :param n_labels: Number of labels to use
    :return: Tuple of feature list and label list
    """
    mnist_features = []
    mnist_labels = []

    mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)

    # In order to make quizzes run faster, we're only looking at 10000 images
    for mnist_feature, mnist_label in zip(*mnist.train.next_batch(10000)):
        # Add features and labels if it's for the first <n>th labels
        if mnist_label[:n_labels].any():
            mnist_features.append(mnist_feature)
            mnist_labels.append(mnist_label[:n_labels])

    return mnist_features, mnist_labels


# Number of features (28*28 image is 784 features)
n_features = 784
# Number of labels
n_labels = 3

# Features and Labels
features = tf.placeholder(tf.float32)
labels = tf.placeholder(tf.float32)

# Weights and Biases
w = get_weights(n_features, n_labels)
b = get_biases(n_labels)

# Linear Function xW + b
logits = linear(features, w, b)

# Training data
train_features, train_labels = mnist_features_labels(n_labels)

with tf.Session() as session:
    session.run(tf.global_variables_initializer())

    # Softmax
    prediction = tf.nn.softmax(logits)

    # Cross entropy
    # This quantifies how far off the predictions were.
    # You'll learn more about this in future lessons.
    cross_entropy = -tf.reduce_sum(labels * tf.log(prediction), reduction_indices=1)

    # Training loss
    # You'll learn more about this in future lessons.
    loss = tf.reduce_mean(cross_entropy)

    # Rate at which the weights are changed
    # You'll learn more about this in future lessons.
    learning_rate = 0.08

    # Gradient Descent
    # This is the method used to train the model
    # You'll learn more about this in future lessons.
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

    # Run optimizer and get loss
    _, l = session.run(
        [optimizer, loss],
        feed_dict={features: train_features, labels: train_labels})

    # Print loss
    print('Loss: {}'.format(l))
quiz_solution.py
# Quiz Solution
# Note: You can't run code in this tab
import tensorflow as tf


def get_weights(n_features, n_labels):
    """
    Return TensorFlow weights
    :param n_features: Number of features
    :param n_labels: Number of labels
    :return: TensorFlow weights
    """
    # Return weights
    return tf.Variable(tf.truncated_normal((n_features, n_labels)))


def get_biases(n_labels):
    """
    Return TensorFlow bias
    :param n_labels: Number of labels
    :return: TensorFlow bias
    """
    # Return biases
    return tf.Variable(tf.zeros(n_labels))


def linear(input, w, b):
    """
    Return linear function in TensorFlow
    :param input: TensorFlow input
    :param w: TensorFlow weights
    :param b: TensorFlow biases
    :return: TensorFlow linear function
    """
    # Linear Function (xW + b)
    return tf.add(tf.matmul(input, w), b)