
Deep Learning - 2.2 Linear Regression Implementation from Scratch


Let's implement linear regression from scratch using only tensors, including the data pipeline, the model, the loss function, and a minibatch stochastic gradient descent optimizer.

  • Generating Dataset
  • Reading the dataset (minibatches)
  • Initializing Model Parameters
  • Defining the Model
  • Defining the Loss Function
  • Defining the Optimization algorithm
  • Training

 

1. Generating Dataset

Our synthetic dataset will be a feature matrix X, with one row per example and one column per input feature.

Our synthetic labels will be assigned according to the linear model y = Xw + b + ε, with a noise term ε.

We will assume that ε obeys a normal distribution with a mean of 0.
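A minimal sketch of the data generation, assuming PyTorch; the values true_w = [2, -3.4], true_b = 4.2, the 1000 examples, and the noise standard deviation 0.01 are illustrative choices, not fixed by the text.

import torch

def synthetic_data(w, b, num_examples):
    # Generate y = Xw + b + noise, with features drawn from a standard normal.
    X = torch.normal(0, 1, (num_examples, len(w)))
    y = torch.matmul(X, w) + b
    y += torch.normal(0, 0.01, y.shape)  # noise term epsilon (std 0.01 is an assumption)
    return X, y.reshape((-1, 1))

true_w = torch.tensor([2.0, -3.4])  # example ground-truth weights
true_b = 4.2                        # example ground-truth bias
features, labels = synthetic_data(true_w, true_b, 1000)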

 

2. Reading the dataset (minibatches)

Using only this dataset, we will try to recover the true parameters true_w and true_b with our model.

Now we split the dataset into several minibatches.

As we run the iterator, we obtain distinct minibatches successively until the entire dataset has been exhausted.

While the iterator implemented above is good for didactic purposes, it is inefficient because it requires loading all the data in memory. In practice, we would use the built-in iterators provided by a deep learning framework, which are considerably more efficient.
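Here is one way the minibatch iterator could look, assuming PyTorch; batch_size and the features/labels tensors come from the sketch above.

import random
import torch

def data_iter(batch_size, features, labels):
    # Shuffle the example indices and yield one minibatch at a time.
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)  # examples are read in random order
    for i in range(0, num_examples, batch_size):
        batch_indices = torch.tensor(indices[i:min(i + batch_size, num_examples)])
        yield features[batch_indices], labels[batch_indices]

# Usage: iterate once over the dataset in minibatches of 10 examples.
for X, y in data_iter(10, features, labels):
    print(X.shape, y.shape)
    break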

 

3. Initializing Model Parameters

We initialize the weights by sampling random numbers from a normal distribution with mean 0 and a standard deviation of 0.01, and set the bias to 0.
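A sketch of the parameter initialization, assuming PyTorch and the 2-dimensional features used above; requires_grad=True lets autograd track gradients for these tensors.

w = torch.normal(0, 0.01, size=(2, 1), requires_grad=True)  # weights ~ N(0, 0.01^2)
b = torch.zeros(1, requires_grad=True)                      # bias initialized to 0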

 

4. Defining the Model

Since this is a simple linear regression model, we just take the matrix-vector product of the input features X and the model weights w, and add the offset b to each example.
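A sketch of the model, assuming PyTorch; broadcasting takes care of adding the scalar bias b to every example.

def linreg(X, w, b):
    # Linear regression model: matrix-vector product plus bias.
    return torch.matmul(X, w) + b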

 

5. Defining the loss function

Since updating our model requires taking the gradient of our loss function, we need to define the loss function first. Here, y_hat is the prediction and y is the label.
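A sketch of the loss, assuming the squared loss commonly used in the from-scratch implementation; the reshape makes sure y has the same shape as y_hat before subtracting.

def squared_loss(y_hat, y):
    # Squared loss, halved, computed elementwise over the minibatch.
    return (y_hat - y.reshape(y_hat.shape)) ** 2 / 2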

 

6. Defining the optimization algorithm

At each step, using one minibatch randomly drawn from our dataset, we estimate the gradient of the loss with respect to our parameters.

The size of the update step is determined by the learning rate lr. Because our loss is calculated as a sum over the minibatch of examples, we normalize our step size by the batch size (batch_size), so that the magnitude of a step does not depend on our choice of batch size.
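A sketch of minibatch SGD, assuming PyTorch; torch.no_grad() keeps the update itself out of the autograd graph, and the gradients are zeroed after each step so they do not accumulate.

def sgd(params, lr, batch_size):
    # Minibatch stochastic gradient descent update.
    with torch.no_grad():
        for param in params:
            param -= lr * param.grad / batch_size  # normalize the step by the batch size
            param.grad.zero_()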

7. Training

In each iteration (epoch), we grab a minibatch of training examples and pass them through our model to obtain a set of predictions. After calculating the loss, we initiate the backward pass through the network, storing the gradients with respect to each parameter. Finally, we call the optimization algorithm sgd to update the model parameters.
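Putting the pieces together, a sketch of the training loop; the hyperparameters lr = 0.03, num_epochs = 3, and batch_size = 10 are illustrative assumptions.

lr = 0.03
num_epochs = 3
batch_size = 10
net, loss = linreg, squared_loss

for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
        l = loss(net(X, w, b), y)     # minibatch loss
        l.sum().backward()            # backward pass: gradients of the summed loss
        sgd([w, b], lr, batch_size)   # update parameters using their gradients
    with torch.no_grad():
        train_l = loss(net(features, w, b), labels)
        print(f'epoch {epoch + 1}, loss {float(train_l.mean()):f}')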

In deep learning, we are typically less concerned with finding the exact answer and more concerned with finding parameters that lead to highly accurate predictions.
