Now we will implement softmax regression from scratch:
- Initializing the model parameters
- Defining the softmax operation
- Defining the model
- Defining the loss function
- Classification accuracy
- Training
- Prediction
1. Initialize Model Parameters
Each example in the raw dataset is a 28 × 28 image. We will flatten each image into a vector of length 784 and treat each pixel location as just another feature.
Because our dataset has 10 classes, our network will have an output dimension of 10.
So our weights will constitute a 784 × 10 matrix and the biases a 1 × 10 row vector. We initialize our weights W with Gaussian noise and our biases with the initial value 0.
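A minimal sketch of this initialization, following the book's from-scratch version (the standard deviation of 0.01 is the book's choice; adjust as needed):

import torch
from d2l import torch as d2l

num_inputs = 784
num_outputs = 10

# Weights drawn from a zero-mean Gaussian; biases start at 0
W = torch.normal(0, 0.01, size=(num_inputs, num_outputs), requires_grad=True)
b = torch.zeros(num_outputs, requires_grad=True)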
2. Defining the Softmax Operation
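The softmax exponentiates each logit and normalizes each row so it sums to 1: softmax(X)_ij = exp(X_ij) / Σ_k exp(X_ik). A minimal sketch (this naive version omits the usual max-subtraction trick, so very large logits can overflow):

def softmax(X):
    X_exp = torch.exp(X)
    partition = X_exp.sum(1, keepdim=True)
    return X_exp / partition  # Broadcasting divides each row by its sum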
3. Defining the Model
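The model flattens each image into a row vector, applies the affine transformation, and then the softmax; a minimal sketch:

def net(X):
    # Flatten each 28 × 28 image into a length-784 row, then apply W, b and softmax
    return softmax(torch.matmul(X.reshape((-1, W.shape[0])), W) + b)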
4. Defining the Loss Function
Instead of using a Python for loop to iterate over the predictions, we can compute the cross-entropy loss more efficiently: integer array indexing picks out, in a single operation, the probability each example assigns to its true label.
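A minimal sketch of this indexing trick:

def cross_entropy(y_hat, y):
    # y_hat[range(len(y_hat)), y] selects, for each row, the probability
    # assigned to the true class; the loss is its negative log
    return -torch.log(y_hat[range(len(y_hat)), y])

# Toy example: two examples, three classes
y = torch.tensor([0, 2])
y_hat = torch.tensor([[0.1, 0.3, 0.6], [0.3, 0.2, 0.5]])
cross_entropy(y_hat, y)  # tensor([2.3026, 0.6931]), i.e. -log(0.1) and -log(0.5)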
5. Classification Accuracy
Comparing the predicted classes (y_hat.argmax(axis=1)) with y elementwise gives a tensor containing entries of 0 (false) and 1 (true); taking the sum yields the number of correct predictions.
For the toy example above, y_hat.argmax(axis=1) is [2, 2] while y is [0, 2], so only the second prediction matches and the accuracy is 0.5.
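Minimal sketches of the accuracy helper, plus the Accumulator and evaluate_accuracy utilities that the training loop below depends on (following the book's from-scratch version):

def accuracy(y_hat, y):
    """Compute the number of correct predictions."""
    if len(y_hat.shape) > 1 and y_hat.shape[1] > 1:
        y_hat = y_hat.argmax(axis=1)
    cmp = y_hat.type(y.dtype) == y
    return float(cmp.type(y.dtype).sum())

class Accumulator:
    """For accumulating sums over n variables."""
    def __init__(self, n):
        self.data = [0.0] * n

    def add(self, *args):
        self.data = [a + float(b) for a, b in zip(self.data, args)]

    def reset(self):
        self.data = [0.0] * len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]

def evaluate_accuracy(net, data_iter):
    """Compute the accuracy for a model on a dataset."""
    if isinstance(net, torch.nn.Module):
        net.eval()  # Set the model to evaluation mode
    metric = Accumulator(2)  # No. of correct predictions, no. of predictions
    with torch.no_grad():
        for X, y in data_iter:
            metric.add(accuracy(net(X), y), y.numel())
    return metric[0] / metric[1]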
6. Training
First, we define a function to train for one epoch.
def train_epoch_ch3(net, train_iter, loss, updater):
    """The training loop defined in Chapter 3."""
    # Set the model to training mode
    if isinstance(net, torch.nn.Module):
        net.train()
    # Sum of training loss, sum of training accuracy, no. of examples
    metric = Accumulator(3)
    for X, y in train_iter:
        # Compute gradients and update parameters
        y_hat = net(X)
        l = loss(y_hat, y)
        if isinstance(updater, torch.optim.Optimizer):
            # Using PyTorch in-built optimizer & loss criterion
            updater.zero_grad()
            l.mean().backward()
            updater.step()
        else:
            # Using custom built optimizer & loss criterion
            l.sum().backward()
            updater(X.shape[0])
        metric.add(float(l.sum()), accuracy(y_hat, y), y.numel())
    # Return training loss and training accuracy
    return metric[0] / metric[2], metric[1] / metric[2]
The following training function then trains a model net on a training dataset accessed via train_iter for multiple epochs, as specified by num_epochs.
def train_ch3(net, train_iter, test_iter, loss, num_epochs, updater):  #@save
    """Train a model (defined in Chapter 3)."""
    animator = Animator(xlabel='epoch', xlim=[1, num_epochs], ylim=[0.3, 0.9],
                        legend=['train loss', 'train acc', 'test acc'])
    for epoch in range(num_epochs):
        train_metrics = train_epoch_ch3(net, train_iter, loss, updater)
        test_acc = evaluate_accuracy(net, test_iter)
        animator.add(epoch + 1, train_metrics + (test_acc,))
    train_loss, train_acc = train_metrics
    assert train_loss < 0.5, train_loss
    assert train_acc <= 1 and train_acc > 0.7, train_acc
    assert test_acc <= 1 and test_acc > 0.7, test_acc
We use minibatch stochastic gradient descent to optimize the loss function of the model, with a learning rate of 0.1.
lr = 0.1
def updater(batch_size):
    return d2l.sgd([W, b], lr, batch_size)
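For reference, d2l.sgd is the minibatch SGD step defined earlier in the book; a minimal sketch of what it does:

def sgd(params, lr, batch_size):
    """Minibatch stochastic gradient descent."""
    with torch.no_grad():
        for param in params:
            # Average the accumulated gradient over the minibatch
            param -= lr * param.grad / batch_size
            param.grad.zero_()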
Now we train the model for 10 epochs.
num_epochs = 10
train_ch3(net, train_iter, test_iter, cross_entropy, num_epochs, updater)
7. Prediction
Now that training is complete, our model is ready to classify some images.
def predict_ch3(net, test_iter, n=6):  #@save
    """Predict labels (defined in Chapter 3)."""
    for X, y in test_iter:
        break
    trues = d2l.get_fashion_mnist_labels(y)
    preds = d2l.get_fashion_mnist_labels(net(X).argmax(axis=1))
    titles = [true + '\n' + pred for true, pred in zip(trues, preds)]
    d2l.show_images(
        X[0:n].reshape((n, 28, 28)), 1, n, titles=titles[0:n])
predict_ch3(net, test_iter)