
Deep Learning - 3.2~3.3 Implementation of Multilayer Perceptrons


This time we'll build a deeper neural network that classifies the Fashion-MNIST dataset.

For the same classification problem, the implementation of an MLP is the same as that of softmax regression except for additional hidden layers with activation functions.

1. Model

Typically, we choose layer widths in powers of 2, which tend to be computationally efficient because of how memory is allocated and addressed in hardware.

Again, we will represent our parameters with several tensors. Note that for every layer, we must keep track of one weight matrix and one bias vector. As always, we allocate memory for the gradients of the loss with respect to these parameters.
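For reference, here is a minimal sketch of what that bookkeeping looks like if we allocate the parameters by hand for the same 784-256-10 architecture (the concise version below lets nn.Linear handle all of this for us):

import torch
from torch import nn

# One weight matrix and one bias vector per layer; nn.Parameter marks
# each tensor as a leaf whose gradient autograd should allocate and store.
num_inputs, num_hiddens, num_outputs = 784, 256, 10
W1 = nn.Parameter(torch.randn(num_inputs, num_hiddens) * 0.01)
b1 = nn.Parameter(torch.zeros(num_hiddens))
W2 = nn.Parameter(torch.randn(num_hiddens, num_outputs) * 0.01)
b2 = nn.Parameter(torch.zeros(num_outputs))
params = [W1, b1, W2, b2]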

import torch
from torch import nn

# Flatten 28x28 images to 784 features, then 784 -> 256 -> ReLU -> 10.
net = nn.Sequential(nn.Flatten(),
                    nn.Linear(784, 256),
                    nn.ReLU(),
                    nn.Linear(256, 10))

def init_weights(m):
    # Draw each linear layer's weights from N(0, 0.01^2).
    if type(m) == nn.Linear:
        nn.init.normal_(m.weight, std=0.01)

net.apply(init_weights)
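As a quick sanity check (a sketch using a dummy batch), we can confirm the output shape:

X = torch.randn(2, 1, 28, 28)   # two fake 28x28 grayscale images
print(net(X).shape)             # torch.Size([2, 10])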


# Hyperparameters and training objects.
batch_size, lr, num_epochs = 256, 0.1, 10
loss = nn.CrossEntropyLoss(reduction='none')        # per-example losses
trainer = torch.optim.SGD(net.parameters(), lr=lr)
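The book trains this with its d2l helper; a minimal hand-written loop under the same setup might look like this, assuming train_iter comes from d2l.load_data_fashion_mnist as in the earlier chapters:

from d2l import torch as d2l

train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
for epoch in range(num_epochs):
    for X, y in train_iter:
        l = loss(net(X), y)     # per-example losses (reduction='none')
        trainer.zero_grad()
        l.mean().backward()     # reduce to a scalar before backprop
        trainer.step()

Because reduction='none' keeps one loss value per example, we average before calling backward().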

Recall that nn.CrossEntropyLoss fuses the softmax and the cross-entropy loss into a single, numerically stable operation, so the network outputs raw logits.
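We can see this with a small check (a sketch with made-up logits), comparing it against a manual log-softmax:

import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, -1.0]])
target = torch.tensor([0])
manual = -F.log_softmax(logits, dim=1)[0, target]
print(nn.CrossEntropyLoss()(logits, target), manual)  # both ≈ 0.2413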
