[PyTorch] Let's Implement a CNN Model in PyTorch! (Basics + How to Use DataLoader)
MNIST Data - CNN Practice
Today we'll implement a Convolutional Neural Network (CNN from here on) and run it on the MNIST data!
First, a CNN is broadly made up of the following components.
- Convolution: the layer that extracts features from the image
- Max Pooling: condenses the extracted features, passing on only the most important information
- Fully Connected Network: the layer that makes the final prediction from the extracted information
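To make the three components concrete, here is a minimal pure-Python sketch on a toy 5x5 image. The helper names (`conv2d_valid`, `max_pool2d`), the kernel, and the weights are illustrative assumptions, not part of PyTorch or of the model built later in this post. As in most deep learning libraries, the "convolution" here does not flip the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy image: dark on the left, bright on the right (a vertical edge).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 2x2 filter that responds to left-to-right brightness jumps.
kernel = [[-1, 1], [-1, 1]]

features = conv2d_valid(image, kernel)  # 4x4 feature map
pooled = max_pool2d(features, size=2)   # 2x2 summary

# "Fully connected" step: flatten, then take a weighted sum (made-up weights).
flat = [v for row in pooled for v in row]
weights = [0.5, 0.5, 0.5, 0.5]
score = sum(w * v for w, v in zip(weights, flat))

print(features[0])  # [0, 2, 0, 0]
print(pooled)       # [[2, 0], [2, 0]]
print(score)        # 2.0
```

The feature map is non-zero only where the edge sits, pooling keeps that response while shrinking the map, and the final weighted sum turns the summary into a single score.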

Import Library
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.init as init
import torchvision.datasets as datasets
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
import numpy as np
import matplotlib.pyplot as plt
Set Hyperparameter
batch_size = 100
learning_rate = 0.0002
num_epoch = 10
Load MNIST Data
mnist_train = datasets.MNIST(root="../Data/", train=True, transform=transforms.ToTensor(), download=True)
mnist_test = datasets.MNIST(root="../Data/", train=False, transform=transforms.ToTensor(), download=True)
Define Loaders
train_loader = DataLoader(mnist_train, batch_size=batch_size, shuffle=True, num_workers=2, drop_last=True)
test_loader = DataLoader(mnist_test, batch_size=batch_size, shuffle=False, num_workers=2, drop_last=True)
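The number of batches each loader yields follows from the dataset size, `batch_size`, and `drop_last`. The sketch below reproduces that arithmetic in plain Python; `num_batches` is a hypothetical helper written for illustration, not a PyTorch function.

```python
import math


def num_batches(num_samples, batch_size, drop_last):
    """Number of batches a loader yields per pass over the data."""
    if drop_last:
        # An incomplete final batch is discarded.
        return num_samples // batch_size
    # An incomplete final batch is kept as a smaller batch.
    return math.ceil(num_samples / batch_size)


# MNIST has 60,000 training images and 10,000 test images.
print(num_batches(60000, 100, drop_last=True))   # 600
print(num_batches(10000, 100, drop_last=True))   # 100
# With a batch size that does not divide the dataset evenly:
print(num_batches(10000, 64, drop_last=True))    # 156 (16 images skipped)
print(num_batches(10000, 64, drop_last=False))   # 157
```

With `batch_size = 100` nothing is dropped, but with another batch size `drop_last=True` on the test loader would silently leave some test images out of the accuracy calculation.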
Define CNN(Base) Model
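class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        # Feature extractor: three 5x5 convolutions with two 2x2 max-pooling steps
        self.layer = nn.Sequential(
            nn.Conv2d(1, 16, 5),    # 1x28x28  -> 16x24x24
            nn.ReLU(),
            nn.Conv2d(16, 32, 5),   # 16x24x24 -> 32x20x20
            nn.ReLU(),
            nn.MaxPool2d(2, 2),     # 32x20x20 -> 32x10x10
            nn.Conv2d(32, 64, 5),   # 32x10x10 -> 64x6x6
            nn.ReLU(),
            nn.MaxPool2d(2, 2)      # 64x6x6   -> 64x3x3
        )
        # Classifier: flattened features -> 10 digit classes
        self.fc_layer = nn.Sequential(
            nn.Linear(64 * 3 * 3, 100),
            nn.ReLU(),
            nn.Linear(100, 10)
        )

    def forward(self, x):
        out = self.layer(x)
        # Flatten using the actual batch dimension instead of the global batch_size,
        # so the model also works when a batch has a different size
        out = out.view(out.size(0), -1)
        out = self.fc_layer(out)
        return out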
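The `64 * 3 * 3` input size of the first linear layer can be checked by tracing the spatial size of a 28x28 MNIST image through the layers. This is a small sketch using the standard output-size formula for convolution and pooling; `conv_out` and `pool_out` are hypothetical helper names.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Output width/height of a pooling layer along one dimension."""
    return (size - kernel) // stride + 1


size = 28                      # MNIST images are 28x28
size = conv_out(size, 5)       # Conv2d(1, 16, 5)   -> 24
size = conv_out(size, 5)       # Conv2d(16, 32, 5)  -> 20
size = pool_out(size, 2, 2)    # MaxPool2d(2, 2)    -> 10
size = conv_out(size, 5)       # Conv2d(32, 64, 5)  -> 6
size = pool_out(size, 2, 2)    # MaxPool2d(2, 2)    -> 3

flat_features = 64 * size * size
print(size, flat_features)     # 3 576
```

If the kernel sizes or the number of pooling layers change, this number changes too, and `nn.Linear` must be updated to match.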
Define Device & Model
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = CNN().to(device)
Define Loss & Optimizer
loss_func = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
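🏋️ Train Model
Now let's start training. First, we call model.train() to put the model in training mode.
🧠 What is model.train()?
model.train() switches a PyTorch model to training mode.
It matters for layers such as Dropout and BatchNorm, which behave differently during training than during inference. For example:
- Dropout randomly turns off some neurons during training to prevent overfitting.
- Batch Normalization normalizes activations using the statistics of the current batch.
A freshly created module is already in training mode, so the call matters most after model.eval() has been used, for example when evaluating between epochs. If training then continues without calling model.train() again, these training-specific behaviors stay off and performance can degrade noticeably. The CNN above contains neither layer, so the call does not change its results, but calling it at the start of every epoch is a good habit.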
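The mode switch can be observed directly through the `training` flag that every `nn.Module` carries. The sketch below uses a hypothetical toy network (not the CNN from this post) and an arbitrary seed, and also shows what Dropout does to its input while in training mode.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for repeatability

# Hypothetical toy network containing a Dropout layer
toy = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

default_mode = toy.training       # modules start in training mode
toy.eval()
eval_flags = (toy.training, toy[1].training)  # the flag propagates to children
toy.train()
back_to_train = toy.training

# In training mode, Dropout zeroes each element with probability p
# and scales the survivors by 1 / (1 - p).
drop = nn.Dropout(p=0.5)
y = drop(torch.ones(1000, 4))
zero_fraction = (y == 0).float().mean().item()
distinct_values = sorted(y.unique().tolist())

print(default_mode, eval_flags, back_to_train)
print(round(zero_fraction, 2), distinct_values)
```

Roughly half of the values come out as 0 and the rest as 2.0, which is the random masking that should only be active while training.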
loss_arr = []
for i in range(num_epoch):
    model.train()  # switch to training mode
    for j, (image, label) in enumerate(train_loader):
        x = image.to(device)
        y = label.to(device)

        optimizer.zero_grad()        # clear gradients from the previous step
        output = model(x)            # forward pass
        loss = loss_func(output, y)
        loss.backward()              # backpropagation
        optimizer.step()             # update the weights

        # With 600 batches per epoch, this fires only at step 0 of each epoch
        if j % 1000 == 0:
            print(f"Epoch {i+1}, Step {j}: Loss = {loss.item():.4f}")
            loss_arr.append(loss.cpu().detach().numpy())
🧪 Test Model
With training complete, we feed the test data into the model and evaluate its accuracy. Two settings should always be applied at this point.
1️⃣ What is model.eval()?
model.eval()
- Switches the model to evaluation mode.
- Makes layers such as Dropout and BatchNorm behave differently than they do during training.
- At prediction time all neurons are used (Dropout is disabled), and BatchNorm uses its stored running mean and variance instead of batch statistics.
In short, training and inference run in different modes, so model.eval() must be called before evaluation to measure performance accurately.
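A quick way to see these effects is to run standalone Dropout and BatchNorm layers in evaluation mode. This is a minimal sketch with hypothetical layers and random input, separate from the CNN in this post.

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
bn = nn.BatchNorm1d(3)
x = torch.randn(8, 3)

# Evaluation mode: Dropout passes the input through unchanged,
# and BatchNorm reads its running statistics without updating them.
drop.eval()
bn.eval()
dropout_is_identity = torch.equal(drop(x), x)
mean_before = bn.running_mean.clone()
_ = bn(x)
stats_frozen_in_eval = torch.equal(bn.running_mean, mean_before)

# Training mode: the same forward pass updates the running statistics.
bn.train()
_ = bn(x)
stats_updated_in_train = not torch.equal(bn.running_mean, mean_before)

print(dropout_is_identity, stats_frozen_in_eval, stats_updated_in_train)
```

All three values should print as `True`, which is why evaluating without `model.eval()` can give results that vary from run to run for models that contain these layers.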
2️⃣ What is with torch.no_grad()?
with torch.no_grad():
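- Turns off gradient tracking in PyTorch's Autograd engine, so no gradients are computed inside the block.
- Testing and inference do not need gradients, so skipping them is more efficient in both memory and speed.
- In particular, it saves GPU memory, because the intermediate results that a backward pass would need are not stored.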
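The effect is visible on the output tensors themselves. The sketch below runs one hypothetical linear layer twice, once normally and once inside `torch.no_grad()`, and compares the results.

```python
import torch
import torch.nn as nn

layer = nn.Linear(3, 2)   # hypothetical layer, used only for this demo
x = torch.randn(5, 3)

out_tracked = layer(x)            # recorded in the autograd graph
with torch.no_grad():
    out_untracked = layer(x)      # computed without recording a graph

print(out_tracked.requires_grad)                    # True
print(out_untracked.requires_grad)                  # False
print(out_untracked.grad_fn)                        # None
print(torch.allclose(out_tracked, out_untracked))   # True
```

The values are identical in both cases. Only the bookkeeping for backpropagation is skipped, so `torch.no_grad()` changes cost, not predictions, and it does not replace `model.eval()`.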
✅ Full Test Code
correct = 0
total = 0

model.eval()  # switch to evaluation mode
with torch.no_grad():  # disable gradient tracking
    for image, label in test_loader:
        x = image.to(device)
        y = label.to(device)

        output = model(x)
        # Index of the largest output along dim 1 is the predicted class
        _, output_index = torch.max(output, 1)

        total += label.size(0)
        correct += (output_index == y).sum().item()

print("Accuracy of Test Data: {:.2f}%".format(100 * correct / total))
Wrapping Up
In this post we built a CNN model on the MNIST dataset and went through the whole process, from training to testing.
In particular, it is very important to clearly understand the meaning and role of model.train(), model.eval(), and torch.no_grad(), which are called depending on whether a PyTorch model is being trained or evaluated.
Once you are comfortable with this basic flow, you will be able to run experiments much more efficiently with more complex models later on.
If you have any questions, please leave a comment.
Thanks for reading this long post!