Machine Learning
Pytorch Tutorial
TA: 曾元 (Yuan Tseng)
2022.02.18
Outline
● Background: Prerequisites & What is Pytorch?
● Training & Testing Neural Networks in Pytorch
● Dataset & Dataloader
● Tensors
● torch.nn: Models, Loss Functions
● torch.optim: Optimization
● Save/load models
Prerequisites
● We assume you are already familiar with…
1. Python3
■ if-else, loop, function, file IO, class, ...
■ refs: link1, link2, link3
2. Deep Learning Basics
■ Prof. Lee’s 1st & 2nd lecture videos from last year
■ ref: link1, link2
Some knowledge of NumPy will also be useful!
What is PyTorch?
● A machine learning framework in Python.
● Two main features:
○ N-dimensional Tensor computation (like NumPy) on GPUs
○ Automatic differentiation for training deep neural networks
Training Neural Networks
● Define Neural Network
● Loss Function
● Optimization Algorithm
More info about the training process in last year's lecture video.
Training & Testing Neural Networks
Training Validation Testing
Guide for training/validation/testing can be found here.
Training & Testing Neural Networks - in Pytorch
Step 1.
Load Data (torch.utils.data.Dataset & torch.utils.data.DataLoader)
Training Validation Testing
Dataset & Dataloader
● Dataset: stores data samples and expected values
● Dataloader: groups data in batches, enables multiprocessing
● dataset = MyDataset(file)
● dataloader = DataLoader(dataset, batch_size, shuffle=True)
(shuffle: True for training, False for testing)
More info about batches and shuffling here.
Dataset & Dataloader
from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    def __init__(self, file):
        self.data = ...              # Read data & preprocess
    def __getitem__(self, index):
        return self.data[index]      # Returns one sample at a time
    def __len__(self):
        return len(self.data)        # Returns the size of the dataset
Dataset & Dataloader
dataset = MyDataset(file)
dataloader = DataLoader(dataset, batch_size=5, shuffle=False)
With batch_size=5, the DataLoader calls the Dataset's __getitem__(0) through __getitem__(4) and groups the five returned samples into one mini-batch.
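The behavior above can be reproduced with a toy dataset; the ten (feature, label) pairs here are made-up values purely for illustration:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    def __init__(self):
        # ten made-up (feature, label) pairs
        self.x = torch.arange(10, dtype=torch.float32).unsqueeze(1)
        self.y = self.x * 2

    def __getitem__(self, index):
        return self.x[index], self.y[index]

    def __len__(self):
        return len(self.x)

dataset = ToyDataset()
dataloader = DataLoader(dataset, batch_size=5, shuffle=False)
xb, yb = next(iter(dataloader))  # first mini-batch: samples 0-4 stacked together
print(xb.shape)                  # torch.Size([5, 1])
```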
Tensors
● High-dimensional matrices (arrays)
○ 1-D tensor: e.g. audio
○ 2-D tensor: e.g. black & white images
○ 3-D tensor: e.g. RGB images
Tensors – Shape of Tensors
● Check with .shape
○ 1-D: shape (5,), with dim 0
○ 2-D: shape (3, 5), with dim 0, dim 1
○ 3-D: shape (4, 5, 3), with dim 0, dim 1, dim 2
Note: dim in PyTorch == axis in NumPy
Tensors – Creating Tensors
● Directly from data (list or numpy.ndarray)
x = torch.tensor([[1., -1.], [-1., 1.]])            # tensor([[ 1., -1.], [-1.,  1.]])
x = torch.from_numpy(np.array([[1, -1], [-1, 1]]))
● Tensor of constant zeros & ones (the argument specifies the shape)
x = torch.zeros([2, 2])    # tensor([[0., 0.], [0., 0.]])
x = torch.ones([1, 2, 5])  # tensor([[[1., 1., 1., 1., 1.], [1., 1., 1., 1., 1.]]])
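A quick runnable check of these constructors (shapes and dtypes only; the integer dtype produced via NumPy can vary by platform, so it is not asserted):

```python
import numpy as np
import torch

a = torch.tensor([[1., -1.], [-1., 1.]])            # from a Python list
b = torch.from_numpy(np.array([[1, -1], [-1, 1]]))  # shares memory with the array
z = torch.zeros([2, 2])
o = torch.ones([1, 2, 5])

print(a.dtype)   # torch.float32
print(o.shape)   # torch.Size([1, 2, 5])
```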
Tensors – Common Operations
Common arithmetic functions are supported, such as:
● Addition: z = x + y
● Subtraction: z = x - y
● Power: y = x.pow(2)
● Summation: y = x.sum()
● Mean: y = x.mean()
Tensors – Common Operations
● Transpose: transpose two specified dimensions
>>> x = torch.zeros([2, 3])
>>> x.shape
torch.Size([2, 3])
>>> x = x.transpose(0, 1)
>>> x.shape
torch.Size([3, 2])
Tensors – Common Operations
● Squeeze: remove the specified dimension with length = 1
>>> x = torch.zeros([1, 2, 3])
>>> x.shape
torch.Size([1, 2, 3])
>>> x = x.squeeze(0)
>>> x.shape
torch.Size([2, 3])
Tensors – Common Operations
● Unsqueeze: expand a new dimension
>>> x = torch.zeros([2, 3])
>>> x.shape
torch.Size([2, 3])
>>> x = x.unsqueeze(1)
>>> x.shape
torch.Size([2, 1, 3])
Tensors – Common Operations
● Cat: concatenate multiple tensors along a dimension
>>> x = torch.zeros([2, 1, 3])
>>> y = torch.zeros([2, 3, 3])
>>> z = torch.zeros([2, 2, 3])
>>> w = torch.cat([x, y, z], dim=1)
>>> w.shape
torch.Size([2, 6, 3])
more operators: [Link]
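The shape effects of transpose, squeeze, unsqueeze, and cat from the slides above can be verified in a few lines:

```python
import torch

x = torch.zeros([2, 3])
assert x.transpose(0, 1).shape == torch.Size([3, 2])   # swap dims 0 and 1

x = torch.zeros([1, 2, 3])
assert x.squeeze(0).shape == torch.Size([2, 3])        # drop length-1 dim 0

x = torch.zeros([2, 3])
assert x.unsqueeze(1).shape == torch.Size([2, 1, 3])   # insert a new dim 1

w = torch.cat([torch.zeros([2, 1, 3]),
               torch.zeros([2, 3, 3]),
               torch.zeros([2, 2, 3])], dim=1)         # 1 + 3 + 2 = 6 along dim 1
assert w.shape == torch.Size([2, 6, 3])
print("all shape checks passed")
```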
Tensors – Data Type
● Using different data types for model and data will cause errors.
Data type                  dtype          tensor
32-bit floating point      torch.float    torch.FloatTensor
64-bit integer (signed)    torch.long     torch.LongTensor
see official documentation for more information on data types.
Tensors – PyTorch v.s. NumPy
● Similar attributes
PyTorch NumPy
x.shape                  x.shape
x.dtype                  x.dtype
ref: [Link]
Tensors – PyTorch v.s. NumPy
● Many functions have the same names as well
PyTorch NumPy
x.reshape / x.view       x.reshape
x.squeeze()              x.squeeze()
x.unsqueeze(1)           np.expand_dims(x, 1)
ref: [Link]
Tensors – Device
● Tensors & modules will be computed with CPU by default
Use .to() to move tensors to appropriate devices.
● CPU
x = x.to('cpu')
● GPU
x = x.to('cuda')
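A common device-agnostic pattern, which falls back to the CPU when no NVIDIA GPU is present (a sketch, not part of the slides):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.ones(2, 3).to(device)   # move the tensor to the chosen device
print(x.device)
```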
Tensors – Device (GPU)
● Check if your computer has NVIDIA GPU
torch.cuda.is_available()
● Multiple GPUs: specify 'cuda:0', 'cuda:1', 'cuda:2', ...
● Why use GPUs?
○ Parallel computing with more cores for arithmetic calculations
○ See What is a GPU and do you need one in deep learning?
Tensors – Gradient Calculation
>>> x = torch.tensor([[1., 0.], [-1., 1.]], requires_grad=True)
>>> z = x.pow(2).sum()
>>> z.backward()
>>> x.grad
tensor([[ 2.,  0.],
        [-2.,  2.]])
See here to learn about gradient calculation.
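Since z = Σ x², the gradient is ∂z/∂x = 2x, which can be confirmed directly:

```python
import torch

x = torch.tensor([[1., 0.], [-1., 1.]], requires_grad=True)
z = x.pow(2).sum()   # z = sum of squared entries of x
z.backward()         # backpropagate to fill x.grad
print(x.grad)        # equals 2 * x
```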
Training & Testing Neural Networks – in Pytorch
Step 2.
Define Neural Network (torch.nn)
torch.nn – Network Layers
● Linear Layer (Fully-connected Layer)
nn.Linear(in_features, out_features)
e.g. nn.Linear(32, 64) maps an input tensor of shape (*, 32) to an output tensor of shape (*, 64);
* can be any leading shape (but the last dimension must be 32),
e.g. (10, 32), (10, 5, 32), (1, 1, 3, 32), ...
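Checking that shape behavior: only the last dimension changes, and any leading dimensions pass through unchanged:

```python
import torch
import torch.nn as nn

layer = nn.Linear(32, 64)
x = torch.zeros(10, 5, 32)   # any leading shape; last dim must equal in_features
y = layer(x)
print(y.shape)               # torch.Size([10, 5, 64])
```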
torch.nn – Network Layers
● Linear Layer (Fully-connected Layer)
ref: last year's lecture video
torch.nn – Neural Network Layers
● Linear Layer (Fully-connected Layer)
Each output y1 ... y64 is a weighted combination of all inputs x1 ... x32 plus a bias:
W x + b = y, where W has shape (64×32).
torch.nn – Network Parameters
● Linear Layer (Fully-connected Layer): W x + b = y, with W of shape (64×32)
>>> layer = torch.nn.Linear(32, 64)
>>> layer.weight.shape
torch.Size([64, 32])
>>> layer.bias.shape
torch.Size([64])
torch.nn – Non-Linear Activation Functions
● Sigmoid Activation
nn.Sigmoid()
● ReLU Activation
nn.ReLU()
See here to learn about why we need activation functions.
torch.nn – Build your own neural network
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        # Initialize your model & define layers
        self.net = nn.Sequential(
            nn.Linear(10, 32),
            nn.Sigmoid(),
            nn.Linear(32, 1)
        )

    def forward(self, x):
        # Compute output of your NN
        return self.net(x)
torch.nn – Build your own neural network
The nn.Sequential version is equivalent to defining and calling each layer explicitly:

import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.layer1 = nn.Linear(10, 32)
        self.layer2 = nn.Sigmoid()
        self.layer3 = nn.Linear(32, 1)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        return out
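Either definition is used the same way; a forward pass on a made-up batch of 4 samples with 10 features each:

```python
import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(10, 32),
            nn.Sigmoid(),
            nn.Linear(32, 1),
        )

    def forward(self, x):
        return self.net(x)

model = MyModel()
x = torch.zeros(4, 10)   # batch of 4 samples, 10 features each
y = model(x)
print(y.shape)           # torch.Size([4, 1])
```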
Training & Testing Neural Networks – in Pytorch
Step 3.
Loss Function (torch.nn)
torch.nn – Loss Functions
● Mean Squared Error (for regression tasks)
criterion = nn.MSELoss()
● Cross Entropy (for classification tasks)
criterion = nn.CrossEntropyLoss()
● loss = criterion(model_output, expected_value)
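A small worked example with MSELoss; the prediction and target values are made up:

```python
import torch
import torch.nn as nn

criterion = nn.MSELoss()
model_output = torch.tensor([1.0, 2.0, 3.0])
expected_value = torch.tensor([1.0, 2.0, 5.0])
loss = criterion(model_output, expected_value)
print(loss.item())   # mean squared error: (0 + 0 + 4) / 3
```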
Training & Testing Neural Networks – in Pytorch
Step 4.
Optimization Algorithm (torch.optim)
torch.optim
● Gradient-based optimization algorithms that adjust network
parameters to reduce error. (See Adaptive Learning Rate lecture video)
● E.g. Stochastic Gradient Descent (SGD)
torch.optim.SGD(model.parameters(), lr, momentum=0)
torch.optim
optimizer = torch.optim.SGD(model.parameters(), lr, momentum=0)
● For every batch of data:
1. Call optimizer.zero_grad() to reset gradients of model parameters.
2. Call loss.backward() to backpropagate gradients of prediction loss.
3. Call optimizer.step() to adjust model parameters.
See official documentation for more optimization algorithms.
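The three calls in order, on a made-up regression batch (a sketch with arbitrary data, not the course's actual task):

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0)

x = torch.randn(8, 3)   # one made-up batch of data
y = torch.randn(8, 1)

optimizer.zero_grad()            # 1. reset gradients of model parameters
loss = criterion(model(x), y)
loss.backward()                  # 2. backpropagate gradients of the loss
optimizer.step()                 # 3. adjust model parameters
print(loss.item())
```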
Training & Testing Neural Networks – in Pytorch
Step 5. Entire Procedure
Neural Network Training Setup
dataset = MyDataset(file)                             # read data via MyDataset
tr_set = DataLoader(dataset, 16, shuffle=True)        # put dataset into Dataloader
model = MyModel().to(device)                          # construct model and move to device (cpu/cuda)
criterion = nn.MSELoss()                              # set loss function
optimizer = torch.optim.SGD(model.parameters(), 0.1)  # set optimizer
Neural Network Training Loop
for epoch in range(n_epochs):                  # iterate n_epochs
    model.train()                              # set model to train mode
    for x, y in tr_set:                        # iterate through the dataloader
        optimizer.zero_grad()                  # set gradient to zero
        x, y = x.to(device), y.to(device)      # move data to device (cpu/cuda)
        pred = model(x)                        # forward pass (compute output)
        loss = criterion(pred, y)              # compute loss
        loss.backward()                        # compute gradient (backpropagation)
        optimizer.step()                       # update model with optimizer
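Putting the setup and training loop together on synthetic data (y ≈ 3x, made up for illustration; the homework uses MyDataset/MyModel instead):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"

# synthetic regression data: y = 3x + small noise
xs = torch.randn(64, 1)
ys = 3 * xs + 0.01 * torch.randn(64, 1)
tr_set = DataLoader(TensorDataset(xs, ys), batch_size=16, shuffle=True)

model = nn.Linear(1, 1).to(device)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(20):
    model.train()
    for x, y in tr_set:
        optimizer.zero_grad()
        x, y = x.to(device), y.to(device)
        pred = model(x)
        loss = criterion(pred, y)
        loss.backward()
        optimizer.step()

print(model.weight.item())   # approaches the true slope 3
```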
Neural Network Validation Loop
model.eval()                                   # set model to evaluation mode
total_loss = 0
for x, y in dv_set:                            # iterate through the dataloader
    x, y = x.to(device), y.to(device)          # move data to device (cpu/cuda)
    with torch.no_grad():                      # disable gradient calculation
        pred = model(x)                        # forward pass (compute output)
        loss = criterion(pred, y)              # compute loss
    total_loss += loss.cpu().item() * len(x)   # accumulate loss
avg_loss = total_loss / len(dv_set.dataset)    # compute averaged loss
Neural Network Testing Loop
model.eval()                                   # set model to evaluation mode
preds = []
for x in tt_set:                               # iterate through the dataloader
    x = x.to(device)                           # move data to device (cpu/cuda)
    with torch.no_grad():                      # disable gradient calculation
        pred = model(x)                        # forward pass (compute output)
        preds.append(pred.cpu())               # collect prediction
Notice - model.eval(), torch.no_grad()
● model.eval()
Changes behaviour of some model layers, such as dropout and batch
normalization.
● with torch.no_grad()
Prevents calculations from being added into gradient computation
graph. Usually used to prevent accidental training on validation/testing
data.
Save/Load Trained Models
● Save
torch.save(model.state_dict(), path)
● Load
ckpt = torch.load(path)
model.load_state_dict(ckpt)
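A save/load round trip (the checkpoint path here is a temporary file, chosen just for the example):

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)
path = os.path.join(tempfile.mkdtemp(), "model.ckpt")

torch.save(model.state_dict(), path)   # save parameters only (not the whole object)

model2 = nn.Linear(4, 2)               # a fresh model with the same architecture
ckpt = torch.load(path)
model2.load_state_dict(ckpt)

assert torch.equal(model.weight, model2.weight)   # parameters now match
print("checkpoint restored")
```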
More About PyTorch
● torchaudio
○ speech/audio processing
● torchtext
○ natural language processing
● torchvision
○ computer vision
● skorch
○ scikit-learn + PyTorch
More About PyTorch
● Useful github repositories using PyTorch
○ Huggingface Transformers (transformer models: BERT, GPT, ...)
○ Fairseq (sequence modeling for NLP & speech)
○ ESPnet (speech recognition, translation, synthesis, ...)
○ Most implementations of recent deep learning papers
○ ...
References
● Machine Learning 2021 Spring Pytorch Tutorial
● Official Pytorch Tutorials
● [Link]
Any questions?