CSE 5111: Deep Learning
Lecture 4: Building Our First Deep Neural Network [Class 9,10]
Master of Science in Computer Science & Engineering
Department of Computer Science and Engineering
Comilla University
Instructor: Mahmudul Hasan, PhD
Reference Text: Deep Learning (GBC) - Chapter 6.1, 6.3
Slide 2: Recap: The Complete Puzzle So Far
What We've Learned
We now have all the mathematical pieces:
1. Gradient Descent: The optimization algorithm that minimizes loss.
2. Backpropagation: The efficient algorithm for calculating gradients.
3. PyTorch Autograd: Automates backpropagation for us.
Today's Mission: Assemble these pieces to build our first Deep Neural Network
(DNN) and understand why depth is powerful.
Slide 3: The Limitation of Linear Models
Why Go Deep? The Need for Non-Linearity
Problem: Single-layer networks (like linear regression) can only learn linear relationships.
Real-World Example: The XOR Problem
• Can you separate True/False with a single straight line?
• Input: (0,0) → Output: 0
• Input: (0,1) → Output: 1
• Input: (1,0) → Output: 1
• Input: (1,1) → Output: 0
Answer: No! This is a fundamental limitation of linear models.
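A tiny hand-constructed check (not from the slides, weights picked by hand for illustration) shows why a single non-linearity fixes this: two ReLU units compute XOR exactly, even though no single line can separate it.

```python
import torch

# XOR inputs as rows: (0,0), (0,1), (1,0), (1,1)
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
s = X.sum(dim=1)  # x1 + x2 for each row: [0, 1, 1, 2]

# Hand-picked two-unit ReLU network: XOR(x1,x2) = ReLU(x1+x2) - 2*ReLU(x1+x2-1)
y = torch.relu(s) - 2 * torch.relu(s - 1)
print(y)  # tensor([0., 1., 1., 0.]) - the XOR truth table
```

The second ReLU unit "bends" the function downward once both inputs are on, which is exactly the kink a linear model cannot produce.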
Slide 4: The Solution: Adding Layers & Non-Linearity
Building Complexity Step by Step
Think of it like this:
• Layer 1: Creates simple decision boundaries (straight lines)
• Layer 2: Combines these lines to create more complex shapes
• Layer 3: Combines those shapes to create even more complex regions
Analogy: Building with LEGO
• Single layer = Basic bricks
• Multiple layers = Complex structures from simple bricks
• Activation functions = The connectors that hold everything together
Slide 5: Activation Functions: The "Spark" of Neural Networks
What Are Activation Functions?
Activation functions determine whether a neuron should "fire" or not. They introduce non-linearity!
Without activation functions:
• Deep network = Just multiple linear transformations
• Multiple layers = Equivalent to a single layer
With activation functions:
• Deep network = Can learn complex, non-linear relationships
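The collapse is easy to verify numerically; a minimal sketch (the weight matrices below are random, purely for illustration):

```python
import torch

torch.manual_seed(0)
W1 = torch.randn(3, 4)  # "layer 1" weights
W2 = torch.randn(2, 3)  # "layer 2" weights
x = torch.randn(4)

deep = W2 @ (W1 @ x)      # two linear layers, no activation in between
shallow = (W2 @ W1) @ x   # one linear layer whose weight is W2 @ W1
print(torch.allclose(deep, shallow))  # True - the "deep" stack is really one layer
```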
Slide 6: Popular Activation Functions
Meet the Activation Function Family
1. Sigmoid
• σ(x) = 1/(1 + e⁻ˣ)
• Range: (0, 1)
• Problem: Vanishing gradients, not zero-centered
2. Tanh
• tanh(x) = (eˣ - e⁻ˣ)/(eˣ + e⁻ˣ)
• Range: (-1, 1)
• Better than sigmoid (zero-centered)
3. ReLU (Rectified Linear Unit)
• ReLU(x) = max(0, x)
• Range: [0, ∞)
• Default choice for most networks
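Evaluating the three functions on a few sample inputs makes the ranges concrete:

```python
import torch

x = torch.tensor([-2.0, 0.0, 2.0])

print(torch.sigmoid(x))  # values in (0, 1); sigmoid(0) = 0.5
print(torch.tanh(x))     # values in (-1, 1); zero-centered, tanh(0) = 0
print(torch.relu(x))     # tensor([0., 0., 2.]) - negatives clipped to 0
```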
Slide 7: Why ReLU is the Default Choice
The ReLU Revolution
Advantages:
• Computationally simple: Just max(0, x)
• Avoids vanishing gradient: Gradient is either 0 or 1
• Sparsity: About 50% of neurons can be inactive
Disadvantage:
• Dying ReLU: If inputs are always negative, the neuron never activates
Solution variants:
• Leaky ReLU: max(0.01x, x) - small slope for negative values
• Parametric ReLU (PReLU): Learn the slope
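The dying-ReLU problem shows up directly in the gradients; a small autograd sketch comparing the two:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-3.0], requires_grad=True)

# ReLU: zero gradient for a negative input - no learning signal at all
torch.relu(x).backward()
print(x.grad)  # tensor([0.])

x.grad = None  # reset before the second backward pass
# Leaky ReLU: the small negative slope keeps a gradient flowing
F.leaky_relu(x, negative_slope=0.01).backward()
print(x.grad)  # tensor([0.0100])
```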
Slide 8: Architecture of a Deep Neural Network
Anatomy of a DNN
Input Layer → Hidden Layer 1 → Hidden Layer 2 → ... → Output Layer
(raw data)    (simple features) (more complex features)  (final prediction)
Key Components:
• Input Layer: Size = number of features
• Hidden Layers: Where the magic happens
• Output Layer: Size = number of classes (classification) or 1 (regression)
• Connections: Fully connected = each neuron connects to all neurons in next layer
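Because every neuron connects to every neuron in the next layer, a fully connected layer has in_features × out_features weights plus out_features biases. For the 784 → 128 layer used later in this lecture:

```python
import torch.nn as nn

layer = nn.Linear(784, 128)  # fully connected: every input feeds every output
n_params = sum(p.numel() for p in layer.parameters())
print(n_params)  # 784*128 weights + 128 biases = 100480
```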
Slide 9: Real-World Example: Image Classification
Hands-On Example: Fashion-MNIST
The Dataset:
• 70,000 grayscale images
• 10 categories (T-shirt, trousers, pullover, dress, etc.)
• 28×28 pixels = 784 features per image
Our Goal: Build a DNN that can classify clothing items!
Why this is perfect for learning:
• Simple enough to train quickly
• Complex enough to need a real neural network
Slide 10: Building a DNN in PyTorch: Step 1 - Imports
Setting Up Our Toolkit
python
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
# Check if a GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
Key Imports:
• torch.nn: Neural network modules
• torch.optim: Optimization algorithms
• torchvision: Computer vision datasets
Slide 11: Building a DNN in PyTorch: Step 2 - Define the Model
Creating Our Neural Network Class
class FashionDNN(nn.Module):
    def __init__(self):
        super(FashionDNN, self).__init__()
        self.network = nn.Sequential(
            # Input: 784 features (28x28 pixels)
            nn.Linear(784, 128),  # First hidden layer
            nn.ReLU(),            # Activation function
            nn.Linear(128, 64),   # Second hidden layer
            nn.ReLU(),            # Activation function
            nn.Linear(64, 10)     # Output: 10 classes
        )

    def forward(self, x):
        # Flatten each image from 28x28 to 784
        x = x.view(x.size(0), -1)
        return self.network(x)

# Create the model and move it to the GPU (if available)
model = FashionDNN().to(device)
print(model)
Slide 12: Understanding nn.Sequential
What is nn.Sequential?
nn.Sequential is a container that chains layers together:
Input → Linear(784,128) → ReLU() → Linear(128,64) → ReLU() → Linear(64,10) → Output
It's like a pipeline:
• Data flows through each layer in order
• Output of one layer becomes input to the next
• Makes code clean and readable
Alternative: You can also define each layer separately and connect them manually in the forward method.
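As a sketch of that manual style (the class name FashionDNNManual is made up for illustration), here is the same architecture with each layer defined individually and wired up by hand in forward():

```python
import torch
import torch.nn as nn

class FashionDNNManual(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        x = x.view(x.size(0), -1)     # flatten 28x28 -> 784
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)            # raw logits for 10 classes

out = FashionDNNManual()(torch.randn(2, 1, 28, 28))
print(out.shape)  # torch.Size([2, 10])
```

The manual style is more verbose but lets you add branches, skip connections, or debugging prints between layers, which nn.Sequential cannot express.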
Slide 13: Building a DNN: Step 3 - Prepare the Data
Getting Our Data Ready
python
# Transform: convert images to tensors and normalize
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))  # map pixel values from [0,1] to [-1,1]
])
# Download and load training data
trainset = torchvision.datasets.FashionMNIST(
    root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=64, shuffle=True)
# Download and load test data
testset = torchvision.datasets.FashionMNIST(
    root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(
    testset, batch_size=64, shuffle=False)
Why batch_size=64?
• Training with mini-batches is more efficient
• Provides more stable gradient estimates
• Common sizes: 32, 64, 128, 256
Why Batch Sizes Are Often Powers of 2 in Deep Learning
In machine learning, particularly deep learning, the batch size refers to the number of training
samples processed together in one iteration before updating the model's parameters.
• GPU Parallelism: Aligns with the GPU's warp size (32 threads), so work divides evenly across cores.
• Memory Alignment: Matches hardware memory alignment boundaries, reducing padding and waste.
• Matrix Optimization: cuDNN matrix kernels are typically fastest when dimensions are multiples of 8.
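Mini-batching can be seen on synthetic data without downloading anything (the random tensors below are just a stand-in for Fashion-MNIST):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# 1000 fake flattened "images" and random labels as a stand-in dataset
data = TensorDataset(torch.randn(1000, 784), torch.randint(0, 10, (1000,)))
loader = DataLoader(data, batch_size=64, shuffle=True)

print(len(loader))   # 16 batches: 15 full batches of 64 plus one partial batch of 40
images, labels = next(iter(loader))
print(images.shape)  # torch.Size([64, 784])
```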
Slide 14: Building a DNN: Step 4 - Loss Function & Optimizer
Choosing the Right Tools
# Loss function for classification
criterion = nn.CrossEntropyLoss()
# Optimizer - Adam is usually a good default
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Learning rate scheduler (optional but helpful)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
Why CrossEntropyLoss?
• The standard choice for multi-class classification
• Combines log-softmax and negative log likelihood in one step
• Expects raw logits, so the model needs no softmax output layer
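That equivalence is checkable in a couple of lines: cross_entropy on raw logits matches log-softmax followed by NLL (the logits and labels below are random, just for the check):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)          # raw model outputs for 4 samples, 10 classes
labels = torch.tensor([3, 0, 7, 1])  # true class indices

ce = F.cross_entropy(logits, labels)
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)
print(torch.allclose(ce, manual))  # True
```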
Why Adam?
a) Adaptive Learning Rates: Adam adjusts learning rates for each parameter using estimates
of first and second moments, leading to faster convergence.
b) Efficiency with Large Datasets: It performs well with large-scale data and parameters,
requiring less memory than other optimizers.
c) Robust to Noisy Gradients: Adam handles noisy or sparse gradients effectively, making it
suitable for complex, non-convex problems.
d) Combines Momentum and RMSProp: By integrating momentum and RMSProp, Adam
balances speed and stability in optimization.
Slide 15: The Complete Training Loop
Putting It All Together: Training
python
def train_model(model, trainloader, criterion, optimizer, epochs=10):
    model.train()  # Set model to training mode
    train_losses = []
    for epoch in range(epochs):
        running_loss = 0.0
        for images, labels in trainloader:
            # Move data to the GPU
            images, labels = images.to(device), labels.to(device)
            # Zero the gradients
            optimizer.zero_grad()
            # Forward pass
            outputs = model(images)
            loss = criterion(outputs, labels)
            # Backward pass and optimize
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        epoch_loss = running_loss / len(trainloader)
        train_losses.append(epoch_loss)
        print(f'Epoch [{epoch+1}/{epochs}], Loss: {epoch_loss:.4f}')
    return train_losses

# Train the model!
loss_history = train_model(model, trainloader, criterion, optimizer)
Slide 16: Understanding Model Training vs Evaluation Modes
model.train() vs model.eval()
Training Mode (model.train()):
• Enables dropout and per-batch statistics for batch normalization
• Used during training
Evaluation Mode (model.eval()):
• Disables dropout and uses the full network
• Uses running statistics for batch norm
• Used during testing/validation
Note: Neither mode turns gradient tracking on or off - wrap inference in torch.no_grad() to skip gradients and save memory.
python
# For testing:
model.eval()
with torch.no_grad():  # No gradients needed for testing
    test_outputs = model(test_images)
# Back to training:
model.train()
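Dropout makes the difference between the two modes directly visible; a quick sketch:

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1, 8)

drop.train()    # training mode: randomly zeroes about half the inputs,
print(drop(x))  # surviving entries are scaled by 1/(1-p) = 2

drop.eval()     # eval mode: dropout becomes the identity
print(torch.equal(drop(x), x))  # True
```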
Slide 17: Evaluating Our Model
How Good is Our Model?
python
def evaluate_model(model, testloader):
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():  # No gradients needed = faster!
        for images, labels in testloader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    accuracy = 100 * correct / total
    print(f'Test Accuracy: {accuracy:.2f}%')
    return accuracy

# Test our trained model
accuracy = evaluate_model(model, testloader)
What's happening here:
• torch.max(outputs, 1): Get the predicted class per image (highest score)
• Compare predictions with true labels
• Calculate percentage of correct predictions
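On a hand-made batch of logits (made up for illustration), the mechanics look like this:

```python
import torch

# Fake model outputs for a batch of 3 images over 4 classes
outputs = torch.tensor([[0.1, 2.0, 0.3, 0.4],
                        [1.5, 0.2, 0.1, 0.0],
                        [0.0, 0.1, 0.2, 3.0]])

values, predicted = torch.max(outputs, 1)  # max along the class dimension
print(predicted)  # tensor([1, 0, 3]) - the predicted class indices

labels = torch.tensor([1, 2, 3])
correct = (predicted == labels).sum().item()
print(correct)  # 2 correct out of 3
```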
Slide 18: Visualizing Training Progress
Learning Curves: Our Training Report Card
python
plt.plot(loss_history)
plt.title('Training Loss Over Time')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.grid(True)
plt.show()
What to look for:
• Good: Smooth, steady decrease in loss
• Bad: Loss oscillating wildly (learning rate too high)
• Bad: Loss not decreasing (learning rate too low or model too simple)
• Bad: Loss suddenly becomes NaN (exploding gradients)
Typical results: You should see loss drop from ~2.0 to ~0.3 in 10 epochs!
Slide 19: Making Predictions
Using Our Trained Model
# Get a batch of test images
dataiter = iter(testloader)
images, labels = next(dataiter)
images, labels = images.to(device), labels.to(device)
# Make predictions
model.eval()
with torch.no_grad():
    outputs = model(images)
    _, predictions = torch.max(outputs, 1)
# Display results
class_names = ['T-shirt', 'Trouser', 'Pullover', 'Dress', 'Coat',
'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
for i in range(5): # Show first 5 predictions
print(f'Predicted: {class_names[predictions[i]]}, '
f'Actual: {class_names[labels[i]]}')
Slide 20: Lab 2 Preview: Your Turn!
Lab 2: Implement and Train a DNN
Your Tasks:
1. Implement the DNN architecture we just built
2. Experiment with different architectures:
o Try different numbers of hidden layers
o Try different numbers of neurons per layer
o Try different activation functions
3. Tune hyperparameters:
o Learning rate (try 0.1, 0.01, 0.001)
o Batch size (try 32, 64, 128)
4. Achieve at least 85% test accuracy
Due Date: [Insert your due date here]
Slide 21: Key Takeaways
What We Learned Today
1. Why Depth Matters: Deep networks can learn complex, non-linear relationships
2. Activation Functions: ReLU is the default choice for hidden layers
3. DNN Architecture: Input → Hidden Layers → Output
4. PyTorch Workflow:
o Define model as an nn.Module subclass
o Use nn.Sequential for simple architectures
o Choose appropriate loss function and optimizer
o Implement training loop
o Evaluate on test data
You now have everything needed to build and train real neural networks!
Slide 22: What's Next?
Preview of Lecture 5
• Problem: Our DNN treats images as flat vectors - it ignores spatial structure!
• Solution: Convolutional Neural Networks (CNNs)
• Topics:
o Convolution operation
o Pooling layers
o Building CNNs in PyTorch
o Transfer learning
• Reading: GBC, Chapter 9
Slide 23: References & Questions
References & Resources
1. Primary: GBC, Chapter 6.1, 6.3
2. PyTorch Tutorials: "Deep Learning with PyTorch: A 60 Minute Blitz"
3. Dataset: Fashion-MNIST documentation
4. Visualization: TensorBoard for tracking experiments