UNIT-VII
Text Books:
[1] Goodfellow, I., Bengio, Y., and Courville, A., Deep Learning, MIT Press, 2016.
[4] Golub, G. H., and Van Loan, C. F., Matrix Computations, JHU Press, 2013.
[5] Kumar, S., Neural Networks: A Classroom Approach, Tata McGraw-Hill, 2004.
📌 1.3. Forward and Backward Propagation
Forward Propagation: computes the output from the input by passing activations through the layers.
Backward Propagation: updates the weights using gradient descent to minimize the loss
function.
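The two passes above can be sketched for a one-hidden-layer network with sigmoid activations and squared-error loss. The toy XOR data, layer sizes, and learning rate are illustrative assumptions, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy XOR-style data (an illustrative assumption)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

W1 = rng.normal(scale=0.5, size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(scale=0.5, size=(4, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X):
    h = sigmoid(X @ W1 + b1)         # hidden layer
    return h, sigmoid(h @ W2 + b2)   # output layer

_, out = forward(X)
loss_before = float(np.mean((out - y) ** 2))

lr = 1.0
for _ in range(5000):
    h, out = forward(X)                        # forward propagation
    d_out = (out - y) * out * (1 - out)        # chain rule at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)         # error propagated back to hidden layer
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)   # gradient-descent update
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(0)

_, out = forward(X)
loss_after = float(np.mean((out - y) ** 2))
```

Training repeatedly alternates the two passes: the forward pass computes predictions, the backward pass computes gradients of the loss, and gradient descent moves the weights downhill.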
7.4 Regularization
📌 2.1. Need for Regularization
Regularization techniques are used to reduce overfitting by introducing a penalty on
complex models, encouraging simpler models with better generalization.
📌 2.2. Types of Regularization
L1 Regularization (Lasso): Adds a penalty proportional to the absolute value of
weights. Promotes sparsity.
L2 Regularization (Ridge): Adds a penalty proportional to the square of the weights.
Helps in weight decay.
Dropout: Randomly drops neurons during training, forcing the network to avoid over-
reliance on specific paths.
Early Stopping: Monitors validation loss and stops training when performance
degrades.
Data Augmentation: Introduces variability in training data (e.g., image rotations, flips)
to improve robustness.
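The penalty-based techniques and dropout above can be sketched directly; the penalty strength `lam`, the weight shape, and the drop rate `p` below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3))
lam = 0.01  # regularization strength (illustrative)

# L2 (Ridge): penalty lam * sum(w^2); its gradient acts as weight decay.
l2_penalty = lam * np.sum(W ** 2)
l2_grad = 2 * lam * W

# L1 (Lasso): penalty lam * sum(|w|); its subgradient pushes weights toward zero,
# which is what promotes sparsity.
l1_penalty = lam * np.sum(np.abs(W))
l1_grad = lam * np.sign(W)

# Inverted dropout: zero each activation with probability p during training and
# rescale the survivors so the expected activation is unchanged at test time.
def dropout(a, p=0.5, train=True):
    if not train:
        return a
    mask = (rng.random(a.shape) >= p) / (1.0 - p)
    return a * mask
```

In practice the L1/L2 gradients are simply added to the data-loss gradient before each weight update, while dropout is applied to activations only during training.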
📌 6.3. Applications
Natural Language Processing (NLP): Text generation, language translation, speech
recognition.
Time-Series Analysis: Stock prices, weather forecasting.
7.7 Convolutional Neural Networks
📌 5.1. Purpose
CNNs are specifically designed for spatial data such as images and videos. They exploit
the spatial hierarchy of data through convolutions.
📌 5.2. Key Components
Convolutional Layers: Apply filters to extract features.
Pooling Layers: Reduce dimensionality (e.g., Max Pooling).
Fully Connected Layers: Make predictions using flattened feature maps.
Activation Functions: Commonly ReLU and Softmax.
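The two signature operations, convolution and pooling, can be sketched in plain numpy for a single-channel image; the loop-based implementation and the 2x2 pooling window are illustrative simplifications.

```python
import numpy as np

def conv2d(img, kernel):
    # 'Valid' cross-correlation (what deep-learning frameworks call convolution):
    # slide the filter over the image and take a weighted sum at each position.
    H, W = img.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    # Non-overlapping max pooling: keep the largest value in each size x size tile,
    # reducing spatial dimensionality.
    H, W = x.shape
    x = x[:H - H % size, :W - W % size]
    return x.reshape(H // size, size, W // size, size).max(axis=(1, 3))
```

A CNN stacks these operations: convolutions extract local features, pooling shrinks the feature maps, and the flattened result feeds the fully connected layers.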
📌 5.3. Use Cases
Image Classification: (e.g., ImageNet).
Object Detection: (e.g., YOLO, Faster R-CNN).
Image Segmentation: (e.g., U-Net).
7.8 Recurrent Neural Networks
📌 6.1. Definition
RNNs are designed for sequence data: the hidden state from the previous step is fed
into the next step, giving the network a memory of past information.
📌 6.2. Types of RNNs
Standard RNNs: Struggle with long-term dependencies due to vanishing gradient
issues.
Long Short-Term Memory (LSTM): Introduces gates (input, output, forget) to
maintain long-term memory.
Gated Recurrent Units (GRU): A simplified version of LSTM with fewer parameters.
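The recurrence shared by all three variants is clearest in the standard RNN cell, h_t = tanh(x_t W_xh + h_{t-1} W_hh + b); the sketch below uses illustrative sizes and random weights, and LSTM/GRU add their gating on top of this same loop.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 3, 5  # input and hidden sizes (illustrative)
W_xh = rng.normal(scale=0.1, size=(n_in, n_hid))
W_hh = rng.normal(scale=0.1, size=(n_hid, n_hid))
b = np.zeros(n_hid)

def rnn_forward(xs):
    h = np.zeros(n_hid)
    states = []
    for x in xs:  # the hidden state h carries information across time steps
        h = np.tanh(x @ W_xh + h @ W_hh + b)
        states.append(h)
    return np.stack(states)

seq = rng.normal(size=(7, n_in))  # a toy sequence of 7 time steps
H = rnn_forward(seq)
```

Because gradients flow through the repeated W_hh multiplication, they shrink (or explode) over long sequences; this is the vanishing-gradient issue that LSTM and GRU gates were introduced to mitigate.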
📌 1.2. Architecture
Consists of an input layer, multiple hidden layers, and an output layer.
Activation Functions:
- ReLU (Rectified Linear Unit): helps in faster convergence.
- Sigmoid/Tanh: useful for probabilistic outputs but suffer from vanishing
gradient problems.
- Softmax: for multi-class classification in the output layer.
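The activation functions listed above are one-liners in numpy (tanh is available directly as np.tanh); the stability shift in softmax is a standard implementation detail, not something the notes specify.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)        # zero for negatives; does not saturate for z > 0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # squashes to (0, 1); saturates for large |z|

def softmax(z):
    # Subtracting the row max avoids overflow without changing the result.
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)  # rows sum to 1: class probabilities
```

ReLU's non-saturating positive region is why it converges faster than sigmoid/tanh in deep networks, and softmax's normalized output is what makes it suitable for the multi-class output layer.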