Module 3 DL

The document provides an overview of autoencoders, a type of artificial neural network used for unsupervised learning and dimensionality reduction. It covers various types of autoencoders, including linear, undercomplete, and regularized versions, as well as their architectures, training methods, and applications such as image compression and denoising. Key concepts include the encoder-decoder structure, bottleneck representation, and the importance of hyperparameters in training autoencoders.


Deep Learning (AIC701)

Module 3-Autoencoder

Devanand K. Bathe
Asst. Professor
3.1:
• Introduction,
• Linear Autoencoder,
• Undercomplete Autoencoder,
• Overcomplete Autoencoders,
• Regularization in Autoencoders

3.2:
• Denoising Autoencoders,
• Sparse Autoencoders,
• Contractive Autoencoders.

3.3:
• Application of Autoencoders: Image Compression
Introduction:
• An autoencoder is a type of artificial neural network used to learn
efficient data codings in an unsupervised manner.
• The goal of an autoencoder is to-
• learn a representation for a set of data, usually for dimensionality reduction
by training the network to ignore signal noise.
• Along with the reduction side, a reconstructing side is also learned, where the
autoencoder tries to generate from the reduced encoding a representation as
close as possible to its original input.
• This helps autoencoders to learn important features present in the
data.
• When a representation allows a good reconstruction of its input then
it has retained much of the information present in the input.
• Recently, the autoencoder concept has become more widely used for
learning generative models of data.
• Autoencoders are a specific type of feedforward neural networks
where the input is the same as the output.
• They compress the input into a lower-dimensional code and then
reconstruct the output from this representation.
• The code is a compact “summary” or “compression” of the input, also
called the latent-space representation.
• An autoencoder consists of 3 components: encoder, code and
decoder. The encoder compresses the input and produces the code,
the decoder then reconstructs the input only using this code.
• Autoencoders are mainly a dimensionality reduction (or compression)
algorithm with a couple of important properties:
• Data-specific: Autoencoders are only able to meaningfully compress data
similar to what they have been trained on. Since they learn features specific
for the given training data, they are different than a standard data
compression algorithm like gzip. So we can’t expect an autoencoder trained
on handwritten digits to compress landscape photos.
• Lossy: The output of the autoencoder will not be exactly the same as the
input, it will be a close but degraded representation. If you want lossless
compression they are not the way to go.
• Unsupervised: To train an autoencoder we don’t need to do anything fancy,
just throw the raw input data at it. Autoencoders are considered an
unsupervised learning technique since they don’t need explicit labels to
train on. But to be more precise they are self-supervised because they
generate their own labels from the training data.
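The encoder → code → decoder pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration only: the layer sizes, the tanh activation, and the random weights are assumptions for the sketch, not values from the slides.

```python
import numpy as np

# Minimal sketch of the encoder -> code -> decoder pipeline.
# Sizes and the tanh activation are illustrative assumptions.
rng = np.random.default_rng(0)

input_dim, code_dim = 8, 3            # bottleneck smaller than the input
W_enc = rng.normal(scale=0.1, size=(code_dim, input_dim))
W_dec = rng.normal(scale=0.1, size=(input_dim, code_dim))

def encode(x):
    return np.tanh(W_enc @ x)         # h = f(x): compress to the code

def decode(h):
    return W_dec @ h                  # x_hat = g(h): reconstruct

x = rng.normal(size=input_dim)
h = encode(x)
x_hat = decode(h)

print(h.shape)      # (3,)  -- the latent-space "summary"
print(x_hat.shape)  # (8,)  -- same shape as the input
```

In a real model the weights would be trained so that x_hat approximates x; here they are random, so only the shapes are meaningful.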
Architecture of autoencoder:
Let’s start with a quick overview of autoencoders’ architecture.
• Autoencoders consist of 3 parts:
1. Encoder: A module that compresses the input data into an encoded
representation, typically orders of magnitude smaller than the input data.
2. Bottleneck: A module that contains the compressed knowledge representation and is therefore the
most important part of the network.
3. Decoder: A module that helps the network "decompress" the knowledge representation and
reconstructs the data back from its encoded form. The output is then compared with the ground truth.
• The architecture as a whole: input → encoder → bottleneck (code) → decoder → reconstruction.
Encoder
• The encoder is a set of convolutional blocks followed by pooling modules that compress
the input to the model into a compact section called the bottleneck.
• The bottleneck is followed by the decoder that consists of a series of upsampling modules
to bring the compressed feature back into the form of an image.
• In case of simple autoencoders, the output is expected to be the same as the input data
with reduced noise.
• However, for variational autoencoders it is a completely new image, formed with
information the model has been provided as input.

Bottleneck
• The most important part of the neural network, and ironically the smallest one, is the
bottleneck.
• The bottleneck exists to restrict the flow of information from the encoder to the decoder,
thus allowing only the most vital information to pass through.
• Since the bottleneck is designed in such a way that the maximum information possessed
by an image is captured in it, we can say that the bottleneck helps us form a
knowledge-representation of the input.
• Thus, the encoder-decoder structure helps us extract the most from an image in the form
of data and establish useful correlations between various inputs within the network.
• A bottleneck as a compressed representation of the input further prevents the neural
network from memorising the input and overfitting on the data.
• As a rule of thumb, remember this: The smaller the bottleneck, the lower the risk of
overfitting.
• However, very small bottlenecks would restrict the amount of information storable, which
increases the chances of important information slipping out through the pooling layers of
the encoder.

Decoder
• Finally, the decoder is a set of upsampling and convolutional blocks that reconstructs the
output from the bottleneck's compressed representation.

• Since the input to the decoder is a compressed knowledge representation, the decoder
serves as a “decompressor” and builds back the image from its latent attributes.
• Properties and Hyperparameters
• Data-specific: Autoencoders are only able to compress data similar to
what they have been trained on.
• Lossy: The decompressed outputs will be degraded compared to the
original inputs.
• Learned automatically from examples: It is easy to train specialized
instances of the algorithm that will perform well on a specific type of
input.
• Hyperparameters of Autoencoders:
• There are 4 hyperparameters that we need to set before training an
autoencoder:
• Code size: It represents the number of nodes in the middle layer.
Smaller size results in more compression.
• Number of layers: The autoencoder can consist of as many layers as
we want.
• Number of nodes per layer: The number of nodes per layer decreases
with each subsequent layer of the encoder, and increases back in the
decoder. The decoder is symmetric to the encoder in terms of the
layer structure.
• Loss function: We either use mean squared error or binary
cross-entropy. If the input values are in the range [0, 1] then we
typically use cross-entropy, otherwise, we use the mean squared
error.
• Similar to a standard feedforward neural network, with a key difference:
• Unsupervised: no label at the output layer; the output layer simply tries to recreate the
input

• Defined by two (possibly nonlinear) mapping functions: an encoding function f and a
decoding function g
• h = f(x) denotes an encoding (possibly nonlinear) for the input x
• x̂ = g(h) = g(f(x)) denotes the reconstruction (or the decoding) for the input x
• For an autoencoder, f and g are learned with the goal of minimizing the difference
between x̂ and x
• The learned code h = f(x) can be used as a new feature representation of the
input x
• Therefore autoencoders can also be used for feature learning
• Note: the number of hidden units (the size of the encoding) can also be larger than the input
• General structure of an autoencoder
• Maps an input x to an output r (called reconstruction) through
• an internal representation code h
• It has a hidden layer h that describes a code used to represent the input
• The network has two parts
• The encoder function
h=f(x)
• A decoder that produces a reconstruction
r=g(h)
Linear Autoencoder:
• A linear autoencoder and Principal Component Analysis (PCA) are similar in that both
methods aim to reduce the dimensionality of a dataset. However, there are some key
differences between the two.

• PCA is a linear dimensionality reduction technique that finds the directions (principal
components) of maximum variance in the data, and projects the data onto a
lower-dimensional subspace. It is unsupervised, which means it does not use any labeled
data.

• A linear autoencoder is an unsupervised neural network that aims to learn a
lower-dimensional representation of the data by training an encoder to compress the
input data and a decoder to reconstruct the original data from the compressed
representation. The encoder and decoder are both linear, which means they are
composed of fully-connected layers without non-linear activations.

• In summary, PCA is a linear technique that finds the directions of maximum variance in the
data, while a linear autoencoder is a neural network that learns a linear
encoding-decoding process to compress and reconstruct the data. Both methods can be
used for dimensionality reduction, but they have different assumptions and limitations.
• An autoencoder with linear transfer functions is equivalent to PCA
• Let’s prove the equivalence for the case of an autoencoder with just 1 hidden layer, the bottleneck
layer.
• First recall how PCA works:
• x is the original data, z the reduced data, and x̂ the reconstructed data from the reduced
representation.
• Then we can write PCA as:
z = B^T x
x̂ = B z
• Now consider an autoencoder with a single hidden (bottleneck) layer,
• which basically computes x → z → x̂.
• Since we said that the activation functions are linear transfer
functions, σ(x) = x, we can write the autoencoder as:
x̂ = W1 W2 x
• where W2 is the weight matrix of the encoder layer and W1 that of the decoder layer.
• Now if we set W1 = B and W2 = B^T
we have:
x̂ = W1 (W2 x) = W1 z = B z
• which is the same solution that we had for PCA.
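The equivalence can be checked numerically. In this sketch the principal directions B are taken from the SVD of the centered data (an illustrative way to obtain them, not from the slides), and the "autoencoder" weights are set to W1 = B, W2 = B^T as in the derivation above:

```python
import numpy as np

# Numeric check: with linear activations, W2 = B^T and W1 = B,
# the autoencoder x_hat = W1 W2 x equals the PCA reconstruction B B^T x.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
X = X - X.mean(axis=0)               # PCA assumes centered data

# Principal directions from the SVD of the centered data matrix.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
B = Vt[:k].T                         # (5, 2): columns are the top-k components

W2 = B.T                             # "encoder" weights: z = W2 x
W1 = B                               # "decoder" weights: x_hat = W1 z

x = X[0]
z_pca = B.T @ x                      # PCA projection
xhat_pca = B @ z_pca                 # PCA reconstruction
xhat_ae = W1 @ (W2 @ x)              # linear-autoencoder reconstruction

assert np.allclose(xhat_ae, xhat_pca)
```

Note the caveat: a linear autoencoder trained by gradient descent learns the same subspace as PCA, but its weights need not be the orthonormal principal directions themselves.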
Autoencoders are preferred over PCA because:
• An autoencoder can learn non-linear transformations with a
non-linear activation function and multiple layers.
• It doesn't have to use only dense layers: it can use convolutional layers,
which are better suited to image, video, and sequence data.
• It is more efficient to learn several layers with an autoencoder rather
than learn one huge transformation with PCA.
• An autoencoder provides a representation of each layer as the
output.
• It can make use of pre-trained layers from another model to apply
transfer learning to enhance the encoder/decoder.
(Figures: a linear autoencoder and a non-linear autoencoder.)
How to train autoencoder:
• You need to set 4 hyperparameters before training an autoencoder:
• Code size: The code size or the size of the bottleneck is the most important
hyperparameter used to tune the autoencoder. The bottleneck size decides how
much the data has to be compressed. This can also act as a regularisation term.
• Number of layers: Like all neural networks, an important hyperparameter to tune
autoencoders is the depth of the encoder and the decoder. While a higher depth
increases model complexity, a lower depth is faster to process.
• Number of nodes per layer: The number of nodes per layer defines the weights we
use per layer. Typically, the number of nodes decreases with each subsequent
layer in the autoencoder as the input to each of these layers becomes smaller
across the layers.
• Reconstruction Loss: The loss function we use to train the autoencoder is highly
dependent on the type of input and output we want the autoencoder to adapt to.
If we are working with image data, the most popular loss functions for
reconstruction are MSE Loss and L1 Loss. In case the inputs and outputs are
within the range [0,1], we can also make use of Binary Cross Entropy as the
reconstruction loss.
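The two reconstruction losses named above can be written out directly in NumPy. The example inputs are illustrative:

```python
import numpy as np

# Mean squared error: the default choice for real-valued inputs.
def mse_loss(x, x_hat):
    return np.mean((x - x_hat) ** 2)

# Binary cross-entropy: suitable when inputs lie in [0, 1].
def bce_loss(x, x_hat, eps=1e-12):
    x_hat = np.clip(x_hat, eps, 1 - eps)   # avoid log(0)
    return -np.mean(x * np.log(x_hat) + (1 - x) * np.log(1 - x_hat))

x = np.array([0.0, 1.0, 0.5])
x_hat = np.array([0.1, 0.9, 0.5])
print(round(mse_loss(x, x_hat), 4))  # 0.0067
```

Both compare the reconstruction x_hat against the original input x; which one to use depends on the range and type of the input data, as stated above.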
Training the autoencoder:
Undercomplete, Overcomplete and the need for
Regularization
(Figure: undercomplete autoencoder vs. overcomplete autoencoder.)
Undercomplete autoencoder:
An undercomplete autoencoder is one of the simplest types of autoencoders.
• The way it works is very straightforward:
• An undercomplete autoencoder takes in an image and tries to predict the
same image as output, thus reconstructing the image from the compressed
bottleneck region.
• Undercomplete autoencoders are truly unsupervised as they do not take
any form of label; the target is the same as the input.
• The primary use of such autoencoders is the generation of the latent
space or bottleneck, which forms a compressed substitute of the input
data and can easily be decompressed back with the help of the network
when needed.
• This form of compression in the data can be modeled as a form of
dimensionality reduction.
• When we think of dimensionality reduction, we tend to think of methods like PCA
(Principal Component Analysis) that form a lower-dimensional hyperplane to represent
high-dimensional data while losing as little information as possible.
• However, PCA can only build linear relationships. As a result, it is at a disadvantage
compared with methods like undercomplete autoencoders, which can learn non-linear
relationships and therefore perform better at dimensionality reduction.
• This form of nonlinear dimensionality reduction, where the autoencoder learns a
non-linear manifold, is also termed manifold learning.
• Effectively, if we remove all non-linear activations from an undercomplete autoencoder
and use only linear layers, we reduce it to something that works on an equal footing
with PCA.
• The loss function used to train an undercomplete autoencoder is called the reconstruction
loss, as it is a check of how well the image has been reconstructed from the input data.
Applications of Autoencoders
• Image Coloring
• Autoencoders are used for converting any black and white picture
into a colored image. Depending on what is in the picture, it is
possible to tell what the color should be.
• Feature variation: It extracts only the required features of an image and generates the output by
removing any noise or unnecessary interruption.

• Dimensionality Reduction: The reconstructed image is the same as our input but with reduced
dimensions. It provides a similar image with a reduced number of pixels.
• Denoising Image: The input seen by the autoencoder is not the raw input but a stochastically
corrupted version. A denoising autoencoder is thus trained to reconstruct the original input from
the noisy version.

• Watermark Removal: It is also used for removing watermarks from images or to remove any
object while filming a video or a movie.
Regularized autoencoder:
• Regularized autoencoders are useful for preventing the autoencoder from merely copying
the input features, so that it learns the important characteristics as well.
• They are useful when the autoencoder has the same input and output dimension,
and in the case of overcomplete autoencoders.
• There are other ways we can constraint the reconstruction of an
autoencoder than to impose a hidden layer of smaller dimension than the
input.
• Rather than limiting the model capacity by keeping the encoder and
decoder shallow and the code size small, regularized autoencoders use a
loss function that encourages the model to have other properties besides
the ability to copy its input to its output.
• In practice, we usually find two types of regularized autoencoder: the
sparse autoencoder and the denoising autoencoder.
Sparse Autoencoder:
• A sparse autoencoder is simply an autoencoder whose training criterion
involves a sparsity penalty.
• In most cases, we would construct our loss function by penalizing
activations of hidden layers so that only a few nodes are encouraged to
activate when a single sample is fed into the network.
• The intuition behind this method is that, for example, if a man claims to be
an expert in mathematics, computer science, psychology, and classical
music, he might be just learning some quite shallow knowledge in these
subjects.
• However, if he only claims to be devoted to mathematics, we would like to
anticipate some useful insights from him. And it’s the same for
autoencoders we’re training — fewer nodes activating while still keeping its
performance would guarantee that the autoencoder is actually learning
latent representations instead of redundant information in our input data.
• There are actually two different ways to construct our sparsity
penalty: L1 regularization and KL-divergence.
• Why L1 Regularization Encourages Sparsity
• L1 regularization and L2 regularization are widely used in machine
learning and deep learning. L1 regularization adds “absolute value of
magnitude” of coefficients as penalty term while L2 regularization
adds “squared magnitude” of coefficient as a penalty term.
• Although L1 and L2 can both be used as a regularization term, the key
difference between them is that L1 regularization tends to shrink
coefficients exactly to zero, while L2 regularization moves
coefficients towards zero but never makes them exactly zero.
• Thus L1 regularization is often used as a method of feature selection.
• Loss Function
• Finally, after the above analysis, we get the idea of using L1 regularization in
the sparse autoencoder, and the loss function becomes:

Loss = L(x, x̂) + λ Σ_i |a_i^(h)|

• In addition to the reconstruction term, the added penalty term sums the absolute
values of the vector of activations a in layer h for sample i.
• Then we use the hyperparameter λ to control its effect on the whole loss function.
And in this way, we build a sparse autoencoder.
• Due to the sparsity of L1 regularization, sparse autoencoder actually learns better
representations and its activations are more sparse which makes it perform
better than original autoencoder without L1 regularization.
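The L1-penalized objective can be sketched in NumPy as reconstruction error plus λ times the sum of absolute hidden activations. The sigmoid encoder, the overcomplete layer sizes, and the random weights here are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Sparse-autoencoder objective: reconstruction error + lambda * sum |a_j|.
def sparse_loss(x, W_enc, W_dec, lam=0.01):
    a = sigmoid(W_enc @ x)                # hidden activations
    x_hat = W_dec @ a                     # reconstruction
    recon = np.mean((x - x_hat) ** 2)
    sparsity = lam * np.sum(np.abs(a))    # L1 penalty pushes a_j toward 0
    return recon + sparsity

rng = np.random.default_rng(2)
x = rng.normal(size=4)
W_enc = rng.normal(scale=0.1, size=(6, 4))   # overcomplete: 6 hidden > 4 inputs
W_dec = rng.normal(scale=0.1, size=(4, 6))
loss = sparse_loss(x, W_enc, W_dec)
assert loss > 0
```

During training, gradient descent on this loss trades reconstruction quality against the number and magnitude of active hidden units, which is exactly the sparsity pressure described above.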
• Sparse autoencoders have hidden nodes greater than input nodes.
They can still discover important features from the data.
• A generic sparse autoencoder can be visualized with the opacity of each
node corresponding to its level of activation.
• A sparsity constraint is introduced on the hidden layer. This prevents
the output layer from simply copying the input data.
• Sparsity may be obtained by additional terms in the loss function
during the training process, either by comparing the probability
distribution of the hidden unit activations with some low desired
value, or by manually zeroing all but the strongest hidden unit
activations. Some of the most powerful AIs in the 2010s involved
sparse autoencoders stacked inside deep neural networks.
• Advantages-
• Sparse autoencoders apply a sparsity penalty that keeps activations close
to zero but not exactly zero. The sparsity penalty is applied to the hidden
layer in addition to the reconstruction error. This prevents overfitting.
• They keep the highest activation values in the hidden layer and zero
out the rest of the hidden nodes. This prevents the autoencoder from using
all of the hidden nodes at a time, forcing only a reduced number
of hidden nodes to be used.
• Drawbacks-
• For this to work, it is essential that the individual nodes of a
trained model which activate are data-dependent, and that different
inputs result in activations of different nodes through the
network.
Sparse Autoencoder:
Denoising Autoencoder:
• Autoencoders are Neural Networks which are commonly used for
feature selection and extraction.
• However, when there are more nodes in the hidden layer than there
are inputs, the network risks learning the so-called "identity
function", also called the "null function", meaning that the output equals
the input, rendering the autoencoder useless.
• Denoising Autoencoders solve this problem by corrupting the data on
purpose by randomly turning some of the input values to zero.
• In general, the percentage of input nodes which are being set to zero
is about 50%.
• Other sources suggest a lower count, such as 30%. It depends on the
amount of data and input nodes you have.
DAE architecture:
• When calculating the Loss function, it is important to compare the output values with
the original input, not with the corrupted input. That way, the risk of learning the
identity function instead of extracting features is eliminated.
• Denoising Autoencoders are an important and crucial tool for feature selection and
extraction
(Figure: original, corrupted, and reconstructed images.)

Denoising autoencoders create a corrupted copy of the input by introducing some noise.
This helps prevent the autoencoder from copying the input to the output without learning
features of the data.
These autoencoders take a partially corrupted input while training to recover the original
undistorted input. The model learns a vector field for mapping the input data towards a
lower dimensional manifold which describes the natural data to cancel out the added
noise.
• Advantages-
• It was introduced to achieve good representation. Such a representation is one
that can be obtained robustly from a corrupted input and that will be useful for
recovering the corresponding clean input.
• Corruption of the input can be done randomly by setting some of the inputs to
zero; the remaining nodes keep their original values.
• Minimizes the loss function between the output and the original, uncorrupted input.
• Setting up a single-thread denoising autoencoder is easy.

• Drawbacks-
• To train an autoencoder to denoise data, it is necessary to perform a preliminary
stochastic mapping to corrupt the data and use it as input.
• This model isn't able to develop a mapping which memorizes the training data
because our input and target output are no longer the same.
In denoising autoencoders, some noise is added to the input data and then the
model is trained to get the denoised version of the input data. The loss function
that is used in denoising autoencoders is:-
Loss = L(x, g(f(x’)))
where x’ is the input data with some noise and x is input data without noise.
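The masking corruption and the loss L(x, g(f(x'))) can be sketched as follows. The 50% drop rate follows the slides; the identity stand-in for the network g(f(·)) is purely illustrative, since only the corruption step and the choice of target are being demonstrated:

```python
import numpy as np

rng = np.random.default_rng(3)

# Masking noise: randomly zero a fraction of the inputs to get x'.
def corrupt(x, drop_prob=0.5):
    mask = rng.random(x.shape) >= drop_prob
    return x * mask                  # some entries forced to zero

x = np.ones(10)                      # clean input
x_noisy = corrupt(x)                 # corrupted input x'

# Loss = L(x, g(f(x'))): the target is the CLEAN x, not x'.
# Here an identity placeholder stands in for the trained network g(f(.)).
x_hat = x_noisy
loss = np.mean((x - x_hat) ** 2)

print(x_noisy.sum() <= x.sum())      # True: masking only removes values
```

The key point encoded above: the corrupted x' enters the network, but the loss always compares against the clean x, which is what forces the model to learn denoising rather than the identity function.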
Denoising Autoencoder:
Contractive autoencoder:
• Contractive autoencoder is an unsupervised deep learning technique
that helps a neural network encode unlabeled training data.
• A simple autoencoder is used to compress information of the given
data while keeping the reconstruction cost as low as possible.
The contractive autoencoder aims to learn representations that are
invariant to unimportant transformations of the data.
• It learns only the variations present in the given dataset, which
makes the encoding process less sensitive to small variations in
its training data.
• The goal of Contractive Autoencoder is to reduce the representation’s
sensitivity towards the training input data.
• In order to achieve this, we must add a regularizer or penalty term to
the cost function that the autoencoder is trying to minimize.
• So from the mathematical point of view, it achieves the contraction effect by
adding to the reconstruction cost a penalty term: the Frobenius norm of the
Jacobian matrix of the encoder activations with respect to the input.
• If this value is zero, it means that as we change input values, we don't observe
any change on the learned hidden representations.
• But if the value is very large, then the learned representation is unstable as the
input values change.
• Contractive autoencoders are usually deployed as just one of several other
autoencoder nodes, activating only when other encoding schemes fail to label a
data point.
• The objective of a contractive autoencoder is to have a robust learned representation which is less
sensitive to small variation in the data.
• Robustness of the representation for the data is done by applying a penalty term to the loss
function.
• Contractive autoencoder is another regularization technique just like sparse and denoising
autoencoders. However, this regularizer corresponds to the Frobenius norm of the Jacobian
matrix of the encoder activations with respect to the input.
• Frobenius norm of the Jacobian matrix for the hidden layer is calculated with respect to input and
it is basically the sum of square of all elements.

Advantages-
• Contractive autoencoder is a better choice than denoising autoencoder to learn useful feature
extraction.
• This model learns an encoding in which similar inputs have similar encodings. Hence, we're
forcing the model to learn how to contract a neighborhood of inputs into a smaller neighborhood
of outputs.
• Contractive autoencoders are the same as sparse autoencoders except for a
difference in the penalty term. The loss function of a contractive autoencoder
is written as:
Loss = L(x, g(f(x))) + d(h, x)
where h = f(x) and d(h, x) can be written as:

d(h, x) = λ * ||∂h/∂x||²_F

• This forces the model to learn a function that does not change much when
x changes slightly.
• There is a connection between the denoising autoencoder and the
contractive autoencoder:
• the denoising reconstruction error is equivalent to a contractive
penalty on the reconstruction function that maps x to r = g(f(x)).
• In other words, denoising autoencoders make the reconstruction
function resist small but finite sized perturbations of the input,
whereas contractive autoencoders make the feature extraction
function resist infinitesimal perturbations of the input.
• When using the Jacobian based contractive penalty to pretrain
features f(x) for use with a classifier, the best classification accuracy
usually results from applying the contractive penalty to f(x) rather
than to g(f(x)).
Contractive autoencoder:
• There are some important equations we need to know first before deriving
contractive autoencoder. Before going there, we'll touch base on the
Frobenius norm of the Jacobian matrix.

• The Frobenius norm, also called the Euclidean norm, is a matrix norm of an
m×n matrix A, defined as the square root of the sum of the absolute squares
of its elements.
• The Jacobian matrix is the matrix of all first-order partial derivatives of a
vector-valued function. When the matrix is square, both the
matrix and its determinant are referred to as the Jacobian.
• Combining these two definitions gives us the meaning of the Frobenius norm of
the Jacobian matrix.
• The loss function is

L = Σ_x L(x, x̂) + λ ||J_f(x)||²_F

where the penalty term λ ||J_f(x)||²_F is the squared Frobenius norm of the Jacobian
matrix of partial derivatives associated with the encoder function:

||J_f(x)||²_F = Σ_ij (∂h_j(x) / ∂x_i)²

• With a sigmoid nonlinearity φ, the j-th hidden unit is h_j = φ(Σ_i W_ji x_i + b_j):
the dot product of the input features with the corresponding weights. Using the chain
rule, the partial derivatives are:

∂h_j / ∂x_i = h_j (1 − h_j) W_ji

• Our main objective is to calculate the norm, so we can simplify the implementation so
that we don't need to construct the diagonal matrix:

||J_f(x)||²_F = Σ_j (h_j (1 − h_j))² Σ_i W_ji²
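The diagonal-free simplification for a sigmoid encoder can be verified numerically. The weights and input below are illustrative assumptions; the check compares the directly-built Jacobian against the vectorised shortcut:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(4)
W = rng.normal(scale=0.5, size=(3, 5))   # encoder weights (3 hidden, 5 inputs)
x = rng.normal(size=5)
h = sigmoid(W @ x)                       # hidden activations h_j

# Direct construction: dh_j/dx_i = h_j(1-h_j) * W_ji, then Frobenius norm.
J = (h * (1 - h))[:, None] * W
frob_direct = np.sum(J ** 2)

# Vectorised shortcut: sum_j (h_j(1-h_j))^2 * sum_i W_ji^2,
# no diagonal matrix or explicit Jacobian needed.
frob_fast = np.sum((h * (1 - h)) ** 2 * np.sum(W ** 2, axis=1))

assert np.isclose(frob_direct, frob_fast)
```

In training, frob_fast (scaled by λ) would simply be added to the reconstruction loss, giving the contractive penalty without ever materialising the full Jacobian.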
THANK YOU
