
Representation Learning and Latent Variable Models


ISOMAP: Embedding by Local Structure
• Construct a neighborhood graph
(k-nearest-neighbor graph or 𝜺-neighborhood graph)
• Compute shortest-path distances
(Floyd-Warshall algorithm or Dijkstra)
• Embed with MDS or stress majorization
(see the sketch below)

Omit for time: already in our toolbox! Locally linear embedding (LLE)

Tenenbaum, de Silva, and Langford. "A Global Geometric Framework for Nonlinear Dimensionality Reduction." Science (2000).
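Not on the original slide: a minimal end-to-end sketch of the three steps above, assuming NumPy, SciPy, and scikit-learn (the dataset and neighborhood size are illustrative choices):

```python
from sklearn.datasets import make_swiss_roll
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path
from sklearn.manifold import MDS

# Sample points from the "Swiss roll" manifold (3-D ambient space).
X, _ = make_swiss_roll(n_samples=500, random_state=0)

# 1. Construct the k-nearest-neighbor graph, weighted by Euclidean distance.
G = kneighbors_graph(X, n_neighbors=10, mode="distance")

# 2. All-pairs shortest-path distances approximate geodesic distances.
#    method="D" selects Dijkstra; method="FW" would use Floyd-Warshall.
D = shortest_path(G, method="D", directed=False)

# 3. Metric MDS embeds the geodesic distance matrix into 2-D.
Z = MDS(n_components=2, dissimilarity="precomputed",
        random_state=0).fit_transform(D)
```

scikit-learn's `sklearn.manifold.Isomap` packages the same three steps into a single estimator.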
NLP: Popular Pretext Task
• Masked language modeling on large corpora learns representations that help downstream tasks (toy masking example below)
• The pretraining-specific head is stripped before reuse; the rest of the network can be fine-tuned on a new task
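A toy illustration of the masking step, in plain Python (the sentence and 15% mask rate are illustrative; real systems mask subword tokens and train a network to predict them):

```python
import random

random.seed(0)
tokens = "representation learning extracts useful features from raw data".split()
MASK, MASK_PROB = "[MASK]", 0.15

masked, targets = [], {}
for i, tok in enumerate(tokens):
    if random.random() < MASK_PROB:
        masked.append(MASK)      # hide the token from the model...
        targets[i] = tok         # ...but keep it as the training target
    else:
        masked.append(tok)

print(masked)   # the model's input
print(targets)  # supervision extracted from the data itself
```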
Today

[Figure: "Swiss roll" and "Two moons" toy datasets]

Can we learn useful features directly from the data itself?


...and use them for generative modeling?
Focus: Latent Variable Models
Big idea:
An unknown (simple, low-dimensional) latent variable controls the generation of an observable variable.
Typical setting: a probabilistic model explaining how a dataset was generated.

Simple example:
Gaussian mixture models (GMMs) (see the sketch below)

[Link]
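A minimal GMM sketch, assuming scikit-learn (the two-cluster data is synthetic): the latent variable is the component index for each point, and the fitted model can both infer it and generate new data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Observable data: a mixture of two 2-D Gaussian clusters.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=-2.0, scale=0.5, size=(200, 2)),
    rng.normal(loc=+2.0, scale=0.5, size=(200, 2)),
])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

z = gmm.predict(X)              # inferred latent assignments (component index)
X_new, z_new = gmm.sample(50)   # generative use: draw z, then x given z
```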
Modern Latent Variable Models

[Figure: overview of modern latent variable models, from Lilian Weng's blog; "Today" marks the models covered in this lecture]


Plan for Today
• Autoencoders
• (Slightly) new neural network architecture
• New loss function

• Variational autoencoders
• Autoencoders with some noise

• Alternatives
• Other latent variable models
Credit
Many images borrowed from
“Understanding Variational Autoencoders (VAEs)” (Rocca)
[Link]

Math simplified from
"An Introduction to Variational Autoencoders" (Kingma and Welling)
[Link]
Plan for Today
• Autoencoders
• (Slightly) new neural network architecture
• New loss function

• Variational autoencoders
• Autoencoders with some noise

• Alternatives
• Other latent variable models
Autoencoders
The "bottleneck" architecture (code sketch below)

On the board:
• PCA as a special case
• Effect of the latent dimension

[Link]
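A minimal autoencoder sketch, assuming PyTorch (the layer sizes are illustrative, not from the slides):

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),   # the low-dimensional "bottleneck"
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim),
        )

    def forward(self, x):
        z = self.encoder(x)        # compress x to a latent code z
        return self.decoder(z)     # reconstruct x from z

# New loss function: reconstruction error between input and output.
model = Autoencoder()
x = torch.randn(16, 784)          # e.g. a batch of flattened 28x28 images
loss = nn.functional.mse_loss(model(x), x)
loss.backward()
```

If every layer is linear (drop the ReLUs) and the loss is squared error, the optimal bottleneck spans the same subspace as the top principal components, which is the "PCA as a special case" point above.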
Memorization in Autoencoders

[Link]
Latent Structure in Autoencoders

[Link]
Plan for Today
• Autoencoders
• (Slightly) new neural network architecture
• New loss function

• Variational autoencoders
• Autoencoders with some noise

• Alternatives
• Other latent variable models
Autoencoders for Sampling?

[Link]
VAE: Big Idea

Wiggle around in the latent space before reconstructing!

[Link]
Balance Two Terms

Averaged over 𝒙 from dataset
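Written out, the two terms being balanced are, in the standard VAE objective (following Kingma and Welling), a reconstruction term and a KL regularizer:

$$\mathcal{L}(\theta,\phi;\boldsymbol{x}) = \underbrace{\mathbb{E}_{q_\phi(\boldsymbol{z}\mid\boldsymbol{x})}\!\left[\log p_\theta(\boldsymbol{x}\mid\boldsymbol{z})\right]}_{\text{reconstruction}} - \underbrace{D_{\mathrm{KL}}\!\left(q_\phi(\boldsymbol{z}\mid\boldsymbol{x}) \,\|\, p(\boldsymbol{z})\right)}_{\text{regularization}}$$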


Probability Review
On the Board: Probabilistic Story

Rough outline:
• The decoder as a probabilistic model
• Maximum likelihood estimation
• The evidence lower bound (ELBO)
• The "reparameterization trick" (see the sketch after this list)
• Back where we started
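A minimal sketch of the whole story, assuming PyTorch (the sizes and the MSE reconstruction term, i.e. a Gaussian decoder, are illustrative choices):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.enc = nn.Linear(input_dim, 256)
        self.mu = nn.Linear(256, latent_dim)       # encoder outputs mean...
        self.logvar = nn.Linear(256, latent_dim)   # ...and log-variance of q(z|x)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim),
        )

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
        # so gradients flow through mu and logvar despite the sampling step.
        eps = torch.randn_like(mu)
        z = mu + torch.exp(0.5 * logvar) * eps     # the "wiggle" in latent space
        x_hat = self.dec(z)
        # Negative ELBO = reconstruction error + KL(q(z|x) || N(0, I)).
        recon = F.mse_loss(x_hat, x, reduction="sum")
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return recon + kl                          # minimize this

# "Back where we started": an autoencoder, plus noise and a KL penalty.
model = VAE()
loss = model(torch.randn(16, 784))
loss.backward()
```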
Plan for Today
• Autoencoders
• (Slightly) new neural network architecture
• New loss function

• Variational autoencoders
• Autoencoders with some noise

• Alternatives
• Other latent variable models
Many Alternatives

Representation Learning and Latent Variable Models
