EC 9170
Deep Learning for Electrical &
Computer Engineers
Sequence Models
Faculty of Engineering, University of Jaffna
Sequence Learning Problems
• In feedforward and CNNs, the size of the input is always fixed.
• For example, we fed fixed-size (32 × 32) images to convolutional neural
networks for image classification.
• Further, each input to the network is independent of the previous or future inputs.
• For example, the computations, outputs and decisions for two successive
images are completely independent of each other.
• In many applications, the input is not of a fixed size.
• Further, successive inputs may not be independent of each other.
Sequence Learning Problems
• For example, consider the task of auto-completion.
• Given the first character ‘d’, you want to predict the
next character ‘e’, and so on.
• First, successive inputs are no longer independent
(while predicting ‘e’ , you would want to know what
the previous input was in addition to the current
input)
• Second, the length of the inputs and the number of predictions are not fixed
(for example, “learn”, “deep”, “machine” have different numbers of characters)
• Third, each network (orange-blue-green structure) is performing the same
task (input: character, output: character)
• These are known as sequence learning problems
Sequence Learning Problems
• Consider the task of predicting the part-of-speech tag
(noun, adverb, adjective, verb) of each word in a
sentence.
• Once we see an adjective (social) we are almost sure
that the next word should be a noun (man)
• Thus the current output (noun) depends on the
current input as well as the previous input
• Further, the size of the input is not fixed (sentences
could have an arbitrary number of words)
• Notice that here, we are interested in producing an output at each time step
• Each network is performing the same task (input: word, output: tag)
• Sometimes, we may not be interested in producing an output at every stage
• Instead, we would look at the full sequence and then produce an output
Sequence Learning Problems
• For example, consider the task of predicting the polarity
of a movie review
• The prediction clearly does not depend only on the last
word but also on some words that appear before it
• Here again we could think that the network is
performing the same task at each step (input : word,
output : +/−) but it’s just that we don’t care about
intermediate outputs
Sequence Learning Problems
• Sequences could be composed of anything (not just words)
• For example, a video could be treated as a sequence of images
• We may want to look at the entire sequence and detect the activity being
performed
How do we model such tasks involving sequences?
Main issues in using ANNs for sequence problems
• Variable size of input/output (the number of input/output neurons would need to change)
• Too much computation (the text must be converted to a vector to feed to the input neurons)
• No parameter sharing
• Dependencies between inputs are ignored
Recurrent Neural Networks
What is the function being executed at each time step?
• Since we want the same function to be executed at each timestep, we should share
the same network (i.e., same parameters at each timestep)
• This parameter sharing also ensures that the network becomes agnostic to the
length (size) of the input.
• Since we are simply going to compute the same function at each time step, the
number of timesteps doesn’t matter
• We just create multiple copies of the network and execute them at each timestep
Recurrent Neural Networks
How do we account for dependence between inputs?
• Suppose we compute a different function at each time step, so that each output can depend on all the inputs seen so far.
• The network then becomes sensitive to the length of the sequence.
• For example, a sequence of length 10 will require f1, ..., f10, whereas a sequence of length 100 will require f1, ..., f100.
• Is this method okay? No, it violates the other two items on our wish list: the same function at every timestep, and independence from the sequence length.
Recurrent Neural Networks
• The solution is to add a recurrent connection in the network.
• si is the state of the network at timestep i: si = σ(U xi + W si−1 + b), with output yi = O(V si + c)
• The parameters are W, U, V, c, b, which are shared across timesteps
• The same network (and parameters) can be used to compute y1, y2, ..., y10 or y1, ..., y100
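As a minimal sketch of this shared-parameter recurrence (the sizes, the tanh state activation, and the linear output are illustrative assumptions, not the lecture's exact choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4-dim inputs, 8-dim state, 3-dim outputs
n_in, n_state, n_out = 4, 8, 3

# Shared parameters W, U, V, b, c -- the SAME at every timestep
U = rng.normal(scale=0.1, size=(n_state, n_in))
W = rng.normal(scale=0.1, size=(n_state, n_state))
b = np.zeros(n_state)
V = rng.normal(scale=0.1, size=(n_out, n_state))
c = np.zeros(n_out)

def rnn_forward(xs):
    """Run the recurrence s_i = tanh(U x_i + W s_{i-1} + b), y_i = V s_i + c."""
    s = np.zeros(n_state)            # s_0: initial state
    ys = []
    for x in xs:
        s = np.tanh(U @ x + W @ s + b)
        ys.append(V @ s + c)
    return ys

# Because the parameters are shared, the same function handles any length:
short = rnn_forward(rng.normal(size=(10, n_in)))    # 10 timesteps
long = rnn_forward(rng.normal(size=(100, n_in)))    # 100 timesteps
print(len(short), len(long))  # 10 100
```

Note that nothing in `rnn_forward` depends on the sequence length: the loop just applies the same parameters once per timestep.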
Recurrent Neural Networks
Generic Representation of RNN
Let us revisit the sequence learning problems that we saw earlier
We now have recurrent
connections between time steps
which account for dependence
between inputs.
How does RNN reduce complexity?
• Given a function f: h′, y = f(h, x), where h and h′ are vectors with the same dimension (the state before and after the step)

[diagram: h0 → f → h1 → f → h2 → f → h3 → ..., with input xi entering and output yi leaving each copy of f]

• No matter how long the input/output sequence is, we only need one function f.
• If the f's were different, the model would become a feedforward NN, so the RNN may be treated as a further compression of a fully connected network.
Different types of RNN
• Many to one, e.g., sentiment analysis, movie-review classification
• One to many, e.g., music generation, poetry writing
• Many to many (aligned), e.g., part-of-speech tagging
• Many to many (encoder-decoder), e.g., language translation
Deep RNN
• Recurrent layers can be stacked: the outputs y1, y2, y3, ... of the first layer become the inputs of a second recurrent layer
• First layer: h′, y = f1(h, x); second layer: g′, z = f2(g, y)

[diagram: first-layer states h0, h1, h2, h3 and second-layer states g0, g1, g2, g3; inputs x1, x2, x3; intermediate outputs y1, y2, y3; final outputs z1, z2, z3]
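One way to sketch this two-layer stack in NumPy (all sizes and the tanh activation are illustrative assumptions; here each layer's output is simply its new state):

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_h, n_g = 4, 8, 8

def make_cell(n_x, n_s):
    """One recurrent layer f: (s, x) -> (s', y), with y = s' for simplicity."""
    U = rng.normal(scale=0.1, size=(n_s, n_x))
    W = rng.normal(scale=0.1, size=(n_s, n_s))
    b = np.zeros(n_s)
    def f(s, x):
        s_new = np.tanh(U @ x + W @ s + b)
        return s_new, s_new   # (next state, output)
    return f

f1 = make_cell(n_in, n_h)   # h', y = f1(h, x)
f2 = make_cell(n_h, n_g)    # g', z = f2(g, y)

h, g = np.zeros(n_h), np.zeros(n_g)
zs = []
for x in rng.normal(size=(5, n_in)):
    h, y = f1(h, x)   # layer 1 consumes the input x
    g, z = f2(g, y)   # layer 2 consumes layer 1's output y
    zs.append(z)
print(len(zs))  # 5
```

Each layer keeps its own state (h and g), but within a layer the parameters are still shared across all timesteps.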
Bidirectional RNN
• One RNN processes the sequence forward: y, h = f1(x, h); a second processes it backward: z, g = f2(g, x)
• The two outputs at each timestep are combined to produce the prediction: p = f3(y, z)

[diagram: forward states h0, h1, h2, h3 and backward states g0, g1, g2, g3 over inputs x1, x2, x3; outputs y1, y2, y3 and z1, z2, z3 combined by f3 into p1, p2, p3]
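A minimal NumPy sketch of the bidirectional scheme (sizes and activations are assumptions; f3 is taken to be concatenation, one common choice):

```python
import numpy as np

rng = np.random.default_rng(2)
n_in, n_s = 4, 6

def make_cell(n_x, n_s):
    """One recurrent cell f: (s, x) -> (s', y), with y = s'."""
    U = rng.normal(scale=0.1, size=(n_s, n_x))
    W = rng.normal(scale=0.1, size=(n_s, n_s))
    def f(s, x):
        s_new = np.tanh(U @ x + W @ s)
        return s_new, s_new
    return f

f1 = make_cell(n_in, n_s)   # forward RNN
f2 = make_cell(n_in, n_s)   # backward RNN

xs = rng.normal(size=(5, n_in))

# Forward pass over x1 .. xT
h = np.zeros(n_s); ys = []
for x in xs:
    h, y = f1(h, x)
    ys.append(y)

# Backward pass over xT .. x1, then re-reverse to align with time
g = np.zeros(n_s); zs = []
for x in xs[::-1]:
    g, z = f2(g, x)
    zs.append(z)
zs = zs[::-1]

# p_i = f3(y_i, z_i): here f3 simply concatenates the two directions
ps = [np.concatenate([y, z]) for y, z in zip(ys, zs)]
print(ps[0].shape)  # (12,)
```

At every timestep the prediction p_i can therefore see the whole sequence: the forward state summarizes x1..xi, the backward state summarizes xi..xT.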
Backpropagation through time
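To make the idea concrete, here is a toy scalar BPTT computation (all weights, inputs, and targets are made-up values, and the model is deliberately one-dimensional): the backward loop walks through time accumulating dL/dw, and the result is checked against a finite-difference estimate.

```python
import math

# Toy scalar RNN: s_t = tanh(w*s_{t-1} + u*x_t), y_t = v*s_t,
# loss L = sum_t (y_t - target_t)^2.  All values are hypothetical.
w, u, v = 0.5, 0.8, 1.2
xs, targets = [1.0, -0.5, 0.3], [0.5, 0.0, 0.2]

def forward(w):
    """Return all states s_0..s_T and the total loss."""
    states, loss = [0.0], 0.0
    for x, t in zip(xs, targets):
        s = math.tanh(w * states[-1] + u * x)
        states.append(s)
        loss += (v * s - t) ** 2
    return states, loss

def bptt_grad_w(w):
    """dL/dw by walking backwards through time."""
    states, _ = forward(w)
    grad, ds_next = 0.0, 0.0          # ds_next: gradient arriving from the future
    for t in range(len(xs), 0, -1):
        s = states[t]
        dy = 2 * (v * s - targets[t - 1]) * v   # direct dL_t/ds_t
        ds = (dy + ds_next) * (1 - s ** 2)      # through the tanh
        grad += ds * states[t - 1]              # d(pre-activation)/dw = s_{t-1}
        ds_next = ds * w                        # pass gradient back to s_{t-1}
    return grad

# Check against a central finite difference
eps = 1e-6
numeric = (forward(w + eps)[1] - forward(w - eps)[1]) / (2 * eps)
print(abs(bptt_grad_w(w) - numeric) < 1e-6)  # True
```

The key step is `ds_next = ds * w`: the gradient reaching timestep t is repeatedly multiplied by the recurrent weight, which is precisely what causes the vanishing/exploding behavior discussed next.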
Vanishing vs Exploding Gradient problem
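The problem can be previewed with a deliberately simplified scalar model (activations ignored; the numbers are illustrative): backpropagating through T timesteps multiplies the gradient by the recurrent weight roughly T times, so its magnitude behaves like |w|^T.

```python
def grad_norm_after(T, w):
    """Magnitude of a gradient after T timesteps, each multiplying
    by the same recurrent weight w (nonlinearities ignored)."""
    return abs(w) ** T

for w in (0.9, 1.1):
    print(w, grad_norm_after(50, w))
# |w| < 1  -> gradient vanishes  (0.9**50 is about 0.005)
# |w| > 1  -> gradient explodes  (1.1**50 is about 117)
```

In a real RNN, w is the recurrent weight matrix and the relevant quantity is the product of Jacobians, but the dichotomy is the same: long sequences drive gradients toward zero or toward infinity.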
Thank you!