Short-Time Processing of Speech Signals

The document discusses short-time processing of speech signals, emphasizing the importance of windowing and time-frequency transforms like DFT and DCT. It outlines the step-by-step process for applying these transforms, their applications in speech processing, and compares DFT with DCT and MFCC. Additionally, it highlights the advantages and limitations of various cepstral coefficients used in speech analysis.

Uploaded by

itsragno

Short-time processing of speech signals
Introduction
• We need to split the signal into shorter segments and apply windowing functions.
• Extra steps must be followed when we want to modify the signal:
  • windowing
  • time-frequency transform such as the DFT (optional)
  • apply the desired processing
  • inverse time-frequency transform (when the DFT was used)
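The steps listed above can be sketched in a few lines of NumPy. This is only a minimal illustration: the helper names, the Hann window, the frame length of 400 samples, the hop of 200 samples, and the 16 kHz test tone are all assumptions made for the example and are not taken from the slides.

```python
import numpy as np

def split_into_frames(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    starts = hop * np.arange(n_frames)
    return np.stack([signal[s:s + frame_len] for s in starts])

def process_frames(signal, frame_len=400, hop=200, modify=lambda spec: spec):
    """Window each frame, apply the DFT, modify the spectrum, invert the DFT."""
    window = np.hanning(frame_len)                        # windowing function
    frames = split_into_frames(signal, frame_len, hop) * window
    spectra = np.fft.rfft(frames, axis=1)                 # time-frequency transform
    spectra = modify(spectra)                             # desired processing
    return np.fft.irfft(spectra, n=frame_len, axis=1)     # inverse transform

# Illustrative input: 1 s of a 220 Hz tone at an assumed 16 kHz sampling rate.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 220 * t)
y_frames = process_frames(x)   # still windowed frames, not yet a signal
```

The output is a stack of windowed frames rather than a reconstructed signal, which is exactly the remaining "reverse of windowing" step.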
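• Time-frequency transforms such as the discrete Fourier transform (DFT) and the discrete cosine transform (DCT) are orthonormal when suitably scaled, and their inverses have well-known fast algorithms.
• The main challenge is thus the “reverse of windowing”, whatever that might be.
• The direct approach of multiplying by the inverse of the windowing function has a problem: typical windowing functions taper to zero (or nearly zero) at the frame edges, so their inverse is unbounded there and would greatly amplify any change made to the signal.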
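One widely used way around the inverse-window problem is overlap-add with a window applied both before and after processing. The slides may use a different scheme; the sketch below assumes a square-root Hann window, 50% overlap, and a frame length of 512, none of which come from the source. With these choices the squared windows of neighbouring frames sum to one, so an unmodified signal is recovered without ever dividing by the window.

```python
import numpy as np

def sqrt_hann(frame_len):
    """Square root of a periodic Hann window; its square sums to 1 at 50% overlap."""
    n = np.arange(frame_len)
    return np.sqrt(0.5 - 0.5 * np.cos(2 * np.pi * n / frame_len))

def analyse_and_resynthesise(signal, frame_len=512):
    """Window, DFT, inverse DFT, window again, then overlap-add the frames."""
    hop = frame_len // 2
    window = sqrt_hann(frame_len)
    output = np.zeros(len(signal))
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window       # analysis window
        spectrum = np.fft.rfft(frame)                          # processing would go here
        frame = np.fft.irfft(spectrum, n=frame_len) * window   # synthesis window
        output[start:start + frame_len] += frame               # overlap-add
    return output

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)          # illustrative test signal
y = analyse_and_resynthesise(x)

# The first and last half frame are covered by only one window, so compare the interior.
interior = slice(256, 4096 - 256)
max_error = np.max(np.abs(x[interior] - y[interior]))
```

Applying the window twice (once for analysis, once for synthesis) also fades out any discontinuities that processing introduces at the frame edges.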
Discrete Fourier transform
DFT including sampling interval
Applying DFT for speech signals
Challenges in speech signals
The Step-by-Step Process in Detail
Key Applications of DFT/STFT in Speech Processing
Limitations and Considerations
Discrete Cosine Transform (DCT)
Mel-Frequency Cepstral Coefficients (MFCC)
Key Application: Mel-Frequency Cepstral Coefficients (MFCCs)
Mathematical Formulation
Differences Between DFT and DCT

| Aspect | Discrete Fourier Transform (DFT) | Discrete Cosine Transform (DCT) |
|---|---|---|
| Output Type | Complex-valued coefficients | Real-valued coefficients |
| Basis Functions | Complex exponentials: e^(-j2πkn/N) | Real cosines: cos(πk(2n+1)/(2N)) |
| Phase Information | Preserves both magnitude and phase | Discards phase information |
| Energy Compaction | Good for periodic signals | Excellent for correlated signals |
| Boundary Conditions | Implicitly periodic extension | Implicitly even-symmetric extension |
| Computational Load | Higher (complex arithmetic) | Lower (real arithmetic only) |
| Primary Use in … | Spectral analysis, … | Feature extraction (MFCCs), … |
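The first two rows of the comparison can be checked numerically. The sketch below builds both transforms directly from the basis functions quoted in the table; the function names and the four input values are illustrative assumptions, and the DCT is left unscaled because the slides' normalisation is not shown.

```python
import numpy as np

def dft(x):
    """Direct DFT using the complex exponentials e^(-j*2*pi*k*n/N)."""
    N = len(x)
    n = np.arange(N)
    k = n.reshape(-1, 1)
    return np.exp(-2j * np.pi * k * n / N) @ x

def dct2(x):
    """Unscaled DCT-II using the real cosines cos(pi*k*(2n+1)/(2N))."""
    N = len(x)
    n = np.arange(N)
    k = n.reshape(-1, 1)
    return np.cos(np.pi * k * (2 * n + 1) / (2 * N)) @ x

x = np.array([1.0, 2.0, 3.0, 4.0])   # illustrative values only
X_dft = dft(x)    # complex-valued: carries magnitude and phase
X_dct = dct2(x)   # real-valued
```

Both are written as plain matrix products for readability; in practice a fast algorithm would be used instead.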
When to Use DCT
MFCC Pipeline Showing Both Transforms
This Combination Works So Well
They Answer Different Questions
Step 2: Frequency Domain Transformation
Role of DCT in MFCC
Visual Example of DCT Compression
DCT Over DFT
The Complete MFCC Feature Vector
Practical Example
The DCT-MFCC Relationship
MFCC Example Problem
Step-by-Step Solution
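Products for k = 1 with N = 6, where angle = πk(2n+1)/(2N) = π(2n+1)/12:

| n | (2n+1) | Angle (rad) | cos(angle) | X[n] | Product |
|---|---|---|---|---|---|
| 0 | 1 | π×1/12 = 0.2618 | cos(0.2618) = 0.9659 | 2.1 | 2.028 |
| 1 | 3 | π×3/12 = 0.7854 | cos(0.7854) = 0.7071 | 3.4 | 2.404 |
| 2 | 5 | π×5/12 = 1.3090 | cos(1.3090) = 0.2588 | 4.2 | 1.087 |
| 3 | 7 | π×7/12 = 1.8326 | cos(1.8326) = -0.2588 | 3.8 | -0.983 |
| 4 | 9 | π×9/12 = 2.3562 | cos(2.3562) = -0.7071 | 2.9 | -2.051 |
| 5 | 11 | π×11/12 = 2.8798 | cos(2.8798) = -0.9659 | 1.5 | -1.449 |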
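Adding up the Product column gives the cosine sum for this coefficient. The worked line below assumes k = 1 and N = 6, as implied by the angle π(2n+1)/12, and leaves out any normalisation factor the slides may apply afterwards; the result is computed from the rounded products in the table.

```latex
\sum_{n=0}^{5} X[n]\,\cos\!\left(\frac{\pi(2n+1)}{12}\right)
  \approx 2.028 + 2.404 + 1.087 - 0.983 - 2.051 - 1.449 = 1.036
```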
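Products for k = 2 with N = 6, where angle = πk(2n+1)/(2N) = 2π(2n+1)/12:

| n | (2n+1) | Angle (rad) | cos(angle) | X[n] | Product |
|---|---|---|---|---|---|
| 0 | 1 | 2π×1/12 = 0.5236 | cos(0.5236) = 0.8660 | 2.1 | 1.819 |
| 1 | 3 | 2π×3/12 = 1.5708 | cos(1.5708) = 0.0000 | 3.4 | 0.000 |
| 2 | 5 | 2π×5/12 = 2.6180 | cos(2.6180) = -0.8660 | 4.2 | -3.637 |
| 3 | 7 | 2π×7/12 = 3.6652 | cos(3.6652) = -0.8660 | 3.8 | -3.291 |
| 4 | 9 | 2π×9/12 = 4.7124 | cos(4.7124) = 0.0000 | 2.9 | 0.000 |
| 5 | 11 | 2π×11/12 = 5.7596 | cos(5.7596) = 0.8660 | 1.5 | 1.299 |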
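The two product tables above can be reproduced with a short script. It assumes that the X[n] column holds the six input values, that the angle columns correspond to k = 1 and k = 2 in cos(πk(2n+1)/(2N)), and that no scaling factor is applied to the sum; the function name `dct_sum` is hypothetical.

```python
import numpy as np

# X[n] column of the worked example (N = 6 values).
X = np.array([2.1, 3.4, 4.2, 3.8, 2.9, 1.5])
N = len(X)
n = np.arange(N)

def dct_sum(k):
    """Unscaled DCT-II sum: sum over n of X[n] * cos(pi * k * (2n + 1) / (2N))."""
    return float(np.sum(X * np.cos(np.pi * k * (2 * n + 1) / (2 * N))))

c1 = dct_sum(1)   # total of the Product column in the first table
c2 = dct_sum(2)   # total of the Product column in the second table
print(f"k=1: {c1:.3f}, k=2: {c2:.3f}")
```

Small differences in the last digit relative to a hand total are expected, because the tables round each product to three decimals before summing.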

Linear prediction cepstral coefficients (LPCC)
Detailed Example Problem
LPCC vs MFCC: Key Differences
| Aspect | LPCC | MFCC |
|---|---|---|
| Basis | Linear Prediction model | Filter bank analysis |
| Domain | Time-domain modeling | Frequency-domain analysis |
| Model Type | All-pole model | No explicit model |
| Computational Load | Lower (Levinson-Durbin) | Higher (FFT + Filter banks) |
| Formant Modeling | Excellent | Good |
| Noise Robustness | Less robust | More robust |
| Pitch Sensitivity | More sensitive | Less sensitive |
| Common Applications | Speech coding, synthesis | Speech recognition |
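To make the LPCC column more concrete, the sketch below runs the Levinson-Durbin recursion named in the table and then converts the predictor coefficients to cepstral coefficients with the standard recursion for an all-pole model. It assumes the convention x[n] ≈ Σ a_k x[n-k], that is H(z) = G / (1 - Σ a_k z^(-k)); the slides may use the opposite sign. The function names and the toy autocorrelation sequence are illustrative only.

```python
import numpy as np

def levinson_durbin(r, order):
    """Solve for predictor coefficients a_1..a_p from autocorrelation r[0..p]."""
    a = np.zeros(order + 1)            # a[0] is unused
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - np.dot(a[1:i], r[i - 1:0:-1])
        k = acc / error                # reflection coefficient
        new_a = a.copy()
        new_a[i] = k
        new_a[1:i] = a[1:i] - k * a[i - 1:0:-1]
        a = new_a
        error *= 1.0 - k * k           # prediction error shrinks at each order
    return a[1:], error

def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n of the all-pole model 1 / (1 - sum a_k z^-k)."""
    p = len(a)
    c = np.zeros(n_ceps + 1)           # c[0] (gain term) is left at zero
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]

# Toy autocorrelation of a first-order process with pole at 0.9 (illustrative).
r = 0.9 ** np.arange(3)
a, err = levinson_durbin(r, order=2)
lpcc = lpc_to_cepstrum(a, n_ceps=3)
```

For this toy input the predictor reduces to a single coefficient of 0.9, and the cepstrum follows the series 0.9^n / n, which is a convenient sanity check on the recursion.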
Advantages of LPCC
Limitations of LPCC
Choosing LPC Order (p)
Applications
Gammatone Frequency Cepstral Coefficients (GFCC)
The Biological Inspiration: The Cochlea
GFCC Computation Pipeline
Complete GFCC Algorithm
GFCC Performs Better in Noise
Applications
Computational Complexity
GFCC Advantages
GFCC Computation Pipeline
Filter Bank Differences
| Aspect | MFCC (Mel-filter Bank) | GFCC (Gammatone Filter Bank) |
|---|---|---|
| Filter Shape | Triangular | Gammatone (asymmetric, rounded) |
| Biological Basis | Rough approximation | Detailed cochlear model |
| Temporal Resolution | Poor | Excellent (models impulse response) |
| Frequency Resolution | Uniform on Mel-scale | ERB-scale (more accurate) |
| Phase Information | Ignored | Partially preserved |
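The Frequency Resolution row refers to two different frequency warpings. The sketch below evaluates commonly used formulas for each: the 2595·log10(1 + f/700) mel mapping and the Glasberg and Moore expressions for ERB bandwidth and ERB-rate. Whether the slides use exactly these variants is an assumption, and the sample frequencies are illustrative.

```python
import numpy as np

def hz_to_mel(f_hz):
    """Mel scale (common 2595 * log10(1 + f/700) variant)."""
    return 2595.0 * np.log10(1.0 + f_hz / 700.0)

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def hz_to_erb_rate(f_hz):
    """ERB-rate scale: number of ERBs below the given frequency."""
    return 21.4 * np.log10(1.0 + 0.00437 * f_hz)

freqs = np.array([100.0, 500.0, 1000.0, 4000.0])   # illustrative frequencies in Hz
mels = hz_to_mel(freqs)
erb_widths = erb_bandwidth(freqs)
erb_rates = hz_to_erb_rate(freqs)

for f, m, w, e in zip(freqs, mels, erb_widths, erb_rates):
    print(f"{f:6.0f} Hz  mel={m:7.1f}  ERB width={w:6.1f} Hz  ERB-rate={e:5.2f}")
```

Filter centre frequencies are typically spaced uniformly on the warped axis, so either mapping places filters densely at low frequencies and sparsely at high frequencies.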
