
FFmpeg

FFmpeg is an open-source multimedia framework that allows users to process, convert, stream, record, and edit audio and video files via command line. It supports a wide range of media operations including format conversion, audio extraction, video trimming, and compression, making it a versatile tool for developers. FFmpeg is widely utilized in various applications and platforms, including YouTube and Netflix, for tasks like transcoding, thumbnail generation, and adaptive bitrate streaming.

Uploaded by

Prince Allen

What is FFmpeg

FFmpeg is a powerful open-source multimedia framework used to process, convert, stream, record, and edit audio and video files from the command line.

Think of FFmpeg as a Swiss Army knife for media files 🎬🎧 — it can handle almost
anything related to video or audio.

1. What FFmpeg Can Do


FFmpeg can perform many media operations:

🎥 Convert Video Formats

Example: Convert MP4 → AVI

ffmpeg -i input.mp4 output.avi

🎵 Extract Audio from Video

ffmpeg -i video.mp4 audio.mp3

✂️ Trim Video

ffmpeg -i input.mp4 -ss 00:00:10 -to 00:00:20 output.mp4

Cuts video from 10s to 20s.


📉 Compress Video

ffmpeg -i input.mp4 -vcodec libx264 -crf 28 output.mp4

Reduces file size while keeping reasonable quality.

📸 Extract Frames from Video

ffmpeg -i video.mp4 frame_%04d.png

Extracts every frame as images.

🔄 Resize Video

ffmpeg -i input.mp4 -vf scale=1280:720 output.mp4

2. Major Components of FFmpeg


FFmpeg is actually made of several libraries and tools:

Component Purpose

ffmpeg Main command-line tool
ffprobe Shows media file information
libavcodec Encoding & decoding audio/video
libavformat Handles file formats
libavfilter Video/audio filtering

3. Where FFmpeg Is Used (Real World)


FFmpeg is widely used in many systems:

Streaming Platforms

YouTube
Netflix
Twitch

Applications

Video editing software


Screen recorders
Media players

Backend Systems

Example uses in backend services:

Generate video thumbnails


Convert user uploaded videos
Compress media before storing in S3
Create adaptive streaming (HLS/DASH)

Example pipeline:

User uploads video



Backend receives file

FFmpeg compresses + converts

Store in S3

Serve optimized video
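The steps above can be sketched as a small shell script. Everything here is an assumption for illustration — the file names, the S3 bucket, and the exact flags — and each command is printed rather than executed (a dry run), so the sketch runs even without ffmpeg or the AWS CLI installed; remove the echoes to run it for real.

```shell
#!/bin/sh
# Dry-run sketch of the backend pipeline above. File names, bucket name,
# and flags are illustrative. Commands are printed, not executed;
# remove the "echo" lines' quoting indirection to run them for real.
IN=user_upload.mov

COMPRESS="ffmpeg -i $IN -c:v libx264 -crf 28 -c:a aac compressed.mp4"
THUMB="ffmpeg -i $IN -ss 00:00:05 -vframes 1 thumbnail.jpg"
UPLOAD="aws s3 cp compressed.mp4 s3://my-media-bucket/videos/"

echo "$COMPRESS"
echo "$THUMB"
echo "$UPLOAD"
```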
4. Example Real Backend Use Case
Generate a thumbnail from a video:

ffmpeg -i video.mp4 -ss 00:00:05 -vframes 1 thumbnail.jpg

Meaning:

Go to 5 seconds
Capture 1 frame
Save as image

5. Why FFmpeg Is So Popular


Reasons developers love it:

✅ Supports almost all audio/video formats


✅ Extremely fast and optimized
✅ Works on Linux, Mac, Windows
✅ Open source and free
✅ Can be integrated with Java, Python, Node, Go

6. Simple Analogy
Imagine you have media files like:

video.mp4
audio.mp3
image.png

FFmpeg acts like a media factory 🏭:

Input Media → Process → Output Media

Example:
video.mp4

ffmpeg processing

compressed_video.mp4
audio.mp3
thumbnail.jpg

✅ In one line:

FFmpeg is a command-line tool and library used to manipulate, convert, compress,


stream, and analyze audio and video files.

If you want, I can also explain:

How YouTube uses FFmpeg internally
Top 20 most useful FFmpeg commands
How to integrate FFmpeg in Java/Spring Boot backend (very useful for system design interviews) 🚀

How YouTube uses FFmpeg internally

YouTube does not rely only on FFmpeg, but FFmpeg is a core tool used in the video
processing pipeline for tasks like transcoding, thumbnail extraction, and format
conversion. Large companies usually build custom video pipelines on top of FFmpeg
libraries.

Let’s understand how YouTube processes a video internally step-by-step.

1. Video Upload Pipeline (High Level)


When a creator uploads a video, YouTube runs a video processing pipeline.

User Uploads Video



Upload Server

Store Raw Video (Temporary Storage)

Video Processing Pipeline

Multiple Encodings Created

CDN Distribution

Video Available to Users

FFmpeg is heavily used in the Video Processing Pipeline.

2. Step-by-Step YouTube Video Processing

Step 1 — Upload Raw Video


User uploads a video like:

my_video.mov

The file may be:

MOV
MP4
MKV
AVI
4K / 8K
Different codecs

YouTube does not stream this raw video directly.

It must convert it.

Step 2 — Video Validation


The system checks:

file format
codec
resolution
corruption

This can be done using ffprobe.

Example:

ffprobe my_video.mov

It returns:

Codec: h264
Resolution: 3840x2160
FPS: 30
Audio: AAC
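A more targeted ffprobe call can pull out exactly those fields. The flags below (-v, -select_streams, -show_entries, -of) are standard ffprobe options; the command is printed as a dry run so the sketch works even without ffprobe installed — remove the echo to execute it.

```shell
#!/bin/sh
# Dry run: prints a targeted ffprobe call that reports only the video
# stream's codec, resolution, and frame rate. Remove the "echo" to
# actually run it (requires ffprobe).
PROBE="ffprobe -v error -select_streams v:0 \
  -show_entries stream=codec_name,width,height,r_frame_rate \
  -of default=noprint_wrappers=1 my_video.mov"
echo "$PROBE"
```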

Step 3 — Transcoding (Most Important Step)


This is where FFmpeg is heavily used.

The uploaded video is converted into multiple formats and resolutions.

Example outputs:

Resolution Codec

144p H.264
240p H.264
360p H.264
480p H.264
720p H.264
1080p H.264
1440p VP9
2160p (4K) VP9 / AV1

Example FFmpeg command concept:

ffmpeg -i my_video.mov -vf scale=1280:720 -c:v libx264 output_720p.mp4

This creates the 720p version.

YouTube runs this in parallel on thousands of machines.

            Original Video
                  │
    ┌─────────────┼─────────────┐
    ▼             ▼             ▼
  360p          720p          1080p

4. Adaptive Bitrate Streaming (ABR)


YouTube uses adaptive streaming.

This allows your player to switch quality automatically.

Example:

Slow internet → 360p


Good internet → 1080p
Very fast → 4K

To do this, videos are split into small segments (2–10 seconds).

FFmpeg can generate HLS or DASH segments.

Example:
ffmpeg -i video.mp4 -f dash manifest.mpd

Output:

segment1.m4s
segment2.m4s
segment3.m4s
manifest.mpd

The player downloads segments dynamically.
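The HLS counterpart of the DASH command above looks like this. The segment length and file names are illustrative; the command is built and printed as a dry run so it runs anywhere — remove the echo to execute it (requires ffmpeg).

```shell
#!/bin/sh
# Dry run: HLS segmentation with ffmpeg. -hls_time sets the segment
# length in seconds; segment/playlist names are illustrative.
HLS="ffmpeg -i video.mp4 -c:v libx264 -c:a aac -f hls \
  -hls_time 6 -hls_playlist_type vod \
  -hls_segment_filename seg_%03d.ts playlist.m3u8"
echo "$HLS"
```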

5. Thumbnail Generation
YouTube automatically generates thumbnails.

FFmpeg command example:

ffmpeg -i video.mp4 -ss 00:00:05 -vframes 1 thumb.jpg

YouTube usually generates multiple thumbnails:

thumb_1.jpg
thumb_2.jpg
thumb_3.jpg

Then ML chooses the best thumbnail.

6. Audio Processing
Audio is extracted and encoded separately.

Example:

AAC
Opus
Example FFmpeg:

ffmpeg -i video.mp4 -vn -acodec aac audio.m4a

Why separate audio?

Because YouTube streams audio + video separately for efficiency.

7. Content ID & Processing


After transcoding:

YouTube runs AI systems for:

copyright detection
inappropriate content detection
speech recognition
subtitles generation

These operate on processed video outputs.

8. Storage System
Processed videos are stored in Google's distributed storage.

Think:

Original video
720p version
1080p version
4K version
audio streams
segments
thumbnails

Everything is stored separately.


9. CDN Distribution
Videos are distributed through Google Global CDN.

When you play a video:

User Request

Nearest CDN

Video Segments Stream

The player switches quality dynamically.

10. What Happens When You Watch a Video


Example:

You start watching.

Internet speed detected



Player requests 480p segment

Network improves

Switch to 1080p

This is Adaptive Bitrate Streaming.

11. Simplified Architecture

Upload Service

Raw Video Storage

Video Processing Cluster

                  │
    ┌─────────────┼─────────────┐
    ▼             ▼             ▼
  360p          720p          1080p
    │             │             │
    └─────────────┼─────────────┘
                  ▼
     Segmentation (HLS/DASH)

Distributed Storage

CDN

YouTube Player

FFmpeg is mainly used in:

transcoding
thumbnail extraction
audio extraction
segment generation

12. Why FFmpeg Is Perfect for This


Reasons YouTube-like systems use it:

extremely fast
supports all codecs
highly optimized
parallelizable
works well in distributed systems

13. Interesting Scale Fact


YouTube receives roughly:

500+ hours of video uploaded every minute

Which means their system must process millions of FFmpeg jobs per day.

✅ Simple summary

YouTube uses FFmpeg in its pipeline to:

1. Analyze uploaded videos


2. Convert videos into multiple resolutions
3. Extract audio
4. Generate thumbnails
5. Split videos into streaming segments

If you want, I can also show you something very interesting:

How Netflix processes videos (much more advanced than YouTube) — including per-title
encoding and AI compression.

What is Adaptive Bitrate Streaming

Adaptive Bitrate Streaming (ABR) is a video streaming technique where the video quality
automatically changes based on the user's internet speed and device performance to
ensure smooth playback without buffering.

Instead of sending one fixed-quality video, the server provides multiple versions of the
same video, and the player dynamically switches between them.

1. Basic Idea
Imagine a video available in these qualities:

Quality Resolution Bitrate

Low 240p 300 kbps

Medium 480p 800 kbps

High 720p 1500 kbps

HD 1080p 3000 kbps

When you watch the video:


Slow internet → 240p
Medium internet → 480p
Fast internet → 1080p

The switch happens automatically during playback.

2. Why Adaptive Streaming Is Needed


Without ABR:

Video quality fixed at 1080p



Internet slows down

Video buffers continuously

With ABR:

Video playing at 1080p



Internet slows

Player switches to 480p

Video keeps playing smoothly

So ABR prioritizes smooth playback over quality.

3. How It Works Internally

Step 1 — Video is Transcoded into Multiple Bitrates

A single uploaded video is converted into multiple versions.

Example:
original_video.mp4

Transcoding

360p.mp4
720p.mp4
1080p.mp4

Tools like FFmpeg are commonly used for this step.
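In practice this step is a loop over a "bitrate ladder". Below is a minimal dry-run sketch: the ladder values are illustrative (not any platform's real ladder), and each ffmpeg command is printed rather than executed so the script runs without ffmpeg installed — remove the echo to encode for real.

```shell
#!/bin/sh
# Dry-run sketch: one ffmpeg invocation per rendition.
# LADDER entries are "height:video-bitrate"; the values are illustrative.
LADDER="360:800k 720:2500k 1080:4500k"
CMDS=""
for r in $LADDER; do
  h=${r%:*}        # target height, e.g. 720
  rate=${r#*:}     # target video bitrate, e.g. 2500k
  # scale=-2:h keeps the aspect ratio and forces an even width
  CMD="ffmpeg -i original_video.mp4 -vf scale=-2:$h \
    -c:v libx264 -b:v $rate -c:a aac ${h}p.mp4"
  echo "$CMD"
  CMDS="$CMDS $CMD"
done
```

Real pipelines run each iteration as an independent job, which is what makes this step easy to parallelize across machines.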

Step 2 — Video is Split into Small Segments

Each version is divided into small chunks (2–10 seconds).

Example:

1080p:
1080p_seg1.ts
1080p_seg2.ts
1080p_seg3.ts

720p:
720p_seg1.ts
720p_seg2.ts
720p_seg3.ts

This allows switching quality between segments.

Step 3 — A Manifest File Is Created

The manifest tells the player which qualities are available.

Example (simplified):

Video:
360p - 500 kbps
720p - 1500 kbps
1080p - 3000 kbps

Common manifest formats:

HLS ( .m3u8 )
MPEG-DASH ( .mpd )
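For HLS, a heavily simplified master playlist listing the three qualities above might look like the fragment below. Real playlists also carry CODECS strings and other attributes, and each entry points to a per-quality media playlist.

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=500000,RESOLUTION=640x360
360p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1500000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1920x1080
1080p/playlist.m3u8
```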

Step 4 — Player Chooses Quality Dynamically

The video player constantly checks:

internet speed
buffer size
device capability

Then selects the best quality.

Example:

Start video → 480p


Network improves → switch to 1080p
Network drops → switch to 360p

This switching happens seamlessly.

4. Example Flow (YouTube / Netflix)

User presses play



Player downloads manifest file

Player measures internet speed

Downloads first segment (480p)

Network improves

Next segment downloaded at 1080p

Notice:

Segments can have different qualities.

5. Streaming Protocols That Use ABR

HLS (HTTP Live Streaming)

Created by Apple.

Used by:

YouTube
Apple TV
iOS devices

File types:

.m3u8
.ts

MPEG-DASH

Open standard used widely.

Used by:

Netflix
YouTube
Amazon Prime

File types:

.mpd
.m4s
6. Real Example When You Watch YouTube
Suppose you open a YouTube video.

Start watching → 480p


WiFi becomes fast → 1080p
Network fluctuates → 720p

The player automatically switches between streams.

You usually don’t notice it.

7. Advantages of Adaptive Streaming


✔ Prevents buffering
✔ Works on slow internet
✔ Optimizes bandwidth usage
✔ Better user experience
✔ Supports multiple devices

8. Real Life Analogy


Imagine a car with automatic gears 🚗.

Manual streaming:

Always drive in 5th gear


Even if road is steep

Adaptive streaming:

Road steep → shift to 2nd gear


Road smooth → shift to 5th gear

The car adjusts automatically.


✅ Simple definition

Adaptive Bitrate Streaming is a technique where video quality dynamically changes


according to network conditions to ensure smooth playback without buffering.

If you're interested, I can also explain something very useful for system design interviews:

How Netflix's Adaptive Streaming algorithm works
Why Netflix doesn't just create fixed bitrates but uses "Per-Title Encoding"
How ABR decision algorithms work inside video players (very interesting).

What is bitrate

Bitrate refers to the amount of data processed or transmitted per second in audio or
video.
It basically determines how much information is used to represent the media every
second.

Higher bitrate → better quality but larger file size


Lower bitrate → lower quality but smaller file size

1. Simple Definition
Bitrate = number of bits used per second to encode media.

Common units:

Unit Meaning

bps bits per second
kbps kilobits per second
Mbps megabits per second

Example:

Video bitrate = 3000 kbps

Means:

3000 kilobits of data are used every second of video

2. Bitrate Example (Video Quality)


Different video qualities require different bitrates.

Resolution Typical Bitrate

240p 300 kbps

360p 700 kbps

480p 1200 kbps

720p 2500 kbps

1080p 4500 kbps

4K 15000–35000 kbps

Higher bitrate → more detail preserved in the video.

3. Why Bitrate Matters


Bitrate affects three main things:
1️⃣ Video Quality

Higher bitrate keeps more visual details.

Example:

Low bitrate → blurry video


High bitrate → sharp video

2️⃣ File Size

File size depends on bitrate.

Formula:

File Size ≈ Bitrate × Duration

Example:

10-minute video

Bitrate File Size

1000 kbps ~75 MB

5000 kbps ~375 MB

Higher bitrate → larger files.
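The table above can be checked with quick shell arithmetic (decimal units, as codecs use: 8 bits per byte, 1000 kB per MB):

```shell
#!/bin/sh
# File size in MB ≈ bitrate_kbps * duration_s / 8 / 1000
# (kilobits -> kilobytes -> megabytes, decimal units)
DURATION=600                                # 10 minutes in seconds
SIZE_1000=$((1000 * DURATION / 8 / 1000))   # MB at 1000 kbps
SIZE_5000=$((5000 * DURATION / 8 / 1000))   # MB at 5000 kbps
echo "1000 kbps -> ${SIZE_1000} MB"         # 75 MB
echo "5000 kbps -> ${SIZE_5000} MB"         # 375 MB
```

This is video-only; the audio track adds its own bitrate × duration on top.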

3️⃣ Internet Bandwidth

Streaming services must match bitrate with network speed.

Example:
User internet speed = 3 Mbps

Streaming a video with:

Bitrate = 6 Mbps

will cause buffering.

4. Bitrate in Streaming (YouTube Example)


A YouTube video is stored in multiple bitrates.

Example:

1080p → 4500 kbps


720p → 2500 kbps
480p → 1200 kbps
360p → 700 kbps

If your internet speed drops:

Player switches from 1080p → 480p

This is part of Adaptive Bitrate Streaming.

5. Bitrate vs Resolution
Resolution and bitrate are not the same.

Term Meaning

Resolution Number of pixels (720p, 1080p)
Bitrate Amount of data used per second

Example:

Two 1080p videos:

Video A → 8000 kbps (high quality)


Video B → 2000 kbps (compressed)

Both are 1080p, but Video A looks much better.

6. Bitrate in Audio
Bitrate also applies to audio.

Audio Quality Bitrate

Low 64 kbps

Medium 128 kbps

High 256 kbps

Very High 320 kbps

Example:

Spotify:

Normal → 96 kbps
High → 160 kbps
Very High → 320 kbps

Higher bitrate = clearer sound.
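Audio bitrate is set with ffmpeg's -b:a flag. The file names below are illustrative, and the commands are printed as a dry run so the sketch runs without ffmpeg — remove the echoes to execute.

```shell
#!/bin/sh
# Dry run: re-encode the same track at two bitrates from the table.
# Remove the "echo" lines' indirection to run (requires ffmpeg).
LOW="ffmpeg -i song.mp3 -b:a 128k song_128k.mp3"
HIGH="ffmpeg -i song.mp3 -b:a 320k song_320k.mp3"
echo "$LOW"
echo "$HIGH"
```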


7. Constant vs Variable Bitrate

CBR (Constant Bitrate)

Bitrate stays fixed.

Example:

3000 kbps for entire video

Pros:

predictable file size

Cons:

may waste bits on simple scenes, since the rate never drops.

VBR (Variable Bitrate)

Bitrate changes depending on scene complexity.

Example:

Simple scene → 1500 kbps


Action scene → 5000 kbps

Pros:

better quality
smaller files

This is used by YouTube and Netflix.
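With ffmpeg, the two modes map to different x264 flags. A dry-run sketch (the numbers are illustrative; -minrate/-maxrate/-bufsize and -crf are real ffmpeg options — remove the echoes to execute, assuming ffmpeg is installed):

```shell
#!/bin/sh
# Dry run: a CBR-style and a VBR-style x264 encode.
# CBR: pin the rate with minrate/maxrate plus a rate-control buffer.
CBR="ffmpeg -i input.mp4 -c:v libx264 -b:v 3000k \
  -minrate 3000k -maxrate 3000k -bufsize 6000k cbr.mp4"
# VBR: let quality drive the rate with CRF (lower CRF = higher quality).
VBR="ffmpeg -i input.mp4 -c:v libx264 -crf 23 vbr.mp4"
echo "$CBR"
echo "$VBR"
```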

8. Example with FFmpeg


Set video bitrate:
ffmpeg -i input.mp4 -b:v 2000k output.mp4

Meaning:

Video bitrate = 2000 kbps

9. Easy Analogy
Think of bitrate like paint used in a picture 🎨.

Low paint:

Less detail
Blurry image

More paint:

More detail
Sharper image

Bitrate is the amount of data "paint" used to draw each second of video.

✅ One-line summary

Bitrate is the amount of data used per second to encode audio or video, determining its
quality, file size, and streaming bandwidth requirements.

If you want, I can also explain a very common interview confusion:

Bitrate vs Frame Rate vs Resolution vs Codec (this confuses even senior engineers).
