DL Unit1

The document provides an introduction to deep learning, highlighting its differences from machine learning and detailing the evolution of neural architectures. It explains data representations for neural networks, including various tensor types and their applications, as well as basic tensor operations in TensorFlow and NumPy. Additionally, it covers the concept of data batching and the characteristics of time series and image data in deep learning contexts.

Unit-I

Introduction to Deep Learning, Deep Learning Vs Machine Learning, Neural Networks, Data
representations for neural networks, Scalars (0D tensors), Vectors (1D tensors), Matrices (2D
tensors), 3D tensors and higher-dimensional tensors, Key attributes, Manipulating tensors in
Numpy, The notion of data batches, Real-world examples of data tensors, Vector data, Time
series data or sequence data, Image data, Video data

Introduction to Deep Learning


Deep learning is transforming the way machines understand, learn, and interact with complex data. By mimicking the neural networks of the human brain, it enables computers to autonomously uncover patterns and make informed decisions from vast amounts of unstructured data.
How Does Deep Learning Work?
A neural network consists of layers of interconnected nodes, or neurons, that collaborate to process input data. In a fully connected deep neural network, data flows through multiple layers in which each neuron applies a nonlinear transformation, allowing the model to learn intricate representations of the data.
In a deep neural network, the input layer receives the data, which then passes through hidden layers that transform it using nonlinear functions. The final output layer generates the model’s prediction.
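To make this concrete, here is a minimal NumPy sketch of a single forward pass through a small fully connected network. The layer sizes and random weights are purely illustrative, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 4 input features, 8 hidden neurons, 1 output
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

def forward(x):
    # Hidden layer: nonlinear (ReLU) transformation of the input
    h = np.maximum(0, x @ W1 + b1)
    # Output layer: sigmoid squashes the result into a prediction in (0, 1)
    return 1 / (1 + np.exp(-(h @ W2 + b2)))

x = rng.normal(size=(2, 4))  # a batch of 2 samples
y = forward(x)
print(y.shape)  # (2, 1): one prediction per sample
```

Each layer is just a matrix multiplication followed by a nonlinearity; stacking more such layers is what makes the network "deep".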
Difference between Machine Learning and Deep Learning
Machine learning and deep learning are both subsets of artificial intelligence, with several key differences:

• Learning approach: machine learning applies statistical algorithms to learn hidden patterns and relationships in the dataset; deep learning uses artificial neural network architectures for the same purpose.
• Data requirements: machine learning can work on smaller datasets; deep learning requires a much larger volume of data.
• Suitable tasks: machine learning is better for simpler, low-label tasks; deep learning is better for complex tasks like image processing and natural language processing.
• Training time: machine learning models take less time to train; deep learning models take more.
• Feature engineering: in machine learning, a model is created from relevant features manually extracted from images to detect an object; in deep learning, relevant features are extracted automatically in an end-to-end learning process.
• Interpretability: machine learning models are less complex and their results are easy to interpret; deep learning models are more complex, work like a black box, and their results are not easy to interpret.
• Hardware: machine learning can work on a CPU, requiring less computing power than deep learning; deep learning requires a high-performance computer with a GPU.

Evolution of Neural Architectures


The journey of deep learning began with the perceptron, a single-layer neural network introduced in the 1950s. While innovative, perceptrons could only solve linearly separable problems and therefore failed at more complex tasks such as the XOR problem.
This limitation led to the development of Multi-Layer Perceptrons (MLPs), which introduced hidden layers and non-linear activation functions. MLPs trained using backpropagation could model complex, non-linear relationships, marking a significant leap in neural network capabilities. This evolution from perceptrons to MLPs laid the groundwork for advanced architectures like CNNs and RNNs, showcasing the power of layered structures in solving real-world problems.
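As a small illustration of why the hidden layer matters, a two-layer perceptron with hand-set weights (not trained; the thresholds below are chosen for illustration) can compute XOR, which no single-layer perceptron can:

```python
def step(z):
    # Perceptron activation: fires (1) when the input exceeds the threshold
    return 1 if z > 0 else 0

def xor_mlp(a, b):
    h1 = step(a + b - 0.5)      # hidden unit 1: acts as OR
    h2 = step(a + b - 1.5)      # hidden unit 2: acts as AND
    return step(h1 - h2 - 0.5)  # output: OR but not AND, i.e. XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_mlp(a, b))
```

The hidden units carve the input space into regions that the output unit can then separate linearly, which is exactly what a single layer cannot do.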

Data representations for neural networks

In neural networks, data representations refer to how data is formatted, structured, and fed into the model for training and inference. Choosing the right representation is crucial, as neural networks operate on numerical data (usually tensors).
1. Scalars (0D Tensors)
• Description: A single numerical value.
• Example: 5, 0.75
• Use Case: Loss value, learning rate, or a constant input.

2. Vectors (1D Tensors)
• Description: An ordered list of numbers.
• Shape: (n,)
• Example: [3.2, 1.5, 0.9]
• Use Case: Feature vector for one data point.

3. Matrices (2D Tensors)
• Description: 2D array of numbers (rows × columns).
• Shape: (m, n)
• Example: A batch of vectors.
• Use Case:
  o Input features for a batch of samples
  o Weights in a fully connected (dense) layer

4. 3D Tensors
• Description: 3-dimensional arrays.
• Shape: (batch_size, time_steps, features) or (height, width, channels)
• Example:
  o Time series data: (batch, time_steps, features)
  o Image: (channels, height, width) or (height, width, channels)
• Use Case:
  o Sequences (like text or time series)
  o Colored images (RGB)

5. 4D and Higher-Dimensional Tensors
• 4D Example: (batch_size, channels, height, width)
• 5D Example: (batch_size, channels, depth, height, width), used for 3D images like CT scans.
• Use Case:
  o CNNs for image processing
  o Video (as sequences of frames)
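The tensor ranks listed above can be checked directly in NumPy through the ndim and shape attributes:

```python
import numpy as np

scalar = np.array(5)                      # 0D tensor
vector = np.array([3.2, 1.5, 0.9])        # 1D tensor, shape (3,)
matrix = np.array([[1, 2], [3, 4]])       # 2D tensor, shape (2, 2)
cube = np.zeros((2, 3, 4))                # 3D tensor
image_batch = np.zeros((64, 32, 32, 3))   # 4D tensor: a batch of RGB images

for t in (scalar, vector, matrix, cube, image_batch):
    print(t.ndim, t.shape)
```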

Introduction to Tensors with TensorFlow

A tensor is a multi-dimensional array used to store data in machine learning and deep learning frameworks such as TensorFlow. Tensors are the fundamental data structure in TensorFlow, and they represent the flow of data through a computation graph. Tensors generalize scalars, vectors, and matrices to higher dimensions.
Types of Tensors
Tensors in TensorFlow can take various forms depending on their number of dimensions.
• Scalar (0D tensor): A single number, such as 5 or -3.14.
• Vector (1D tensor): A one-dimensional array, such as [1, 2, 3, 4].
• Matrix (2D tensor): A two-dimensional array, like a table with rows and columns: [[1, 2], [3, 4]].
• 3D Tensor: A three-dimensional array, like a stack of matrices: [[[1, 2], [3, 4]], [[5, 6], [7, 8]]].
• Higher-dimensional tensors: Tensors with more than three dimensions are often used to represent more complex data, such as batches of color images (which might be represented as a 4D tensor with shape [batch_size, height, width, channels]).
How to Represent Tensors in TensorFlow?
The TensorFlow framework, designed for high-performance numerical computation, operates primarily on tensors. When you use TensorFlow, you define your model, train it, and perform operations using tensors.
A tensor in TensorFlow is represented as an object that has:
• Shape: The dimensions of the tensor (e.g., [2, 3] for a matrix with 2 rows and 3 columns).
• Rank: The number of dimensions of the tensor (e.g., a scalar has rank 0, a vector has rank 1, a matrix has rank 2, etc.).
• Data type: The type of the elements in the tensor, such as float32, int32, or string.
• Device: The device on which the tensor resides (e.g., CPU, GPU).
TensorFlow provides a variety of operations that can be applied to tensors, including mathematical operations, transformations, and reshaping.
Basic Tensor Operations in TensorFlow
TensorFlow provides a large set of tensor operations, allowing for efficient manipulation of data. Below are some of the most commonly used tensor operations in TensorFlow:
1. Creating Tensors
You can create tensors using TensorFlow’s tf.constant() or its helper functions, such as tf.zeros() and tf.ones():
import tensorflow as tf

# Scalar (0D tensor)
scalar_tensor = tf.constant(5)

# Vector (1D tensor)
vector_tensor = tf.constant([1, 2, 3, 4])

# Matrix (2D tensor)
matrix_tensor = tf.constant([[1, 2], [3, 4]])

# 3D Tensor
tensor_3d = tf.constant([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# Tensor of zeros (2D tensor)
zeros_tensor = tf.zeros([3, 3])

# Tensor of ones (2D tensor)
ones_tensor = tf.ones([2, 2])

2. Tensor Operations
TensorFlow supports various operations on tensors, such as element-wise operations, matrix multiplication, reshaping, and more.
import tensorflow as tf

# Define a matrix tensor
matrix_tensor = tf.constant([[1, 2], [3, 4]])

# Define a ones tensor with the same shape
ones_tensor = tf.ones_like(matrix_tensor)

# Element-wise addition
tensor_add = tf.add(matrix_tensor, ones_tensor)

# Matrix multiplication (dot product)
matrix_mult = tf.matmul(matrix_tensor, matrix_tensor)

# Reshape a tensor: change the shape of matrix_tensor to [4, 1]
reshaped_tensor = tf.reshape(matrix_tensor, [4, 1])

# Transpose a tensor: flip rows and columns of matrix_tensor
transpose_tensor = tf.transpose(matrix_tensor)

3. Accessing Elements in a Tensor
You can access specific elements within a tensor using indices. Similar to how you access elements in Python lists or NumPy arrays, TensorFlow provides slicing and indexing operations.
import tensorflow as tf

# Define a vector tensor
vector_tensor = tf.constant([1, 2, 3, 4])

# Access the first element of the vector
first_element = vector_tensor[0]

# Define a matrix tensor
matrix_tensor = tf.constant([[1, 2], [3, 4]])

# Slice the tensor (first two rows of the matrix)
matrix_slice = matrix_tensor[:2, :]

Higher-Dimensional Tensors: Key Attributes in Neural Networks


Higher-dimensional tensors (4D and above) are widely used in advanced deep learning
applications, especially in computer vision, video processing, and spatiotemporal modeling.
Here's a breakdown of their key attributes:
Key Attributes of Higher-Dimensional Tensors
• Rank: The number of dimensions or axes in the tensor; e.g., a 4D tensor has rank 4.
• Shape: A tuple that defines the size along each dimension, e.g., (batch_size, channels, height, width) for a 4D image tensor.
• Size: The total number of elements in the tensor, computed as the product of all shape dimensions.
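These three attributes map directly onto NumPy's ndim, shape, and size:

```python
import numpy as np

t = np.zeros((32, 3, 64, 64))  # a 4D image-batch tensor

print(t.ndim)   # rank: 4
print(t.shape)  # shape: (32, 3, 64, 64)
print(t.size)   # size: 393216 = 32 * 3 * 64 * 64
```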

Example: 4D Tensor in a Convolutional Neural Network (CNN)

# Shape: (batch_size, channels, height, width)
# Assumption: PyTorch, which this channels-first layout suggests
import torch
image_tensor = torch.randn(32, 3, 64, 64)

Dimension Meaning

32 Batch size

3 Channels (RGB)

64 Height of the image

64 Width of the image

Example: 5D Tensor for Video Input

# Shape: (batch_size, frames, channels, height, width)
# Assumption: PyTorch-style tensor creation
import torch
video_tensor = torch.randn(8, 10, 3, 64, 64)

Dimension Meaning
8 Batch size
10 Temporal dimension (frames)
3 Channels (e.g., RGB)
64 Height
64 Width
Common Tensor Manipulation Operations in NumPy

1. Creating Tensors
import numpy as np

# 1D Tensor (Vector)
a = np.array([1, 2, 3])

# 2D Tensor (Matrix)
b = np.array([[1, 2], [3, 4]])

# 3D Tensor
c = np.random.rand(2, 3, 4)  # shape: (2, 3, 4)

2. Reshaping Tensors
a = np.arange(12)    # shape: (12,)
b = a.reshape(3, 4)  # shape: (3, 4)

3. Transposing Tensors
a = np.array([[1, 2], [3, 4]])
b = a.T  # shape: (2, 2), transpose of the matrix

# For higher dimensions
c = np.random.rand(2, 3, 4)
d = np.transpose(c, (1, 0, 2))  # swaps the first two axes; shape: (3, 2, 4)

4. Expanding/Reducing Dimensions
a = np.array([1, 2, 3])

# Expand dimensions
a_exp = np.expand_dims(a, axis=0)   # shape: (1, 3)
a_exp2 = np.expand_dims(a, axis=1)  # shape: (3, 1)

# Remove singleton dimensions
b = np.squeeze(np.array([[[5]]]))  # shape: (), scalar

5. Splitting Tensors
a = np.arange(12).reshape(3, 4)

# Split into 2 parts along axis 1
split = np.split(a, 2, axis=1)

Operation Function Example

Reshape reshape() a.reshape(2, 3)

Transpose transpose() or .T a.T

Expand Dimensions np.expand_dims() np.expand_dims(a, axis=0)

Squeeze np.squeeze() np.squeeze(a)

Concatenate np.concatenate() np.concatenate((a, b), axis=0)

Stack np.stack() np.stack((a, b), axis=0)

Split np.split() np.split(a, 2, axis=1)

Sum, Mean, etc. np.sum(), np.mean() np.sum(a, axis=1)
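A quick sketch of the difference between concatenating and stacking, since the two are easy to confuse: concatenate joins along an existing axis, while stack inserts a new one.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# concatenate joins the arrays along an existing axis
print(np.concatenate((a, b), axis=0).shape)  # (4, 2)

# stack inserts a new axis before joining
print(np.stack((a, b), axis=0).shape)        # (2, 2, 2)
```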

The Notion of Data Batches in Neural Networks


Data batching is a core concept in training neural networks. It refers to dividing the entire
dataset into smaller groups (batches) instead of feeding the whole dataset or one sample at a
time to the model.
The three batching regimes compare as follows:

• Batch: the entire dataset at once (rarely used). Pros: accurate gradient. Cons: high memory usage.
• Mini-batch: a small subset (e.g., 32, 64, or 128 samples); the common choice in practice. Pros: efficient and stable training.
• Stochastic: one sample at a time. Pros: frequent updates. Cons: noisy gradient, unstable.

Tensor Shape in Batches


Data Type Tensor Shape (Batch) Example

Tabular (batch_size, num_features) (64, 10)

Grayscale Image (batch_size, height, width, 1) (64, 28, 28, 1)

Color Image (RGB) (batch_size, height, width, 3) (64, 32, 32, 3)

Text (sequences) (batch_size, sequence_length) (64, 100)
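Mini-batching can be sketched with plain NumPy slicing; the dataset size and batch size below are illustrative:

```python
import numpy as np

X = np.arange(100 * 10).reshape(100, 10)  # 100 samples, 10 features each
batch_size = 32

# Slice the dataset into consecutive mini-batches
batches = [X[i:i + batch_size] for i in range(0, len(X), batch_size)]

# 100 samples with batch_size 32 -> batches of 32, 32, 32, and 4
print([b.shape for b in batches])
```

In practice, the samples are usually shuffled before slicing at the start of each epoch.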

Time Series Data


Time series data consists of sequential observations recorded over time at regular intervals. It captures temporal patterns and dependencies, making it fundamentally ordered and time-dependent.

Key Characteristics of Time Series Data

• Temporal Order: The order of data points is crucial (unlike tabular data).
• Time Dependency: Current values often depend on past values (e.g., temperature trends).
• Fixed Intervals: Typically collected at consistent intervals (e.g., daily, hourly, per second).
• Univariate / Multivariate: Single vs. multiple variables recorded over time.
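A common way to build the 3D (samples, time_steps, features) tensors used below is to slide a fixed-length window over a raw series. A minimal sketch, with the series and window length chosen for illustration:

```python
import numpy as np

series = np.arange(10, dtype=float)  # a univariate series of length 10
window = 4                           # time_steps per sample

# Each sample is `window` consecutive observations
X = np.stack([series[i:i + window] for i in range(len(series) - window + 1)])

# Add a trailing features axis: (samples, time_steps, features)
X = X[..., np.newaxis]
print(X.shape)  # (7, 4, 1)
```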
Examples of Time Series Data

• Weather: temperature, humidity, wind speed. Shape (128, 30, 3): 128 samples, 30 days, 3 features.
• Finance: stock price, volume, volatility. Shape (64, 50, 4): 64 sequences of 50 days, 4 features.
• Healthcare: heart rate, blood pressure (ECG, ICU data). Shape (256, 100, 2): 256 patients, 100 time points.
• IoT Devices: sensor readings such as pressure, flow, temperature. Shape (32, 60, 5): 32 devices, 60 steps, 5 sensors.
• E-commerce: daily sales per product. Shape (1000, 365): 1000 products, 1 year.
• Audio: raw waveform (speech or music). Shape (batch_size, time_steps): audio frames.
• Motion Data: accelerometer, gyroscope readings. Shape (64, 200, 6): motion sensors over time.

Image Data
Image data is one of the most commonly used types in deep learning, especially for tasks like classification, object detection, and segmentation. Images are represented as multi-dimensional tensors, depending on color channels and dataset structure.

An image is a grid of pixels, each having:
• A position (row, column)
• A value (intensity or RGB values)
Image Data Tensor Representations
Image Type Tensor Shape (Single Image) Description

Grayscale Image (height, width) or (height, width, 1) One intensity value per pixel

RGB Image (height, width, 3) 3 channels: Red, Green, Blue

RGBA Image (height, width, 4) RGB + Alpha (transparency)

Real-World Image Data Examples

Dataset Description Shape Example (batch)

MNIST 28×28 grayscale digits (128, 28, 28, 1)

CIFAR-10 32×32 RGB images in 10 classes (64, 32, 32, 3)

ImageNet Large dataset of 224×224 RGB images (16, 224, 224, 3)

Medical Images CT, X-ray (grayscale or volumetric) (8, 256, 256, 1) or (8, 64, 256, 256)
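For instance, an MNIST-style grayscale batch needs an explicit channel axis before it matches the (batch, height, width, 1) shape above. A sketch using zero-filled placeholder images:

```python
import numpy as np

# 4 synthetic 28x28 grayscale "images" (all zeros, as placeholders)
gray = np.zeros((4, 28, 28), dtype=np.float32)

# Add the channel axis expected by most frameworks
gray = gray[..., np.newaxis]
print(gray.shape)  # (4, 28, 28, 1)

# A CIFAR-10-style RGB batch already has 3 channels
rgb = np.zeros((4, 32, 32, 3), dtype=np.float32)
print(rgb.shape)  # (4, 32, 32, 3)
```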

Video Data
Video data is a sequence of images (frames) over time. It’s inherently spatiotemporal,
meaning it has both spatial (height, width) and temporal (time/frame) dimensions. Videos
are represented as high-dimensional tensors and are used in advanced tasks like activity
recognition, video classification, and object tracking.
Tensor Representation of Video Data

Tensor Component Meaning

batch_size Number of videos in a batch

frames Number of frames per video

height Height of each frame

width Width of each frame

channels 1 (grayscale) or 3 (RGB)


Common Tensor Shape
Framework Tensor Shape

TensorFlow / Keras (batch_size, frames, height, width, channels)

PyTorch (batch_size, channels, frames, height, width)
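Converting between the two layouts is a single axis permutation; a NumPy sketch with a zero-filled placeholder batch:

```python
import numpy as np

# TensorFlow/Keras layout: (batch, frames, height, width, channels)
video_tf = np.zeros((8, 10, 64, 64, 3))

# Move channels next to batch for the PyTorch layout:
# (batch, channels, frames, height, width)
video_torch = np.transpose(video_tf, (0, 4, 1, 2, 3))
print(video_torch.shape)  # (8, 3, 10, 64, 64)
```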


Real-World Applications of Video Data

• Video Classification: predict the action or scene type. Models: 3D CNN, RNN, Transformers.
• Action Recognition: recognize human motion (e.g., running, jumping). Models: 3D CNN, LSTM.
• Object Tracking: follow objects across video frames. Models: YOLOv5 + SORT/DeepSORT.
• Surveillance: detect anomalies, motion, or events. Models: CNN + RNNs or Transformers.
• Lip Reading: decode spoken words from lip movements. Models: CNN + CTC or attention.
• Video Captioning: describe what’s happening in a video. Models: CNN + LSTM/Transformer.
• Reinforcement Learning: environment state via a video feed. Models: CNN as visual encoder.
