Predictive Analysis in Data Science

The document is a lecture outline for a course on Statistics for Data Science, focusing on predictive analysis. It covers key concepts such as probability distributions, including discrete and continuous distributions, with examples like the normal distribution. The lecture is aimed at Master 1 students specializing in Data Science and Artificial Intelligence at Constantine 2 University.

Constantine 2 University

Statistics for Data Science

– Lecture –
Chapter 3: Predictive Analysis

Mohamed SANDELI
Faculty of New Technologies of Information and Communication
(NTIC)

Concerned Students
Faculty: New Technologies
Department: IFA
Year: Master 1
Specialization: Data Science and Artificial Intelligence (SDIA)

2024/2025, Semester 1
Constantine 2 university
Université Constantine 2 © Mohamed SANDELI 2
Course Outline
3.1 Introduction
3.2 Probability
3.3 Conclusion

3.4 Probability Distribution
Definition:
A random variable in statistics is a function that assigns a numerical value to each possible outcome of a random experiment. It quantifies the outcomes of random phenomena and can be either discrete or continuous, depending on the nature of the values it can take.
• A discrete random variable takes on specific, countable values (e.g., the number of heads in coin tosses).
• A continuous random variable can take any value within a given range or interval (e.g., the height of individuals).

The probability distribution of a random variable describes how probabilities are assigned to its possible values, allowing for analysis and predictions based on these outcomes.
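As an illustration of the definition, the sketch below treats "number of heads" as a function on the outcomes of a coin-tossing experiment and tabulates its probability distribution. It assumes fair coins; the function name heads_distribution is hypothetical and chosen only for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_tosses):
    """Return {k: P(X = k)}, where X counts heads in n_tosses fair coin tosses."""
    # Sample space: every sequence of H/T of the given length, all equally likely.
    outcomes = list(product("HT", repeat=n_tosses))
    # The random variable X maps each outcome to a number (its count of heads).
    counts = Counter(outcome.count("H") for outcome in outcomes)
    total = len(outcomes)
    return {k: Fraction(c, total) for k, c in sorted(counts.items())}


distribution = heads_distribution(3)
for k, p in distribution.items():
    print(f"P(X = {k}) = {p}")
```

For three tosses the eight outcomes collapse onto the four values 0, 1, 2 and 3, and the probabilities sum to 1, as any probability distribution must.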
Discrete Probability Distributions
Bernoulli Distribution
Binomial Distribution
Poisson Distribution
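
The three distributions listed above each have a standard probability mass function. The following is a minimal sketch of those textbook formulas, not taken from the slides; the function names and the parameter names p, n and lam are illustrative choices.

```python
import math


def bernoulli_pmf(k, p):
    """P(X = k) for one trial with success probability p (k is 0 or 1)."""
    if k == 1:
        return p
    if k == 0:
        return 1 - p
    return 0.0


def binomial_pmf(k, n, p):
    """P(X = k) successes in n independent trials, each with success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) events when events occur at an average rate lam per interval."""
    return math.exp(-lam) * lam**k / math.factorial(k)


print(bernoulli_pmf(1, 0.3))    # probability of a success in one trial
print(binomial_pmf(2, 3, 0.5))  # probability of 2 heads in 3 fair tosses
print(poisson_pmf(0, 2.0))      # probability of no event when the mean is 2
```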

Continuous Probability Distributions
Normal Distribution
Student's t-Distribution
Exponential Distribution
Uniform Distribution
Gamma Distribution
Beta Distribution
Log-Normal Distribution
Chi-Squared Distribution

Continuous distributions apply to scenarios where the set of possible outcomes is uncountable, typically intervals on the real number line. They are used for random variables that can take any value within a given range.
Examples:
• Normal Distribution: Describes data that clusters around a mean, forming a bell-shaped curve.
• Exponential Distribution: Models the time between events in a Poisson process.
• Uniform Distribution: The density is constant over a given interval, so sub-intervals of equal length are equally likely.
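
To get a feel for these three examples, the sketch below draws samples from each with Python's standard random module and compares the sample means with the theoretical ones. The parameter values (mean 10 and standard deviation 2 for the normal, rate 0.5 for the exponential, interval [0, 1] for the uniform) and the seed are arbitrary assumptions made for the illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so that repeated runs give the same samples
n = 100_000

# Normal: values cluster around the mean 10 with standard deviation 2.
normal_sample = [rng.gauss(10.0, 2.0) for _ in range(n)]
# Exponential: waiting times with rate 0.5, so the theoretical mean is 1 / 0.5 = 2.
exponential_sample = [rng.expovariate(0.5) for _ in range(n)]
# Uniform: constant density on [0, 1], so the theoretical mean is 0.5.
uniform_sample = [rng.uniform(0.0, 1.0) for _ in range(n)]

normal_mean = statistics.fmean(normal_sample)
exponential_mean = statistics.fmean(exponential_sample)
uniform_mean = statistics.fmean(uniform_sample)

print(f"normal mean      ~ {normal_mean:.3f} (theory 10)")
print(f"exponential mean ~ {exponential_mean:.3f} (theory 2)")
print(f"uniform mean     ~ {uniform_mean:.3f} (theory 0.5)")
```

With a sample this large, the sample means should land close to the theoretical values, although the exact digits depend on the seed.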

Normal Distribution:
The normal distribution, also known as the Gaussian distribution, is a continuous
probability distribution characterized by its bell-shaped curve. It is widely used in
statistics due to its natural occurrence in many real-world phenomena.
Key Characteristics:
• Symmetrical: The distribution is perfectly symmetrical about its mean.
• Mean, Median, and Mode: All are equal and located at the center of the distribution.
• Bell-shaped Curve: The tails of the curve approach the horizontal axis but never touch it.
• Defined by Two Parameters:
  - Mean (μ): Determines the center of the distribution.
  - Standard Deviation (σ): Determines the spread or width of the distribution.
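
The characteristics above can be checked numerically. The sketch below writes out the standard normal density formula, f(x) = exp(-(x - μ)² / (2σ²)) / (σ √(2π)), which is the textbook form and is not quoted from the slides, and compares it with statistics.NormalDist from the Python standard library. The helper name normal_pdf and the choice μ = 0, σ = 1 are assumptions for the example.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


mu, sigma = 0.0, 1.0
dist = NormalDist(mu, sigma)

# Symmetry: the density is the same at equal distances on either side of the mean.
left = normal_pdf(mu - 1.3, mu, sigma)
right = normal_pdf(mu + 1.3, mu, sigma)

# Median equals mean: half of the probability lies below mu.
median = dist.inv_cdf(0.5)

# Mode equals mean: the density is highest at mu.
peak = normal_pdf(mu, mu, sigma)

# Tails: far from the mean the density is tiny but still positive.
tail = normal_pdf(mu + 10 * sigma, mu, sigma)

print(f"f(mu - 1.3) = {left:.6f}, f(mu + 1.3) = {right:.6f}")
print(f"median = {median}, peak density = {peak:.6f}, tail density = {tail:.3e}")
```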

Example 1:
The heights of adult men are normally distributed with the following parameters:

Mean height (μ): 177.8 cm

Standard deviation (σ): 7.62 cm

Questions:
• Find the probability that a man is between 170.18 cm and 185.42 cm tall.

• Find the probability that a man is exactly 170.18 cm tall.

• Find the probability that a man is taller than 185.42 cm.

Solution:
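One way to work through the three questions of Example 1 is sketched below. It uses statistics.NormalDist from the Python standard library, which is not necessarily the method used in the lecture. Note that 170.18 cm and 185.42 cm are exactly one standard deviation below and above the mean, so the standardized values z = (x - μ) / σ are -1 and +1.

```python
from statistics import NormalDist

# Parameters given in Example 1.
heights = NormalDist(mu=177.8, sigma=7.62)

# Question 1: P(170.18 < X < 185.42) is a difference of two CDF values.
p_between = heights.cdf(185.42) - heights.cdf(170.18)

# Question 2: for a continuous variable, any single exact value has probability 0.
p_exact = 0.0

# Question 3: P(X > 185.42) is the complement of the CDF.
p_taller = 1 - heights.cdf(185.42)

print(f"P(170.18 < X < 185.42) = {p_between:.4f}")  # about 0.6827
print(f"P(X = 170.18)          = {p_exact:.4f}")
print(f"P(X > 185.42)          = {p_taller:.4f}")   # about 0.1587
```

The first result is the familiar "about 68% within one standard deviation of the mean", and the third is half of the remaining probability, by symmetry.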

Example 2:
Suppose the heights of adult men are normally distributed with:

Mean (μ): 177.8 cm

Standard deviation (σ): 7.62 cm

Questions:

• Find the height x such that P(X > x) = 0.05.

• Find the height x such that P(X < x) = 0.05.

• Find the range of heights x < X < y that contains 90% of the population.
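
These questions ask for quantiles, that is, they run the CDF in reverse. A minimal sketch using the inverse CDF of statistics.NormalDist follows. For the third question it assumes that the central interval, symmetric about the mean, is intended, since infinitely many intervals contain 90% of the population.

```python
from statistics import NormalDist

# Parameters given in Example 2.
heights = NormalDist(mu=177.8, sigma=7.62)

# Question 1: P(X > x) = 0.05 is the same as P(X <= x) = 0.95.
x_upper = heights.inv_cdf(0.95)

# Question 2: P(X < x) = 0.05 is the 5th percentile.
x_lower = heights.inv_cdf(0.05)

# Question 3 (assumed central interval): 5% is cut off in each tail.
central_90 = (x_lower, x_upper)

print(f"P(X > x) = 0.05 at x = {x_upper:.2f} cm")  # about 190.33
print(f"P(X < x) = 0.05 at x = {x_lower:.2f} cm")  # about 165.27
print(f"Central 90% range: {central_90[0]:.2f} cm to {central_90[1]:.2f} cm")
```

Both bounds sit about 1.645 standard deviations from the mean, which is why the same two numbers answer all three questions.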

To be continued
