Grade 10 Statistics Study Notes
What do we do?
- Creating real maths content
- Guiding your decisions in maths education. Let us help you make a decision.
Note to Students:
These materials are being shared with you to help simplify your workload, clarify key concepts,
and support deeper understanding. They are intended solely for your personal use as part of this
course. Please do not distribute, reproduce, or share these notes outside this group without my
explicit written permission. Unauthorized sharing may violate intellectual property laws and
academic policies.
While every effort has been made to ensure accuracy, I cannot guarantee that this section is
entirely free of errors. This book is designed to complement both the prerecorded and live
lessons. If you happen to spot any mistakes, please feel free to reach out using the contact
details provided below.
If anything in these notes is unclear or you'd like further guidance, feel free to reach out directly.
I'm here to support your learning journey. You can contact me via email at
fptmathematics@[Link], or message us directly. Please do not share this contact information
outside of this group.
Term 2
Topic 1: Euclidean Geometry
Topic 2: Analytical Geometry
Topic 3: Functions
Term 3
Topic 1: Trigonometry
Topic 2: Statistics
Topic 3: Probability
Topic 4: Finance
If data is so important then we should define and classify it. We have two types of data that we can
collect.
Quantitative data: describes a subject and is expressed as a number that can be analyzed (it can be
quantified). There are 2 types of quantitative data: discrete (integers) and continuous (real
values).
Results:
Cat: 4
Dog: 5
Turtle: 3
Snake: 1
Hamster: 2
Descriptive statistics:
For example, if you survey a group of students about their test scores, descriptive statistics can
help summarize the average score, the most common score, and how widely scores vary.
The most common descriptive statistics focus on determining the “average” of the data.
Mean: the sum of the data divided by the number of data values in the group, denoted by x̄ (read as "x bar").
Median: the middle-most value when all data are arranged in ascending order.
Mode: the most common value in the set. Data sets can have many modes or no mode.
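To see the three averages side by side, here is a small sketch using Python's built-in statistics module. The list of scores is a made-up example (it is not taken from these notes), and the variable names are only for illustration.

```python
from statistics import mean, median, multimode

# Hypothetical test scores, invented for this illustration.
scores = [50, 60, 60, 70, 80]

average = mean(scores)     # sum of the values divided by how many there are
middle = median(scores)    # middle value once the data is in ascending order
modes = multimode(scores)  # list of every value that ties for most frequent

print("Mean:", average)
print("Median:", middle)
print("Mode(s):", modes)
```

multimode returns a list, so it also copes with data sets that have more than one mode.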
[Diagram: the statistical process. 1. Collect, 2. Organise, 3. Summarize, 4. Represent]
1. Frequency Distribution (Organize and Represent) – This shows how often each value
appears in the dataset, often represented in tables or graphs. Grade 11
2. Measures of Central Tendency (Summarize)– These describe the center of a data set:
- Mean (average)
- Median (middle value)
- Mode (most frequently occurring value)
3. Measures of Variability (Dispersion) (Summarize)– These describe how spread out the data
is:
- Range (difference between highest and lowest values)
- The five number summary (Quartiles)
[Diagram: Statistics branches into frequency distribution, measures of central tendency, measures of dispersion, and correlation & regression analysis (Grade 12)]
It’s like zooming out to see the shape of the data — where it peaks, where it’s flat, and how it flows
- It tells us where the center of the data lies — like the average test score in a class.
- Common measures: Mean, Median, Mode.
- Helps summarize a large data set with a single representative value.
Purpose: To describe how spread out the data is around the center.
Two data sets can have the same average but very different spreads — variability tells that story.
Data set A
Mean: 5
Range: 6 − 4 = 2
Standard Deviation: Low (values are close to the mean)
Data set B
Scores: 1, 3, 5, 7, 9
Mean: 5
Range: 9 − 1 = 8
Standard Deviation: High (values are spread out)
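A quick way to check the "same average, different spread" idea is sketched below. The scores 1, 3, 5, 7, 9 come from the comparison above. The scores of the tightly clustered set are not listed in these notes, so 4, 5, 6 is an assumed set chosen only because it matches the stated mean of 5 and range of 6 − 4 = 2.

```python
from statistics import mean

def data_range(values):
    """Range = maximum value minus minimum value."""
    return max(values) - min(values)

clustered = [4, 5, 6]        # assumed scores (mean 5, range 2)
spread_out = [1, 3, 5, 7, 9]  # scores from the comparison above

for name, data in [("clustered", clustered), ("spread out", spread_out)]:
    print(name, "mean:", mean(data), "range:", data_range(data))
```

Both sets report a mean of 5, but the ranges are 2 and 8, which is exactly the story that variability tells.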
Scores:
55, 60, 60, 62, 65, 65, 68, 70, 72, 75, 75, 78, 80, 82, 85, 88, 90, 92, 95, 98
- Mean (Average):
1515 ÷ 20 = 75,75
- Range:
98 - 55 = 43
- Q2 (Median):
75
- Q1 (Lower Quartile):
Median of the lower half: 65
- Q3 (Upper Quartile):
Median of the upper half: 86,5
These show how spread out the scores are — from the lowest to the highest, and how tightly they
cluster around the average.
- The IQR and standard deviation show how consistent or varied the performance was. A lower
standard deviation shows that the data is closer to the mean, which could indicate that the teacher
teaches in a way that resonates with every student.
[Figure: two plots of Marks (50 to 100) against Students, with the mean score marked]
We can organise data by using tallies, frequency tables, stem-and-leaf diagrams, histograms and
frequency polygons. We can also group data into intervals.
GROUPING DATA
A practical approach to organizing continuous quantitative data is to divide the entire range into
distinct intervals or classes. Since continuous values — such as height or weight — can take on
infinitely many possibilities and often don’t repeat exactly, it’s more meaningful to group them into
intervals that capture where each value belongs, rather than listing them individually.
This process involves defining non-overlapping intervals that span the full range of the data set.
Once these intervals are set, we count how many data points fall within each one. This transforms a
continuous data set into a more structured form, making patterns easier to interpret.
And importantly, this method isn’t limited to continuous data. When dealing with a large volume of
discrete data, such as survey responses with dozens or hundreds of unique values, grouping is still
essential. For example, if 150 people each gave a distinct response or score, trying to display every
single one as a separate bar in a histogram would be impractical.
One of the most effective ways to visualize this grouped data is with a histogram. In a
histogram:
Heights (cm):
152, 158, 160, 161, 162, 164, 165, 165, 166, 167,
168, 169, 170, 170, 171, 172, 173, 174, 175, 176,
177, 178, 179, 180, 181, 182, 183, 185, 187, 190
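The counting step described above can be sketched in a few lines of Python. The class width of 10 cm and the starting boundary of 150 cm are assumptions made for this illustration, and group_into_classes is a hypothetical helper name. Each class is taken as lower <= x < lower + width so that the intervals do not overlap.

```python
heights = [152, 158, 160, 161, 162, 164, 165, 165, 166, 167,
           168, 169, 170, 170, 171, 172, 173, 174, 175, 176,
           177, 178, 179, 180, 181, 182, 183, 185, 187, 190]

def group_into_classes(values, start, width):
    """Count how many values fall in each interval lower <= x < lower + width."""
    counts = {}
    for value in values:
        # Lower boundary of the interval that this value belongs to.
        lower = start + ((value - start) // width) * width
        counts[lower] = counts.get(lower, 0) + 1
    return counts

# Assumed classes: 150 <= x < 160, 160 <= x < 170, and so on.
frequency = group_into_classes(heights, start=150, width=10)
for lower, count in sorted(frequency.items()):
    print(f"{lower} <= x < {lower + 10}: {count}")
```

The frequencies add up to 30, the number of heights, which is a useful check that no value was missed or counted twice.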
We could represent our ranges as grouped class labels or in set-builder notation.
We could say:
Stem | Leaf
15 | 2 8
16 | 0 1 2 4 5 5 6 7 8 9
17 | 0 0 1 2 3 4 5 6 7 8 9
18 | 0 1 2 3 5 7
19 | 0
Key: 15|2 = 152
[Figure: histogram of Frequency against Height (cm), with class intervals from 140<x<149 to 200<x<209. Caption: Height ranges of participants and frequencies]
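A stem-and-leaf diagram for the heights can also be built programmatically, which is a handy way to check one drawn by hand. This is a minimal sketch that assumes whole-number data, with the last digit as the leaf and the remaining digits as the stem. stem_and_leaf is a hypothetical helper name.

```python
heights = [152, 158, 160, 161, 162, 164, 165, 165, 166, 167,
           168, 169, 170, 170, 171, 172, 173, 174, 175, 176,
           177, 178, 179, 180, 181, 182, 183, 185, 187, 190]

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted list of leaves."""
    rows = {}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 152 -> stem 15, leaf 2
        rows.setdefault(stem, []).append(leaf)
    return rows

rows = stem_and_leaf(heights)
for stem, leaves in rows.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

The number of leaves in a row is the frequency for that group of ten, so the diagram doubles as a sideways histogram.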
- Mean (average): add all the data values and divide by the number of data values.
- Median (middle value): arrange the data values in numerical order. The median is the middle data
value. If there is an even number of data values, find the mean of the two closest to the middle.
- Mode (most frequently occurring value): the data value that occurs most often.
These quantities describe how far apart the data points can be from each other. We use the
following to describe the dispersion of data:
- Range: describes the span of the data, or how far apart the biggest and smallest values are. It is
calculated by subtracting the minimum value from the maximum value.
- Outliers: data points that occur individually and do not follow the trend
described by the rest of the data.
Standard deviation and variance are both measures of how spread out data is, but they differ in
how they quantify variability.
- Variance measures the average squared deviation from the mean. It gives a broad sense of how
much the data points differ from the mean but is expressed in squared units, making it harder to
interpret directly.
- Standard deviation is simply the square root of the variance. Since it’s in the same units as the
original data, it’s more intuitive for understanding how much individual data points deviate from
the mean.
- A high standard deviation means individual scores differ significantly from the average.
Variance measures how spread out the data is around the mean. A high variance indicates that the
data points are widely scattered, while a low variance suggests that the values are closely clustered
around the mean.
In practical terms:
- If test scores in a class have high variance, it means students performed very differently—some
scored very high, while others scored very low.
- If test scores have low variance, most students scored similarly, with little deviation from the
average.
Variance is useful for comparing datasets and understanding consistency. For example, in finance,
a high variance in stock prices suggests volatility, while a low variance indicates stability.
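To make the link between variance and standard deviation concrete, the sketch below works through both by hand for the scores 1, 3, 5, 7, 9. It assumes the population formulas (divide by the number of values), which matches the "average squared deviation" wording above. The function names are only for illustration.

```python
import math

def variance(values):
    """Average of the squared deviations from the mean (population variance)."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / len(values)

def standard_deviation(values):
    """Square root of the variance, in the same units as the data."""
    return math.sqrt(variance(values))

scores = [1, 3, 5, 7, 9]
print("Variance:", variance(scores))
print("Standard deviation:", round(standard_deviation(scores), 2))
```

Here the deviations from the mean of 5 are −4, −2, 0, 2 and 4, so the squared deviations are 16, 4, 0, 4 and 16, and their average is 40 ÷ 5 = 8.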
For a data set with an even number of values:
For Quartile 2: there is no exact middle point, so we add the two middle values and divide by 2.
For Quartile 1: cut the data set in half and look at the lower half, which may also be even. Find the
middle of this half. If it is even, add the middle two values and divide by 2.
For Quartile 3: cut the data set in half and look at the upper half, which may also be even. Find the
middle of this half. If it is even, add the middle two values and divide by 2.
Example 1: 38, 47, 49, 58, 60, 65, 70, 79, 80, 92
Q1 = 49; Q2 = 62,5; Q3 = 79
Example 2: 55, 60, 60, 62, 65, 65, 68, 70, 72, 75, 75, 78, 80, 82, 85, 88, 90, 92, 95, 98
Q1 = 65; Q2 = 75; Q3 = 86,5
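For a data set with an odd number of values:
For Quartile 2: there is an exact middle point, so we can take it as the median.
For Quartile 1 and 3: split the data into two halves (excluding the median) and find the middle of
each half. If a half has an even number of values, add the middle two values and divide by 2.
Example: 3, 5, 7, 9, 11, 13, 15, 17, 19
Q1 = 6; Q2 = 11; Q3 = 16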
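The "split the data in half" method can be sketched as one short function that handles both the even and the odd case. quartiles_by_halves is a hypothetical name. Note that statistical software often uses slightly different quartile conventions, so its answers may not match this classroom method exactly.

```python
from statistics import median

def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]  # values below the middle
    # For an odd count, skip the median itself when taking the upper half.
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return median(lower), median(data), median(upper)

print(quartiles_by_halves([38, 47, 49, 58, 60, 65, 70, 79, 80, 92]))
print(quartiles_by_halves([3, 5, 7, 9, 11, 13, 15, 17, 19]))
```

The first call uses a data set with an even number of values and the second uses one with an odd number, so both branches of the method are exercised.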
Position of median (n) = 1/2 × (Total no. of data + 1)
Scores obtained by 10 students are 38, 47, 49, 58, 60, 65, 70, 79, 80, 92 (already in ascending
order)
Answers:
1. 6 data values lie below 70%. The total number of data values is 10. Therefore 6/10 × 100 = 60th percentile.
This means that 60 percent of the students scored less than 70%, and 40 percent of students
scored 70% and above.
3. n = 25/100 × (10 + 1)
n = 2,75
Therefore, quartile 1 lies between positions 2 and 3. Remember that n refers to positions, which have
to be natural numbers (same logic as in number systems), so we average the 2nd and 3rd values:
(47+49)/2 = 48%
4. 38, 47, 49, 58, 60, 65, 70, 79, 80, 92
Q1 = 49; Q2 = 62,5; Q3 = 79
Therefore Q1 is 49%.
Quartile 1 is the same as the 25th percentile, yet we get 2 different answers. This is because
the percentile (48) was calculated using the percentile position formula, which landed between
two data values, so we estimated a value between them. The quartile (49) was calculated by
splitting the data set in half and taking the median of the lower half, a more manual method.
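The percentile position formula used in these answers can be sketched as follows. The function names are hypothetical, and the choice to average the two neighbouring values when the position is not a whole number simply mirrors the working shown above. It is one classroom convention, not the only way to estimate a percentile.

```python
import math

def percentile_position(percentile, count):
    """Position = percentile / 100 x (total no. of data + 1)."""
    return percentile * (count + 1) / 100

def percentile_value(sorted_values, percentile):
    """Average the values on either side of the position (positions start at 1).

    Assumes the position lands between 1 and the number of values.
    """
    position = percentile_position(percentile, len(sorted_values))
    below = math.floor(position)
    above = math.ceil(position)
    return (sorted_values[below - 1] + sorted_values[above - 1]) / 2

scores = [38, 47, 49, 58, 60, 65, 70, 79, 80, 92]
print(percentile_position(25, len(scores)))
print(percentile_value(scores, 25))
```

For the 10 scores, the 25th percentile sits at position 2,75, so the function averages the 2nd and 3rd values (47 and 49).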
5. 70 percent of students scored lower than 79% and 30 percent of students scored 79% and
above.
IQR
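1. Find Q1 and Q3
2. Calculate the IQR: IQR = Q3 − Q1
3. Calculate the bounds: lower bound = Q1 − 1,5 × IQR and upper bound = Q3 + 1,5 × IQR
4. Identify outliers:
Any value below the lower bound or above the upper bound is considered an outlier.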
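Example: 45, 50, 55, 60, 65, 70, 75, 80, 85, 90
1. Find Q1 and Q3
Q1=55 and Q3=80
2. Calculate the IQR
IQR = 80 − 55 = 25
3. Calculate the bounds
Lower bound = 55 − 1,5 × 25 = 17,5 and upper bound = 80 + 1,5 × 25 = 117,5
4. Identify outliers:
Any value below the lower bound or above the upper bound is considered an outlier.
No value lies below 17,5 or above 117,5, so this data set has no outliers.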
Quartiles:
- Q1 (First Quartile): The 25th percentile (middle of the lower half)
- Lower half: 45, 50, 55, 60, 65
- Q1 = 55
Q1 Q2 Q3
25% 25% 25% 25%
45 50 55 60 65 70 75 80 85 90
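4. Percentiles (using the percentile position formula, which is why they differ from Q1 = 55 and Q3 = 80 found by halving the data):
- 25th percentile = 52,5% (position 2,75, between 50 and 55)
- 50th percentile = Q2 (Median) = 67,5%
- 75th percentile = 82,5% (position 8,25, between 80 and 85)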
Interpretation:
- The IQR (25%) shows the spread of the middle 50% of scores.
- Quartiles help understand relative positioning in the data set (Split in quarters).
- Quartiles – Divide data into four equal parts, helping to identify where values fall within a dataset.
They are useful for comparing different sections of data and detecting skewness.
- Percentiles – Indicate the relative standing of a value within a dataset. For example, if a test score
is in the 90th percentile, it means the score is higher than 90% of all other scores. Percentiles are
widely used in standardized testing and performance evaluations.
- Interquartile Range (IQR) – Measures the spread of the middle 50% of the data: IQR = Q3 - Q1
- A high IQR suggests greater variability.
- A low IQR indicates that most values are clustered closely.
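The IQR can also be used to flag outliers. The sketch below assumes the common 1,5 × IQR rule for the lower and upper bounds and uses Q1 = 55 and Q3 = 80 from the example data set 45, 50, ..., 90. The function names and the extra score of 130 are hypothetical and only there to show a value being flagged.

```python
def outlier_bounds(q1, q3, k=1.5):
    """Return (lower bound, upper bound) using the assumed k x IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, q1, q3):
    """List every value below the lower bound or above the upper bound."""
    lower, upper = outlier_bounds(q1, q3)
    return [v for v in values if v < lower or v > upper]

scores = [45, 50, 55, 60, 65, 70, 75, 80, 85, 90]
print(outlier_bounds(55, 80))
print(find_outliers(scores, 55, 80))
print(find_outliers(scores + [130], 55, 80))  # 130 is a made-up extra score
```

The quartiles are passed in rather than recalculated, so the check works with whichever quartile method you were asked to use.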
[Figure: box-and-whisker plot on a scale from 10 to 100, marking Min, Q₁, Q₂, Q₃ and Max]
1.1.1 x̄ = (17+18+19+21+24+26+28+31+35+39+40+42+42+45+51+55+70+85+95) / 19
= 783 / 19
= 41,21%
1.1.2 17 18 19 21 24 26 28 31 35 39 40 42 42 45 51 55 70 85 95
Q1 = 24; Q2 = 39; Q3 = 51
1.1.3 42%
1.1.5
Min: 17%
Q1: 24%
Q2: 39%
Q3: 51%
Max: 95%
[Figure: box-and-whisker plot on a scale from 10 to 100, marking Min, Q₁, Q₂, Q₃ and Max]
1.2
1.2.1 a= 8x15=120 (this will be significant when we find the mean. The data were already
grouped, so we don't know the actual percentage of monthly income each person spent on fuel. So we
take the midpoint of each interval and multiply it by the frequency, since it stands in for that many
values in the data set.)
b=420/20= 21%
c= 12x27= 324
d= 264/33= 8 people
or
d=50-8-20-12-2=8 people
e= 2x39= 78
1.2.2 x̄ = (120+420+324+264+78) / 50
= 1206 / 50
= 24,12%
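The estimated mean of grouped data can be checked with a short sketch. The midpoints 15, 21, 27, 33, 39 and the frequencies 8, 20, 12, 8, 2 are the ones used in 1.2.1, and grouped_mean is a hypothetical helper name.

```python
def grouped_mean(midpoints, frequencies):
    """Estimated mean = sum of (midpoint x frequency) / total frequency."""
    total = sum(m * f for m, f in zip(midpoints, frequencies))
    return total / sum(frequencies)

midpoints = [15, 21, 27, 33, 39]   # middle of each interval
frequencies = [8, 20, 12, 8, 2]    # number of people in each interval

print(round(grouped_mean(midpoints, frequencies), 2))
```

This is only an estimate of the mean, because each midpoint stands in for values we cannot see inside its interval.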
1.2.3 18<p<24
4.2 x̄ = [(25x6)+(35x16)+(45x21)+(55x8)] / 51
= 2095 / 51
= 41,08 secs
4.3 40<x<50
4.4 Position = Percentile/100 × (Total no. of data + 1)
= 30/100 × (51 + 1)
= 15,6
4.5
5.1 Median= 24
5.2
5.2.1 x̄ = (10+13+15+17+18+20+23+24+26+28+28+29+39+48+49) / 15
= 387 / 15
= 25,8
5.2.2 49-10=39
5.2.3 IQR=29-17= 12
5.3 [Figure: box-and-whisker plot on a scale from 10 to 50, marking Min, Q₁, Q₂, Q₃ and Max]
5.4 12/15 × 100 = 80%
Copyright reserved by FPT Mathematics: Lushen Govender 2021