Statistics Fundamentals for Students
DATA MANAGEMENT
In this module, you will learn about descriptive statistics, which are used to
summarize and display data. After completing this module, you will know how to present
your findings once you have collected data.
This module begins with a brief overview of the discipline of statistics and then
quickly focuses on descriptive statistics, that is, summary measures such as central
tendency, position, and variation of data. These measures serve as the foundation for
statistical inference. On the side of inference, we will focus on both estimation and
hypothesis testing. We will also examine techniques for studying the relationship
between two or more variables; this is known as regression and correlation.
All “It’s Your Turn” activities count toward your class standing, and the post-assessment
serves as your unit test; all of these must be submitted. Use long bond paper for all your
outputs or answers.
LEARNING OUTCOMES
This section lists the module’s overall learning outcomes. The objectives
are written under each lesson.
At the end of the module, you should be able to:
1. use appropriate statistical tools to process and manage numerical data
accurately; and
2. solve application problems correctly.
PRE-TEST
Let us see how much you already know about statistics. Answer each question
below. Take note of the items that you do not yet know. Write the letter of the
correct answer on the blank provided before the number.
___1. Which of the following is not a measure of central location?
a. Mean b. median
c. variance d. mode
___2. Among the quartiles, the median (a measure of central tendency) is the
a. first quartile b. second quartile
c. third quartile d. fourth quartile
___3. If the arithmetic mean is 12 and the number of observations is 20, then the sum
of all values is
a. 8 b. 32 c. 240 d. 1.667
___4. The method used to compute the average or central value of collected data is
called
a. measures of positive variation
b. measures of central tendency
c. measures of negative skewness
d. measures of negative variation
___5. Mean or average used to measure central tendency is called
a. sample mean
b. arithmetic mean
c. negative mean
d. population mean
___6. When the values lie close to the mean, the standard deviation is
a. big b. small c. moderate d. none
___7. Which of the following is not a measure of dispersion?
a. Mean b. Standard deviation c. Variance d. range
___8. Which of the following is not a measure of position?
a. Decile b. quartile c. percentile d. range
___9. If the most frequently repeated observations are outliers of the data, then the
mode is considered a(n)
a. intended measure
b. percentage measure
c. best measure
d. poor measure
___10. The mean of a sample is
a. always equal to the mean of the population
b. always smaller than the mean of the population
c. computed by summing the data values and dividing the sum by (n - 1)
d. computed by summing all the data values and dividing the sum by the
number of items
___11. In a five number summary, which of the following is not used for data
summarization?
a. the smallest value
b. the median
c. the 25th percentile
d. the mean
___12. Since mode is the most frequently occurring data value, it
a. can never be larger than the mean
b. is always larger than the median
c. must have a value of at least two
d. None of the above answers is correct.
A researcher has collected the following sample data.
5 12 6 8 5
6 7 5 12 4
___13. The median is
a. 5 b. 6 c. 7 d. 8
___14. The mode is
a. 5 b. 6 c. 7 d. 8
___15. The mean is
a. 5 b. 6 c. 7 d. 8
LESSON 1: SUMMARY MEASURES
Objectives:
Let’s Engage!
A. Mean
1. Arithmetic mean (or average). This is the most widely used measure of
location. It is calculated by adding the values of the observations and dividing
by the total number of observations.
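Population mean: If a set of data x1, x2, ..., xN represents a finite population of
size N, then the population mean is
$$\mu = \frac{\sum x_i}{N}$$
Sample mean: If a set of data x1, x2, ..., xn represents a finite sample of size n,
then the sample mean is
$$\bar{x} = \frac{\sum x_i}{n}$$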
2. Weighted mean
$$\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}$$
Where:
$\sum w_i x_i$ = sum of the products of the data points and their corresponding weights.
$\sum w_i$ = sum of the weights.
Example: Renan has the following grades. Determine his GPA (grade point average).
Subject         Units (w_i)   Grade (x_i)   w_i x_i
Filipino 2      3             87            261
English         3             84            252
Math 7          3             85            255
P.E.            1             95            95
Chem 1 (lec)    3             82            246
Chem 2 (lab)    1             82            82
Philo 1         3             85            255
Σ               17                          1,446
Thus, the weighted mean is
$$\bar{x}_w = \frac{1446}{17} = 85.06$$
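As a quick check of the GPA example, here is a minimal Python sketch of the weighted mean. The function name `weighted_mean` and the variable names are illustrative choices, not part of the module; the units and grades are copied from Renan's table.

```python
# Units (weights) and grades taken from Renan's table.
units = [3, 3, 3, 1, 3, 1, 3]
grades = [87, 84, 85, 95, 82, 82, 85]


def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


gpa = weighted_mean(grades, units)
print(round(gpa, 2))  # 85.06
```

A subject with more units pulls the result toward its grade more strongly, which is why the 1-unit P.E. grade of 95 raises the GPA only slightly.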
C. Mode. The mode of a set of observations is the value which occurs most often,
that is, with the highest frequency. It is the least used measure of central tendency.
Examples:
a. The scores 1, 2, 3, 2, 4, 7, 9, 2 have a mode of 2.
b. The scores 2, 3, 6, 7, 8, 9 have no mode since no score is repeated.
c. The scores 1, 2, 2, 3, 4, 5, 2, 5, 6, 6, 7, 9, 6 have the modes 2 and 6 since they both
occur with the same highest frequency (we refer to such data as bimodal).
d. The scores 3, 4, 5, 1, 3, 2, 4, 5, 7, 10 have the modes 3, 4, and 5.
1. It requires no calculation.
2. It can be used for quantitative as well as qualitative data.
Note: The mode does not always exist. For some sets of data, several values may
occur with the greatest frequency, in which case there is more than one mode. If
a distribution has two modes, it is said to be bimodal; if it has more than two modes,
it is multimodal.
Midrange
It is defined as the mean of the largest and the smallest values in a data set.
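The mode examples (a) to (d) and the midrange definition can be checked with a short Python sketch. The helper names `modes` and `midrange` are hypothetical. The convention that a data set has no mode when no score repeats follows example (b) in this module; other textbooks and libraries may treat that case differently.

```python
from collections import Counter
from statistics import mean, median


def modes(scores):
    """Return all values tied for the highest frequency.

    Following the module's convention, return an empty list
    when no score is repeated (no mode).
    """
    counts = Counter(scores)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(value for value, count in counts.items() if count == top)


def midrange(scores):
    """Mean of the largest and the smallest values."""
    return (min(scores) + max(scores)) / 2


example_a = [1, 2, 3, 2, 4, 7, 9, 2]
example_b = [2, 3, 6, 7, 8, 9]
example_c = [1, 2, 2, 3, 4, 5, 2, 5, 6, 6, 7, 9, 6]
example_d = [3, 4, 5, 1, 3, 2, 4, 5, 7, 10]

print(modes(example_a))  # [2]
print(modes(example_b))  # []  (no mode)
print(modes(example_c))  # [2, 6]  (bimodal)
print(modes(example_d))  # [3, 4, 5]  (multimodal)
print(mean(example_a), median(example_a), midrange(example_a))  # 3.75 2.5 5.0
```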
Activity 1
Answer the following and use a separate paper (long bond paper) for your answers.
1. The following data are ages of infants (in months) at which they walked alone. These
sample data were obtained from two populations, A and B.
Central tendency    A              B
Mean                ___________    ___________
Median              ___________    ___________
Mode(s)             ___________    ___________
Midrange            ___________    ___________
Subject Grade Units
Math 1.75 3
Physics 2.50 5
English 2.25 3
Speech 1.50 2
Statistics 3.00 4
6. An economist studying trends in gasoline prices within a city takes a sample of 30 of
the city’s gas stations and determines, for each station, the price per liter (in pesos) of
unleaded regular gasoline. The results are given below. Find the mean and the median
prices.
Price (in peso) frequency
54.60 1
55.64 3
56.16 1
57.20 12
57.72 8
58.24 5
B. Measures of Dispersion
Measures of Variability
A measure of variability of a set of data is a number that conveys the idea of
spread for the data set. The measures of dispersion report on how far the
values of the distribution are from the center.
A. Range
The range is the simplest measure of variability. It measures the distance between
the largest and the smallest values and, as such, gives an idea of the spread of the data
set. However, the range is not based on the concept of deviation from the center. It is
affected by outliers, yet it does not take all values in the data set into account. Thus it
is not a very useful measure of variability.
Range (R) = highest value-lowest value
Example:
Consider these three sets of quiz scores. Find the range of each set.
(i) Section A: 5 5 5 5 5 5 5 5 5 5
(ii) Section B: 0 0 0 0 0 10 10 10 10 10
(iii) Section C: 4 4 4 5 5 5 5 6 6 6
Solution:
(i) For section A, the range is 0 since the maximum and the minimum are both 5,
and 5 – 5 = 0
(ii) For section B, the range is 10 since 10 – 0 = 10
(iii) For section C, the range is 2 since 6 – 4 = 2
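The three ranges above can be reproduced with a few lines of Python. This is only a sketch; the function name `value_range` and the dictionary layout are illustrative.

```python
def value_range(scores):
    """Range (R) = highest value - lowest value."""
    return max(scores) - min(scores)


sections = {
    "A": [5, 5, 5, 5, 5, 5, 5, 5, 5, 5],
    "B": [0, 0, 0, 0, 0, 10, 10, 10, 10, 10],
    "C": [4, 4, 4, 5, 5, 5, 5, 6, 6, 6],
}

ranges = {name: value_range(scores) for name, scores in sections.items()}
print(ranges)  # {'A': 0, 'B': 10, 'C': 2}
```

Note that only the two extreme values enter the calculation, which is why the range ignores how the remaining scores are distributed.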
If the data set A has a greater MAD than the data set B, then it is reasonable to believe
that the values in data set A are more spread out (variable) than the values in set B.
Example:
Section D, with scores 0 5 5 5 5 5 5 5 5 10.
We could compute for each data value the difference between the data value and the
mean:
data value    deviation: data value – mean    deviation squared
0             0 - 5 = -5                      (-5)^2 = 25
5             5 - 5 = 0                       0^2 = 0
5             5 - 5 = 0                       0^2 = 0
5             5 - 5 = 0                       0^2 = 0
5             5 - 5 = 0                       0^2 = 0
5             5 - 5 = 0                       0^2 = 0
5             5 - 5 = 0                       0^2 = 0
5             5 - 5 = 0                       0^2 = 0
5             5 - 5 = 0                       0^2 = 0
10            10 - 5 = 5                      (5)^2 = 25
We would like to get an idea of the “average” deviation from the mean, but if we
find the average of the values in the second column the negative and positive values
cancel each other out (this will always happen), so to prevent this we square every
value in the second column.
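The squared deviations add up to 50. Dividing by 10, the number of data values,
gives 5; dividing by 9, one less than the number of data values, gives 5.56.
These values (5 and 5.56) are called, respectively, the population mean of squared
deviations and the sample mean of squared deviations for section D, that is, the
population variance and the sample variance.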
***If we are unsure whether the data set is a sample or a population, we will usually
assume it is a sample, and we will round answers to one more decimal place than the
original data.
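The Section D table can be rebuilt step by step in Python. The sketch below uses illustrative variable names and shows why the plain deviations cannot be averaged (they sum to zero) and how the two divisors, n and n - 1, lead to the values 5 and 5.56.

```python
section_d = [0, 5, 5, 5, 5, 5, 5, 5, 5, 10]

n = len(section_d)
mean_d = sum(section_d) / n                    # 5.0

deviations = [x - mean_d for x in section_d]   # second column of the table
squared = [d ** 2 for d in deviations]         # third column of the table

print(sum(deviations))  # 0.0, negative and positive deviations cancel

sum_sq = sum(squared)
population_value = sum_sq / n        # divide by the number of data values
sample_value = sum_sq / (n - 1)      # divide by one less than that

print(sum_sq, population_value, round(sample_value, 2))  # 50.0 5.0 5.56
```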
• Standard deviation has the same units as the original data.
• Standard deviation, like the mean, can be highly influenced by outliers.
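Population variance:
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$
Sample variance:
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
Population standard deviation:
$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$$
Sample standard deviation:
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$
Where:
$\sigma^2$ = population variance
$s^2$ = sample variance
$\sigma$ = population standard deviation
$s$ = sample standard deviation
$\bar{x}$ = sample mean
$n$ = sample size
$x_i$ = ith observation
$\mu$ = population mean
$N$ = population size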
If the data are clustered around the mean, then the variance and the standard
deviation will be somewhat small. If, however, the data are widely scattered about the
mean, the variance and the standard deviation will be somewhat large.
Notes:
1. We divide by the quantity n-1 in order to make the sample variance an
unbiased estimator of the population variance. (an estimator is unbiased if its
average value is equal to the parameter it is estimating.)
2. The sample variance uses the squares of the deviations from the mean, as this
will eliminate the effects of the signs (as was also the case when we used the
absolute value of the deviation in computing the MAD.)
3. The unit of the standard deviation is the same as that of the raw data, so it is
preferable to use the standard deviation as a measure of variability instead of
the variance.
Example
Computing the standard deviation for Section B (0 0 0 0 0 10 10 10 10 10), we
first calculate that the mean is 5. Using a table can help keep track of your
computations for the variance and standard deviation:
data value    deviation: data value – mean    deviation squared
0             0 - 5 = -5                      (-5)^2 = 25
0             0 - 5 = -5                      (-5)^2 = 25
0             0 - 5 = -5                      (-5)^2 = 25
0             0 - 5 = -5                      (-5)^2 = 25
0             0 - 5 = -5                      (-5)^2 = 25
10            10 - 5 = 5                      (5)^2 = 25
10            10 - 5 = 5                      (5)^2 = 25
10            10 - 5 = 5                      (5)^2 = 25
10            10 - 5 = 5                      (5)^2 = 25
10            10 - 5 = 5                      (5)^2 = 25
Treating the data as a population, the squared deviations add up to 250, so
Population Variance
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{250}{10} = 25$$
Population Standard Deviation
$$\sigma = \sqrt{25} = 5$$
▪ These values (25 and 5) are called, respectively, the population variance and
the population standard deviation for section B.
Assuming these data represent a sample, we add the squared deviations, divide
by 9 (one less than 10, the number of data values), and compute:
Sample Variance
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{250}{9} = 27.78$$
Sample Standard Deviation
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{27.78} = 5.27$$
These values (27.78 and 5.27) are called, respectively, the sample variance and
the sample standard deviation for section B.
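For comparison, Python's standard `statistics` module provides both versions of each measure: `pvariance` and `pstdev` divide by N (population), while `variance` and `stdev` divide by n - 1 (sample). The sketch below applies them to the Section B scores; the variable names are illustrative.

```python
from statistics import pstdev, pvariance, stdev, variance

section_b = [0, 0, 0, 0, 0, 10, 10, 10, 10, 10]

# Population versions: divide the sum of squared deviations by N.
pop_var = pvariance(section_b)
pop_sd = pstdev(section_b)

# Sample versions: divide the sum of squared deviations by n - 1.
samp_var = variance(section_b)
samp_sd = stdev(section_b)

print(f"{pop_var:.2f} {pop_sd:.2f}")    # 25.00 5.00
print(f"{samp_var:.2f} {samp_sd:.2f}")  # 27.78 5.27
```

The sample values are slightly larger because the same sum of squared deviations (250) is divided by 9 instead of 10.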
Activity 2
Answer the following and use a separate paper (long bond paper) for your answers.
1. A study of the effects of smoking on sleep patterns is conducted. The measure observed
is the time, in minutes, that it takes to fall asleep. These data are obtained:
Smokers: 69.3, 56.0, 22.1, 47.6, 53.2, 48.1, 52.7, 34.4, 60.2, 43.8, 23.2,
13.8
Nonsmokers: 28.6, 25.1, 26.4, 34.9, 29.8, 28.4, 38.5, 30.2, 30.6, 31.8, 41.6, 21.1,
36.0, 37.9, 13.9
a. Find the sample mean, mode, and median for each group.
b. Find the range, mean absolute deviation, variance, and standard deviation for each
group.
c. Comment on what kind of impact smoking appears to have on the time required
to fall asleep.
2. Faculty salaries for a random sample of teachers in the public school system of a
certain town were coded by dividing each salary by 1000. Find the MAD and the
standard deviation of these salaries if the coded observations are 18, 15, 21, 19, 13,
15, 14, 23, 18 and 16 pesos.
1. Percentiles
Percentiles divide the data set into one hundred equal parts. Each set of
observations has 99 percentiles, denoted by P1, P2, …, P99.
Note:
A percentile is a value in the data set.
The percentile rank of a given value is the percentage of the data that is
smaller than that value.
2. Deciles
The deciles divide the data set into ten equal parts. Each set of observations
has 9 deciles, denoted by D1, D2, …, D9.
Note: The first decile and the tenth percentile are the same, D1=P10.
Similarly, D2=P20, D3=P30, …, D9=P90.
3. Quartiles
The quartiles divide the data set into four equal parts. Each set of observations has
3 quartiles, denoted by Q1, Q2, and Q3.
The first quartile Q1 is a value such that 25% of the values fall below
Q1 and 75% of the values fall above Q1.
The second quartile Q2 is a value such that 50% of the values fall
below Q2 and 50% of the values fall above Q2.
The third quartile Q3 is a value such that 75% of the values fall below
Q3 and 25% of the values fall above Q3.
Note:
Q1=P25, Q2=P50, Q3=P75
Median: the 50th percentile, 5th decile and the second quartile of the distribution
are equal to the same value and are referred to as the median. That is
Median=Q2=D5 =P50
The starting point for finding a quantile value for ungrouped data is to
arrange the data set in ascending order, locate the position of the quantile, and
then pick the value from the arranged data set that corresponds to that position.
The position is computed as follows:
Where:
P = 1, 2, or 3 for quartiles, 1 to 9 for deciles, and 1 to 99 for percentiles.
N = number of items or values
Example:
Given the ungrouped data below, find the 90th percentile, 5th decile, and 3rd quartile.
{95, 83, 67, 86, 93, 82, 71, 86, 97, 55, 75, 88, 70, 40, 89, 79, 90, 46, 75, 66, 81, 75,
49, 50, 55, 68, 55, 70, 75, 92}
Arranged in ascending order:
{40, 46, 49, 50, 55, 55, 55, 66, 67, 68, 70, 70, 71, 75, 75, 75, 75, 79, 81, 82, 83, 86,
86, 88, 89, 90, 92, 93, 95, 97}
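The example can be explored in Python with `statistics.quantiles`. This is a sketch under an assumption: the module's own position formula is not reproduced here, so the code uses the "exclusive" convention of the standard library, which places the P-th of k quantiles at position P(N + 1)/k in the ordered data and interpolates linearly between neighboring values. Other position conventions exist and can give slightly different answers, so treat the printed values as illustrative rather than as the module's official solution.

```python
from statistics import quantiles

scores = [95, 83, 67, 86, 93, 82, 71, 86, 97, 55, 75, 88, 70, 40, 89,
          79, 90, 46, 75, 66, 81, 75, 49, 50, 55, 68, 55, 70, 75, 92]

# quantiles() sorts the data internally and returns the n - 1 cut points.
percentiles = quantiles(scores, n=100, method="exclusive")  # P1 .. P99
deciles = quantiles(scores, n=10, method="exclusive")       # D1 .. D9
quartiles = quantiles(scores, n=4, method="exclusive")      # Q1 .. Q3

p90 = percentiles[89]  # 90th percentile (index 89 because lists start at 0)
d5 = deciles[4]        # 5th decile
q3 = quartiles[2]      # 3rd quartile

print(p90, d5, q3)     # 92.9 75.0 86.5 under the assumed convention
print(d5 == percentiles[49] == quartiles[1])  # True: D5 = P50 = Q2 (median)
```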
IT’S YOUR TURN!
Activity 3
Answer the following and use a separate paper (long bond paper) for your answers.
1. For the data set below, which value is in the 75th percentile, 4th
decile and 1st quartile? {1, 3, 3, 4, 6, 7, 7, 7, 8, 9, 9, 10, 12, 15, 16,
17}
2. Which of the following data values is the 50th percentile, 8th decile
and 3rd quartile?{1.52, 5.36, 6.79, 5.21, 0.28, 6.36, 8.47, 5.52, 6.26,
5.97}
Objectives:
LET’S ENGAGE!
▪ The graph is symmetric about a vertical line through the mean of a normal
distribution.
▪ The mean, median, and mode are equal.
▪ The y-value of each point on the curve is the percent (expressed as a decimal) of
the data at the corresponding x-value.
▪ Areas under the curve that are symmetric about the mean are equal.
▪ The total area under the curve is equal to one.
[Figure: normal curve centered at μ on the x-axis; total area under the curve = 1]
• A normal distribution can have any mean and any positive standard deviation.
• The mean gives the location of the line of symmetry.
• The standard deviation describes the spread of the data.
$$z = \frac{x - \bar{x}}{s} \text{ for a sample, or } z = \frac{x - \mu}{\sigma} \text{ for a population}$$
Table of Areas Under the Normal Curve
(Adapted from [Link]
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
standard deviation of 5). On which exam did Bob do better, in terms of his relative
standing in the class?
Bob’s math exam score of 80 standardizes to a z-value of
$$z = \frac{80 - 70}{10} = 1$$
That tells us his math score is one standard deviation above the class average. His
English exam score of 80 standardizes to a z-value of
$$z = \frac{80 - 85}{5} = -1$$
putting him one standard deviation below the class average. Even though Bob scored
80 on both exams, he actually did better on the math exam than the English exam,
relatively speaking.
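Bob's two standard scores can be verified with a small Python sketch. The function name `z_score` is an illustrative choice; the means and standard deviations are the ones used in the computation above.

```python
def z_score(x, mean, sd):
    """Standard score: how many standard deviations x lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


math_z = z_score(80, mean=70, sd=10)
english_z = z_score(80, mean=85, sd=5)

print(math_z, english_z)   # 1.0 -1.0
print(math_z > english_z)  # True: relatively better on the math exam
```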
To interpret a standard score, you don’t need to know the original score, the
mean, or the standard deviation. The standard score gives you the relative standing of
a value, which in most cases is what matters most. In fact, on most national
achievement tests, they won’t even tell you what the mean and standard deviation were
when they report your results; they just tell you where you stand on the distribution by
giving you your z-score.
1. Sketch the standard normal curve and shade the appropriate area
under the curve.
2. Find the area by following the directions for each case shown.
a. To find the area to the left of z, find the area that
corresponds to z in the Standard Normal Table.
• Use the table to find the area for the z-score of 1.23. The table above lists
the area between 0 and z, which is 0.3907 for z = 1.23; adding the 0.5 that lies
to the left of the mean gives 0.8907.
• The area to the left of z = 1.23 is 0.8907 (shaded).
b. To find the area to the right of z, use the Standard Normal
Table to find the area that corresponds to z. Then subtract
the area from 1.
• Subtract to find the area to the right of z = 1.23: 1 – 0.8907 = 0.1093
(unshaded).
c. To find the area between two z-scores, find the area
corresponding to each z-score in the Standard Normal Table.
Then subtract the smaller area from the larger area.
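The three cases (area to the left, area to the right, area between two z-scores) can also be computed with `statistics.NormalDist` from Python's standard library, whose `cdf` method returns the area to the left of a value. In this sketch the z-score 1.23 comes from the steps above, while the second z-score, -0.50, is a hypothetical value chosen only to illustrate case (c).

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)  # standard normal distribution

# a. Area to the left of z = 1.23.
left = std_normal.cdf(1.23)

# b. Area to the right of z = 1.23: subtract the left area from 1.
right = 1 - left

# c. Area between two z-scores: larger left area minus smaller left area.
#    z = -0.50 is an assumed value used only for illustration.
between = std_normal.cdf(1.23) - std_normal.cdf(-0.50)

print(f"{left:.4f} {right:.4f} {between:.4f}")  # 0.8907 0.1093 0.5821
```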
LESSON 3. LINEAR REGRESSION AND CORRELATION
OBJECTIVES:
LET’S ENGAGE!
The most commonly used techniques for investigating the relationship between
two quantitative variables are correlation and linear regression. Correlation quantifies
the strength of the linear relationship between a pair of variables, whereas regression
expresses the relationship in the form of an equation.
Linear Regression
A linear regression line has an equation of the form Y = a + bX, where X is the
explanatory or independent variable (plotted on the X axis) and Y is the dependent
variable (plotted on the Y axis). The slope of the line is b, and a is the y-intercept
(the value of Y when X = 0). You might also recognize the equation as the
slope-intercept form of a line. For n data pairs, a and b are computed as follows:
$$a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2}$$
$$b = \frac{n\sum xy - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$$
high GWA scores and better performance in grad school, it doesn’t mean that
high GWA scores cause good grad school performance.
▪ If you try to find a linear regression equation for a set of data
(especially through an automated program like Excel), you will find one, but
that does not necessarily mean the equation is a good fit for your data. One
technique is to make a scatter plot first, to see if the data roughly fit a
line before you try to find a linear regression equation.
From the above table, Σx = 247, Σy = 486, Σxy = 20,485, Σx² = 11,409, Σy² = 40,022,
and n is the sample size (6, in our case).
Find a:
$$a = \frac{(486)(11{,}409) - (247)(20{,}485)}{6(11{,}409) - 247^2} = \frac{484{,}979}{7{,}445} = 65.14$$
Find b:
$$b = \frac{6(20{,}485) - (247)(486)}{6(11{,}409) - 247^2} = \frac{122{,}910 - 120{,}042}{68{,}454 - 61{,}009} = \frac{2{,}868}{7{,}445} = 0.385225$$
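The arithmetic for a and b can be double-checked with a short Python sketch that plugs the column sums into the two formulas. Only the sums quoted in the text are used; the variable names are illustrative.

```python
# Column sums quoted in the worked example.
n = 6
sum_x = 247
sum_y = 486
sum_xy = 20485
sum_x2 = 11409

# Both coefficients share the same denominator: n(Σx²) - (Σx)².
denominator = n * sum_x2 - sum_x ** 2

a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator  # y-intercept
b = (n * sum_xy - sum_x * sum_y) / denominator       # slope

print(denominator)               # 7445
print(round(a, 2), round(b, 6))  # 65.14 0.385225
```

With these coefficients the fitted line is approximately Y = 65.14 + 0.385X.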
Linear Correlation
Correlation: the degree of relationship between the variables under consideration
is measured through correlation analysis.
For the n ordered pairs $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$, the linear
correlation coefficient r is given by
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2} \cdot \sqrt{n\sum y^2 - (\sum y)^2}}$$
Example:
Find the linear correlation coefficient for stride length versus speed of an adult man.
Round your results to the nearest hundredth.
Stride length (m)    2.5   3.0   3.3   3.5   3.8   4.0   4.2   4.5
Speed (m/s)          3.4   4.9   5.5   6.6   7.0   7.7   8.3   8.7
Solution
Subject   Stride length (m), x   Speed (m/s), y   x²       y²       xy
1         2.5                    3.4              6.25     11.56    8.5
2         3.0                    4.9              9        24.01    14.7
3         3.3                    5.5              10.89    30.25    18.15
4         3.5                    6.6              12.25    43.56    23.1
5         3.8                    7.0              14.44    49       26.6
6         4.0                    7.7              16       59.29    30.8
7         4.2                    8.3              17.64    68.89    34.86
8         4.5                    8.7              20.25    75.69    39.15
Σ         28.8                   52.1             106.72   362.25   195.86
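$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2} \cdot \sqrt{n\sum y^2 - (\sum y)^2}}$$
$$= \frac{8(195.86) - (28.8)(52.1)}{\sqrt{8(106.72) - (28.8)^2} \cdot \sqrt{8(362.25) - (52.1)^2}}$$
$$= 0.993715 \approx 0.99$$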
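The stride length and speed example can be recomputed from the raw data with a minimal Python sketch of the correlation formula. The function name `pearson_r` is an illustrative choice; the data are the eight pairs from the example.

```python
from math import sqrt

stride = [2.5, 3.0, 3.3, 3.5, 3.8, 4.0, 4.2, 4.5]  # x, in meters
speed = [3.4, 4.9, 5.5, 6.6, 7.0, 7.7, 8.3, 8.7]   # y, in m/s


def pearson_r(xs, ys):
    """Linear correlation coefficient r from the computational formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


r = pearson_r(stride, speed)
print(round(r, 2))  # 0.99
```

A value this close to 1 indicates a very strong positive linear relationship between stride length and speed in this sample.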
Activity 5
Answer the following and use a separate paper (long bond paper) for your answers.
Consider the IQ scores and the mathematics grade of freshmen in a certain university.
The Data is Given Below.
POST ASSESSMENT
In the following multiple-choice questions, choose the correct answer. Write your
answers on a separate sheet of paper to be submitted.
i. 3 5 12 3 2
___6. The variance is
a. 80 b. 4.062 c. 13.2 d. 16.5
___7. If the variance of a data set is correctly computed with the formula using n - 1
in the denominator, which of the following is true?
a. the data set is a sample
b. the data set is a population
c. the data set could be either a sample or a population
d. the data set is from a census
___8. The measure of dispersion that is influenced most by extreme values is
a. the variance
b. the standard deviation
c. the range
d. the interquartile range
___9. When should measures of location and dispersion be computed from grouped
data rather than from individual data values?
a. as much as possible since computations are easier
b. only when individual data values are unavailable
c. whenever computer packages for descriptive statistics are unavailable
d. only when the data are from a population
___10. The descriptive measure of dispersion that is based on the concept of a
deviation about the mean is
a. the range
b. the interquartile range
c. both a and b
d. the standard deviation
___11. The standard normal distribution has a mean
a. Less than zero
b. Equal to zero
c. Greater than zero
d. Equal to 1
___12. The total area of the distribution is
a. exactly equal to one
b. Less than one
c. Equal to zero
d. Greater than zero
___13. Which of the following is not a property of a normal distribution
a. The graph is symmetric about a vertical line through the mean of a
normal distribution.
b. The mean, median, and mode are not equal.
c. The y-value of each point on the curve is the percent (expressed as a
decimal) of the data at the corresponding x-value.
d. Areas under the curve that are symmetric about the mean are equal.
e. The total area under the curve is equal to one.
___14. A survey conducted shows the LEC performance of the students and their GWA
during their undergraduate studies. Find if there is a significant relationship
between the two variables. (Correlation Analysis)
GWA       80 85 89 76 83 89 84 86 95 81 76 83 89 84 86
Lec Grd   75 76 87 74 78 89 80 87 96 79 74 78 89 80 87
___15. Given the data on the problem solving performance of a first year college
student and their mathematics grade in their senior high. Find if there is a
significant relationship between the two variables using Regression analysis and
Find the model.
SHMG 80 85 89 76 83 89 84 86 95 81 80 85
PSP 75 76 87 74 78 89 80 87 96 79 75 76