Statistics Concepts and Methods Guide
3
Statistics
Syllabus :
Measures of central tendency, Standard deviation, Coefficient of variation, Moments,
Skewness and Kurtosis, Curve fitting : fitting of straight line, parabola and related
curves, Correlation and Regression, Reliability of Regression Estimates.
Definition :
Statistics is the science which deals with methods of collecting, classifying,
presenting and comparing numerical data collected to throw light on any sphere of enquiry.
Variable (or Variate) :
A quantity which can vary from one individual to another is called a variable or variate.
e.g. Heights, weights, ages and wages of persons, rainfall records of cities, income.
Quantities which can take any numerical value within a certain range are called continuous
variables.
e.g. Height, weight, temperature, time. As a child grows, his/her height takes all
possible values from 50 cm to 100 cm.
Quantities which can assume only particular values, and are incapable of taking all possible values in a range, are called discrete or
discontinuous variables.
e.g. No. of children in a family, no. of rooms in a house, no. of workers in a factory, no. of defective products in a lot, the no. of telephone calls on
different dates.
Ungrouped data: Data as first collected, before any arrangement or classification, do not give useful information and are rather confusing
to the mind; such data are called raw data or ungrouped data.
Gigatech Publishing House
Igniting Minds
Engineering Mathematics - III 3.2 Statistics
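1. Arithmetic Mean:
a) In case of individual observations (i.e. where frequency is not given):
i) Direct Method: If the variable x takes the values $x_1, x_2, x_3, \ldots, x_n$ then the A.M. $\bar{x}$ is given by
$$\bar{x} = \frac{1}{n}(x_1 + x_2 + x_3 + \cdots + x_n) = \frac{1}{n}\sum x$$
ii) Short cut method (or shift of origin): Shifting the origin to an arbitrary point A, the A.M. $\bar{x}$ is given by
$$\bar{x} = A + \frac{1}{n}\sum d, \quad \text{where deviation } d = x - A.$$
b) In case of a frequency distribution:
i) Direct Method: If the values $x_1, x_2, \ldots, x_n$ occur with frequencies $f_1, f_2, \ldots, f_n$ respectively, then
$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + f_3 + \cdots + f_n} = \frac{1}{N}\sum fx, \quad \text{where } N = f_1 + f_2 + f_3 + \cdots + f_n$$
ii) Short cut method (or shift of origin): Shifting the origin to an arbitrary point A, the A.M. $\bar{x}$ is given by
$$\bar{x} = A + \frac{1}{N}\sum fd, \quad \text{where deviation } d = x - A.$$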
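Weighted arithmetic mean: If the values $x_1, x_2, \ldots, x_n$ carry weights $w_1, w_2, \ldots, w_n$, then
$$\bar{x}_W = \frac{w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_n x_n}{w_1 + w_2 + w_3 + \cdots + w_n} = \frac{\sum wx}{\sum w}.$$
Combined mean: If k groups of sizes $n_1, n_2, \ldots, n_k$ have means $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k$, then the mean of the combined group is
$$\bar{x} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2 + \cdots + n_k\bar{x}_k}{n_1 + n_2 + \cdots + n_k} = \frac{\sum n\bar{x}}{\sum n}$$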
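The formulae for the arithmetic mean can be checked numerically. The following is a minimal Python sketch, assuming hypothetical function names and sample data that are not taken from the text; it computes the mean by the direct method and by the short cut method, and a weighted mean.

```python
def mean_direct(xs, fs=None):
    """Direct method: (1/N) * sum(f*x); each value has frequency 1 if fs is omitted."""
    fs = fs if fs is not None else [1] * len(xs)
    return sum(f * x for f, x in zip(fs, xs)) / sum(fs)


def mean_shortcut(xs, A, fs=None):
    """Short cut method: A + (1/N) * sum(f*d), with deviation d = x - A."""
    fs = fs if fs is not None else [1] * len(xs)
    return A + sum(f * (x - A) for f, x in zip(fs, xs)) / sum(fs)


# Hypothetical data: mid-values and frequencies of a small distribution.
xs = [5, 15, 25, 35, 45]
fs = [5, 8, 15, 16, 6]

direct = mean_direct(xs, fs)
shortcut = mean_shortcut(xs, A=25, fs=fs)
# A weighted mean has the same form, with weights in place of frequencies.
weighted = mean_direct([60, 70, 80], [1, 2, 3])
print(direct, shortcut, weighted)
```

Both methods give the same mean whatever the choice of A; the short cut method only keeps the intermediate numbers small for hand calculation.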
2. Median:
i) Median is the measure of central value of the variable when the values are arranged
in ascending or descending order of magnitude.
(Median divides the distribution into two equal parts)
e.g. For the values 3, 4, 4, 5, 6, 7, 8 the median is 5. For the values 3, 4, 4, 5, 7, 9, 11, 13, 15, 17 the median is $\frac{7+9}{2} = 8$.
ii) For an ungrouped frequency distribution, let the n values of the variate be
arranged in ascending or descending order of magnitude.
a) When n is odd, the middle value, i.e. the $\left(\frac{n+1}{2}\right)$th value, gives the median.
b) When n is even, there are two middle values, the $\left(\frac{n}{2}\right)$th and the $\left(\frac{n}{2}+1\right)$th. The
arithmetic mean of these two values gives the median.
iii) For a grouped frequency distribution the median is given by the formula:
$$\text{Median} = L + \frac{h}{f}\left(\frac{N}{2} - C\right)$$
where L = lower limit of the median class, h = width of the median class, f = frequency of the median class and C = cumulative frequency of the class preceding the median class. The median
class is the class corresponding to the cumulative frequency just greater than $\frac{N}{2}$.
are the frequencies of the classes preceding and succeeding the modal
class respectively. L = lower limit, h = length of the interval.
iii) Where the mode is ill-defined, i.e. where the method of grouping also fails, its value can
be ascertained by the formula Mode = 3 Median – 2 Mean.
This measure is called the empirical mode. Equivalently,
Mean – Mode = 3 (Mean – Median)
Harmonic Mean: The harmonic mean of a number of observations is the reciprocal of
the arithmetic mean of the reciprocals of the given values. Thus the harmonic mean
H of n observations $x_1, x_2, x_3, \ldots, x_n$ is
$$H = \frac{1}{\frac{1}{n}\sum \frac{1}{x}} = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}}$$
If $x_1, x_2, x_3, \ldots, x_n$ (none of them being zero) have the frequencies $f_1, f_2, f_3, \ldots, f_n$
respectively, the harmonic mean is given by
$$H = \frac{1}{\frac{1}{N}\sum \frac{f}{x}} = \frac{N}{\frac{f_1}{x_1} + \frac{f_2}{x_2} + \frac{f_3}{x_3} + \cdots + \frac{f_n}{x_n}}.$$
3.2 Measures of Dispersion :
Dispersion:
The variation or scattering or deviation of the different values of a variable from their
average is known as dispersion. Dispersion indicates the extent to which the values vary
among themselves.
Distribution A: 75, 85, 95, 105, 115, 125
Distribution B: 10, 20, 30, 70, 180, 290
The arithmetic mean of each distribution is $\frac{600}{6} = 100$.
In distribution A, the values of the variate differ from 100 but the differences are small. In
distribution B the values (or items) are widely scattered and lie far from the mean. Although the
A.M. is the same, the two distributions differ widely from each other in their formation.
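The contrast between distributions A and B can be made concrete with a short Python sketch. It uses the standard-library `statistics` module; `pstdev` is the population standard deviation (division by N), which matches the definition of S.D. used in this chapter. The variable names are illustrative.

```python
from statistics import mean, pstdev

dist_a = [75, 85, 95, 105, 115, 125]
dist_b = [10, 20, 30, 70, 180, 290]

for name, data in (("A", dist_a), ("B", dist_b)):
    value_range = max(data) - min(data)  # Range = largest - smallest
    # Same mean for both, but very different range and standard deviation.
    print(name, mean(data), value_range, round(pstdev(data), 2))
```

Both distributions have mean 100, yet the range and the standard deviation of B are several times those of A, which is exactly what a measure of dispersion is meant to capture.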
The following are the Measures of Dispersion:
i) Range
ii) Quartile deviation or semi inter quartile deviation
iii) Average (or Mean) deviation
iv) Standard deviation
i) Range: Range is the difference between the extreme values of the variate.
Range = L – S, where L = largest value and S = smallest value.
$$\text{Co-efficient of Range} = \frac{L - S}{L + S}.$$
ii) Average deviation or Mean deviation: If $x_1, x_2, x_3, \ldots, x_n$ occur $f_1, f_2, f_3, \ldots, f_n$ times
respectively and $N = \sum f$, the mean deviation from the average A (usually the Mean or
Median) is given by
$$\text{Mean Deviation} = \frac{1}{N}\sum f\,|x - A|$$
where $|x - A|$ represents the modulus or the
absolute value of the deviation (x - A).
$$\text{Co-efficient of mean deviation} = \frac{\text{Mean Deviation}}{\text{Average from which it is calculated}}$$
Thus
$$\text{S.D.} = \sigma = \sqrt{\frac{1}{N}\sum f (x - \bar{x})^2}$$
Note: The square of the S.D. (i.e. $\sigma^2$) is called the Variance.
Variance = (S.D.)$^2$ = $\sigma^2$.
Short-cut methods for calculating standard deviation:
i) Direct Method:
$$\sigma = \sqrt{\frac{1}{N}\sum fx^2 - \left(\frac{1}{N}\sum fx\right)^2}$$
ii) Change of origin:
Let the origin be shifted to an arbitrary point A and d = x – A, then
$$\sigma = \sqrt{\frac{1}{N}\sum fd^2 - \left(\frac{1}{N}\sum fd\right)^2}$$
iii) Shift of origin and change of scale (or step deviation method):
Let the origin be shifted to an arbitrary point A and the new scale be $\frac{1}{h}$ times the
original scale, i.e. let $u = \frac{x - A}{h}$, then
$$\sigma = h\sqrt{\frac{1}{N}\sum fu^2 - \left(\frac{1}{N}\sum fu\right)^2}$$
Coefficient of Variation:
The ratio of the S.D. to the mean, i.e. $\frac{\sigma}{\bar{x}}$, is known as the coefficient of variation. As this is a ratio,
it has no dimension. It is used for comparing the variations between two groups with
different means.
$$\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100$$
Illustrative Examples
Example : 1
The mean yearly salary of employees of a company was Rs.20,000 the mean yearly
salaries of male and female employees were Rs.20,800 and Rs.16, 800 respectively. Find out
the percentage of males and females employed by the company.
Solution :
Let P1 and P2 represent percentage of males and females respectively.
Example : 2
10 – 20 15 4 -4 -16
20 – 30 25 8 -3 -24
30 – 40 35 12 -2 -24
40 – 50 45 16 -1 -16
50 – 60 55 15 00 0
60 – 70 65 10 1 10
70 – 80 75 8 2 16
80 – 90 85 5 3 15
90 – 100 95 3 4 12
$\sum f = 86$, $\sum fu = -52$
$$\text{Mean } \bar{x} = A + \frac{h}{N}\sum fu = 55 + \frac{10}{86}(-52) = 48.95 \text{ marks.}$$
Example : 3
Example : 4
$$\text{Here Median} = L + \frac{h}{f}\left(\frac{N}{2} - c\right) = 50 + \frac{10}{42}(77 - 66) = 52.62.$$
Example : 5
$$\text{Here Median} = L + \frac{h}{f}\left(\frac{N}{2} - c\right) = 50 + \frac{10}{33}(124.5 - 94) = 59.24 \text{ marks.}$$
Example : 6
Example : 7
Example : 8
Find the Mean deviation from the median of the following frequency Distribution,
Marks Mid value (x) f c.f. | x – Md | f | x – Md |
0-10 5 5 5 23 115
10-20 15 8 13 13 104
20-30 25 15 28 3 45
30-40 35 16 44 7 112
40-50 45 6 50 17 102
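$\sum f = 50$, $\sum f\,|x - M_d| = 478$
Here $\frac{N}{2} = \frac{50}{2} = 25$. The median class corresponds to c.f. 28, i.e.
the median class is 20 – 30.
$$\text{Median } M_d = L + \frac{h}{f}\left(\frac{N}{2} - C\right) = 20 + \frac{10}{15}(25 - 13) = 28.$$
$$\text{Mean Deviation from Median} = \frac{1}{N}\sum f\,|x - M_d| = \frac{478}{50} = 9.56.$$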
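The steps of this example (locate the median class, apply the median formula, then average the absolute deviations) can be sketched in Python as follows. The function name `grouped_median` is a hypothetical helper, not part of any library, and equal class widths are assumed.

```python
def grouped_median(lower_limits, h, freqs):
    """Median = L + (h/f) * (N/2 - C) for classes of equal width h."""
    N = sum(freqs)
    cum = 0  # cumulative frequency of the classes before the current one
    for L, f in zip(lower_limits, freqs):
        if cum + f > N / 2:  # first class whose c.f. exceeds N/2
            return L + h * (N / 2 - cum) / f
        cum += f
    raise ValueError("empty distribution")


# Data of the example: classes 0-10, ..., 40-50.
lower_limits = [0, 10, 20, 30, 40]
mids = [5, 15, 25, 35, 45]
freqs = [5, 8, 15, 16, 6]

md = grouped_median(lower_limits, 10, freqs)
mean_dev = sum(f * abs(x - md) for f, x in zip(freqs, mids)) / sum(freqs)
print(md, mean_dev)
```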
Example : 9
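$$\text{Mean } \bar{x} = A + \frac{h}{N}\sum fu = 47.5 + \frac{5}{143} \times 5 = 47.7$$
$$\text{S.D.} = \sigma_x = h\,\sigma_u = h\sqrt{\frac{1}{N}\sum fu^2 - \left(\frac{1}{N}\sum fu\right)^2} = 5\sqrt{\frac{1003}{143} - \left(\frac{5}{143}\right)^2} = 13.2$$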
Example : 10
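$$\text{Mean } \bar{x} = A + \frac{h}{N}\sum fu = 37.5 + \frac{5}{500} \times (-635) = 31.15$$
$$\text{S.D.} = \sigma_x = h\,\sigma_u = h\sqrt{\frac{1}{N}\sum fu^2 - \left(\frac{1}{N}\sum fu\right)^2} = 5\sqrt{\frac{2435}{500} - \left(\frac{-635}{500}\right)^2} = 9.0237187$$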
Example : 11
0 – 10 5 5 –2 – 10 20
10 – 20 8 15 –1 –8 8
20 – 30 15 25 0 0 0
30 – 40 16 35 1 16 16
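40 – 50 6 45 2 12 24
N = 50 10 68
$$\text{Mean } \bar{x} = A + \frac{h}{N}\sum fu = 25 + \frac{10}{50} \times 10 = 27$$
$$\text{S.D.} = \sigma_x = h\,\sigma_u = h\sqrt{\frac{1}{N}\sum fu^2 - \left(\frac{1}{N}\sum fu\right)^2} = 10\sqrt{\frac{68}{50} - \left(\frac{10}{50}\right)^2} = 11.489$$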
Example : 12
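$$\text{Mean } \bar{x} = A + \frac{h}{N}\sum fu = 25 + \frac{10}{100} \times 5 = 25.5$$
$$\text{S.D.} = \sigma_x = h\,\sigma_u = h\sqrt{\frac{1}{N}\sum fu^2 - \left(\frac{1}{N}\sum fu\right)^2} = 10\sqrt{\frac{129}{100} - \left(\frac{5}{100}\right)^2} = 11.346806$$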
Example : 13
Calculate the Mean and Standard Deviation of the following data giving the age
distribution of 542 members.
Age in years   No. of members (f)   Mid-value (x)   u = (x - 55)/10   fu   fu²
20 – 30 3 25 –3 –9 27
30 – 40 61 35 –2 – 122 244
40 – 50 132 45 –1 – 132 132
50 – 60 153 55 0 0 0
60 – 70 140 65 1 140 140
70 – 80 51 75 2 102 204
80 – 90 2 85 3 6 18
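N = 542, $\sum fu = -15$, $\sum fu^2 = 765$
$$\text{Mean } \bar{x} = A + \frac{h}{N}\sum fu = 55 + \frac{10}{542} \times (-15) = 54.723247$$
$$\text{S.D.} = \sigma_x = h\,\sigma_u = h\sqrt{\frac{1}{N}\sum fu^2 - \left(\frac{1}{N}\sum fu\right)^2} = 10\sqrt{\frac{765}{542} - \left(\frac{-15}{542}\right)^2} = 11.877176$$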
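The step deviation method used in this example can be written as a small Python sketch. The helper name `step_deviation` is an assumption made for illustration; the data are the mid-values and frequencies of the age distribution above.

```python
from math import sqrt


def step_deviation(mids, freqs, A, h):
    """Mean and S.D. by the step deviation method, with u = (x - A) / h."""
    N = sum(freqs)
    us = [(x - A) / h for x in mids]
    sum_fu = sum(f * u for f, u in zip(freqs, us))
    sum_fu2 = sum(f * u * u for f, u in zip(freqs, us))
    mean = A + h * sum_fu / N
    sd = h * sqrt(sum_fu2 / N - (sum_fu / N) ** 2)
    return mean, sd


mids = [25, 35, 45, 55, 65, 75, 85]
freqs = [3, 61, 132, 153, 140, 51, 2]
mean_age, sd_age = step_deviation(mids, freqs, A=55, h=10)
print(round(mean_age, 6), round(sd_age, 6))
```

Choosing A near the centre of the distribution keeps the values of u small, which is the whole point of the method when the work is done by hand.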
Example : 14
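$$\text{Mean } \bar{x} = A + \frac{h}{N}\sum fu = 25 + \frac{10}{54} \times (22) = 29.074$$
$$\text{S.D.} = \sigma_x = h\,\sigma_u = h\sqrt{\frac{1}{N}\sum fu^2 - \left(\frac{1}{N}\sum fu\right)^2} = 10\sqrt{\frac{108}{54} - \left(\frac{22}{54}\right)^2} = 13.54$$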
S.D. 13.54 is quite a large value and A.M. 29.074 is not a good average.
54
(The mean = 6 = 9 it is distorted by the usually high labourers compared to other
labourers.) or S.D. is very much deviated from arithmetic mean therefore A.M. is not
good average.
Example : 15
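$$\sigma_A = \sqrt{\frac{1}{N}\sum fd^2 - \left(\frac{1}{N}\sum fd\right)^2} = \sqrt{\frac{138}{53} - \left(\frac{-50}{53}\right)^2} = 1.31$$
$$\sigma_B = \sqrt{\frac{1}{N}\sum fd^2 - \left(\frac{1}{N}\sum fd\right)^2} = \sqrt{\frac{94}{40} - \left(\frac{-32}{40}\right)^2} = 1.3$$
$$(\text{c.v.})_A = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{1.31}{1.06} \times 100 = 123.6\%, \qquad (\text{c.v.})_B = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{1.3}{1.2} \times 100 = 108.3\%$$
Since (c.v.)B < (c.v.)A, Team B is more consistent.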
Example : 16
12 47 -39 -4 1521 16
115 12 64 -39 4096 1521
6 16 -45 -35 2025 1225
73 42 22 -9 484 81
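$$\bar{y} = A + \frac{1}{n}\sum d = 51 - \frac{240}{10} = 27$$
$$\sigma_A = \sqrt{\frac{1}{n}\sum d^2 - \left(\frac{1}{n}\sum d\right)^2} = \sqrt{\frac{1}{10}(17508) - \left(\frac{-10}{10}\right)^2} = 41.83, \quad \text{C.V.} = 83.6\%$$
$$\sigma_B = \sqrt{\frac{1}{n}\sum d^2 - \left(\frac{1}{n}\sum d\right)^2} = \sqrt{\frac{1}{10}(9302) - \left(\frac{-240}{10}\right)^2} = 18.82, \quad \text{C.V.} = 69.7\%$$
A.M. of A > A.M. of B and (C.V.)B < (C.V.)A, so B is more consistent.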
3.3 Moments, Skewness, Kurtosis :
Moments:
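The rth moment about any point A is denoted by $\mu_r'$ and is defined as
$$\mu_r' = \frac{1}{N}\sum f (x - A)^r, \quad \text{where } N = \sum f.$$
Putting r = 0, 1, 2, 3, 4, ... we get
$$\mu_0' = 1, \quad \mu_1' = \frac{1}{N}\sum f (x - A) = \bar{x} - A, \quad \mu_2' = \frac{1}{N}\sum f (x - A)^2, \quad \mu_3' = \frac{1}{N}\sum f (x - A)^3$$
and so on.
The rth moment about the mean $\bar{x}$ is denoted by $\mu_r$, so that
$$\mu_0 = \frac{1}{N}\sum f = 1, \qquad \mu_1 = \frac{1}{N}\sum f (x - \bar{x}) = 0,$$
$$\mu_2 = \frac{1}{N}\sum f (x - \bar{x})^2, \text{ which gives the variance of the distribution,}$$
$$\mu_3 = \frac{1}{N}\sum f (x - \bar{x})^3, \text{ which gives the third moment of the distribution about the mean.}$$
To relate $\mu_r$ and $\mu_r'$, let d = x – A. Then
$$\frac{1}{N}\sum fd = \frac{1}{N}\sum fx - \frac{A}{N}\sum f, \quad \text{or} \quad \bar{d} = \bar{x} - A = \mu_1'.$$
Thus $\mu_r = \frac{1}{N}\sum f (d - \bar{d})^r$. Expanding $(d - \bar{d})^r$ binomially we get
$$\mu_r = \frac{1}{N}\sum f \left( d^r - {}^rC_1\, d^{r-1}\bar{d} + {}^rC_2\, d^{r-2}\bar{d}^{\,2} - \cdots + (-1)^r \bar{d}^{\,r} \right)$$
where $\frac{1}{N}\sum fd^r = \mu_r'$ and $\frac{1}{N}\sum fd = \bar{d} = \mu_1'$. We see that $\mu_0 = 1$, $\mu_1 = 0$ and
$$\mu_2 = \mu_2' - (\mu_1')^2, \quad \mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3, \quad \mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4.$$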
If the frequency curve stretches to the right as in fig.(a), i.e. the mean is to the right of the
mode, then the distribution is right skewed or is said to have positive skewness.
If the curve stretches to the left, i.e. the mode is to the right of the mean, then the distribution is
said to have negative skewness.
The different measures of skewness are:
$$\text{i) Pearson's coefficient of skewness} = \frac{3\,(\text{mean} - \text{median})}{\text{standard deviation}}$$
$$\text{ii) Coefficient of skewness } \beta_1 = \frac{\mu_3^2}{\mu_2^3}$$
Kurtosis:
To get a complete idea of the distribution, in addition to the knowledge of mean, dispersion
and skewness, we should have an idea of the flatness or peakedness of the curve. It is
measured by the coefficient $\beta_2$, given by
$$\beta_2 = \frac{\mu_4}{\mu_2^2}.$$
The curve of fig.(a), which is neither flat nor peaked, is called the normal curve or
Mesokurtic curve. $\gamma_2 = \beta_2 - 3$ gives the excess of kurtosis. For a normal distribution $\beta_2 = 3$ and
the excess is zero. The curve of fig.(c), which is flatter than the normal curve, is called
Platykurtic and that of fig.(b), which is more peaked, is called Leptokurtic. For Platykurtic
curves $\beta_2 < 3$. For Leptokurtic curves $\beta_2 > 3$.
(Skewness: Measures the degree of asymmetry or the departure from symmetry.
Kurtosis: Measures the degree of peakedness of a distribution.)
Illustrative Examples
Example : 1
If the first four moments of a distribution about the value 5 are equal to -4, 22, -117 and
560, determine the central moments, $\beta_1$ and $\beta_2$.
Solution :
The first four moments about the arbitrary origin 5 are
$\mu_1' = -4$, $\mu_2' = 22$, $\mu_3' = -117$, $\mu_4' = 560$
$$\mu_1' = \frac{1}{N}\sum f (x - 5) = \frac{1}{N}\sum fx - 5 = \bar{x} - 5$$
$$\therefore \text{Mean} = \bar{x} = \mu_1' + 5 = -4 + 5 = 1$$
$$\mu_2 = \mu_2' - (\mu_1')^2 = 22 - (-4)^2 = 6$$
$$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 = -117 - 3(22)(-4) + 2(-4)^3 = 19.$$
$$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4$$
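The conversion from moments about an arbitrary origin to central moments is purely mechanical, so it can be sketched in a few lines of Python. The function name `central_from_raw` is a hypothetical helper; the inputs are the four moments about the value 5 given in this example, and the last two lines evaluate $\mu_4$, $\beta_1$ and $\beta_2$ from the relations stated above.

```python
def central_from_raw(m1, m2, m3, m4):
    """Central moments mu2, mu3, mu4 from moments m1..m4 about an arbitrary origin."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4


mu2, mu3, mu4 = central_from_raw(-4, 22, -117, 560)
beta1 = mu3 ** 2 / mu2 ** 3  # coefficient of skewness
beta2 = mu4 / mu2 ** 2       # coefficient of kurtosis
print(mu2, mu3, mu4, round(beta1, 4), round(beta2, 4))
```

With these inputs the sketch gives $\mu_2 = 6$, $\mu_3 = 19$ and $\mu_4 = 32$, so that $\beta_1 = 361/216 \approx 1.67$ and $\beta_2 = 32/36 \approx 0.89$.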
Example : 2
Calculate the first four moments about the mean of the given distribution. Also find
skewness ($\beta_1$) and kurtosis ($\beta_2$).
x 2.0 2.5 3.0 3.5 4.0 4.5 5.0
f 4 36 60 90 70 40 10
Solution :
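Taking A = 3.5 and $u = \frac{x - A}{h} = \frac{x - 3.5}{0.5}$,
we prepare the table with columns x, f, $u = \frac{x - 3.5}{0.5}$, fu, fu², fu³, fu⁴ for calculating $\mu_1', \mu_2', \mu_3', \mu_4', \beta_1, \beta_2$.
The totals are $N = \sum f = 310$, $\sum fu = 36$, $\sum fu^2 = 560$, $\sum fu^3 = 204$, $\sum fu^4 = 2480$.
$$\text{Here } \mu_r' = h^r\,\frac{\sum fu^r}{\sum f} = \frac{h^r}{N}\sum fu^r$$
$$\mu_1' = \frac{h}{N}\sum fu = \frac{0.5}{310} \times 36 = 0.058064$$
$$\mu_2' = \frac{h^2}{N}\sum fu^2 = \frac{(0.5)^2}{310} \times 560 = 0.451612$$
$$\mu_3' = \frac{h^3}{N}\sum fu^3 = \frac{(0.5)^3}{310} \times 204 = 0.08225$$
$$\mu_4' = \frac{h^4}{N}\sum fu^4 = \frac{(0.5)^4}{310} \times 2480 = 0.5$$
$\therefore \mu_1 = 0$,
$$\mu_2 = \mu_2' - (\mu_1')^2 = 0.451612 - (0.058064)^2 = 0.44824$$
$$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 = 0.08225 - 3(0.451612)(0.058064) + 2(0.058064)^3 = 0.0039826$$
$$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4 = 0.5 - 4(0.08225)(0.058064) + 6(0.451612)(0.058064)^2 - 3(0.058064)^4 = 0.48999.$$
$$\text{Coefficient of skewness} = \beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(0.0039826)^2}{(0.44824)^3} = 1.76549 \times 10^{-4}.$$
$$\text{Kurtosis} = \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{0.48999}{(0.44824)^2} = 2.43874.$$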
Example : 3
Calculate the first four moments of the following distribution about the mean and hence
find skewness ($\beta_1$) and kurtosis ($\beta_2$).
Solution : First we calculate the moments about the assumed mean A = 4.
x f d=x-4 fd fd2 fd3 fd4
0 1 -4 -4 16 -64 256
1 8 -3 -24 72 -216 648
2 28 -2 -56 112 -224 448
3 56 -1 -56 56 -56 56
4 70 0 0 0 0 0
5 56 1 56 56 56 56
6 28 2 56 112 224 448
7 8 3 24 72 216 648
8 1 4 4 16 64 256
N = 256 0 512 0 2816
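We know that $\mu_r' = \frac{1}{N}\sum f (x - A)^r = \frac{1}{N}\sum fd^r$.
$$\mu_1' = \frac{1}{N}\sum fd = 0$$
$$\mu_2' = \frac{1}{N}\sum fd^2 = \frac{512}{256} = 2$$
$$\mu_3' = \frac{1}{N}\sum fd^3 = 0$$
$$\mu_4' = \frac{1}{N}\sum fd^4 = \frac{2816}{256} = 11.$$
By using the relations between $\mu_r$ and $\mu_r'$, the four moments about the mean are
$\mu_1 = 0$,
$$\mu_2 = \mu_2' - (\mu_1')^2 = 2 - 0 = 2$$
$$\mu_3 = \mu_3' - 3\mu_1'\mu_2' + 2(\mu_1')^3 = 0 - (3)(2)(0) + 2(0) = 0$$
$$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4 = 11 - (4)(0)(0) + (6)(2)(0) - (3)(0) = 11$$
$$\text{Coefficient of skewness} = \beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(0)^2}{(2)^3} = 0$$
$$\text{Kurtosis} = \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{11}{(2)^2} = 2.75$$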
Example : 4
The first three moments of a distribution about the value 2 are 1, 16 and
-40. Find the mean, standard deviation and skewness of the distribution.
Solution : The first three moments about the arbitrary origin 2 are
$\mu_1' = 1$, $\mu_2' = 16$, $\mu_3' = -40$.
$$\mu_1' = \frac{1}{N}\sum f (x - 2) = \frac{1}{N}\sum fx - 2 = \bar{x} - 2$$
$$\therefore \text{Mean} = \bar{x} = \mu_1' + 2 = 1 + 2 = 3$$
$$\mu_2 = \mu_2' - (\mu_1')^2 = 16 - (1)^2 = 15$$
$$\therefore \text{S.D.} = \sigma = \sqrt{15} = 3.873$$
$$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 = -40 - 3(16)(1) + 2(1)^3 = -86$$
$$\text{Coefficient of skewness} = \beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(-86)^2}{(15)^3} = 2.19.$$
Example : 5
The first four moments about the working mean 30.2 of a distribution are 0.255, 6.222,
30.211 and 440.25. Calculate the moments about the mean.
Also evaluate skewness ($\beta_1$) and kurtosis ($\beta_2$), and comment upon the skewness and
kurtosis of the distribution.
Solution : The first four moments about the arbitrary origin 30.2 are
$\mu_1' = 0.255$, $\mu_2' = 6.222$, $\mu_3' = 30.211$, $\mu_4' = 440.25$
$$\mu_1' = \frac{1}{N}\sum f (x - 30.2) = \frac{1}{N}\sum fx - 30.2 = \bar{x} - 30.2$$
$$\therefore \text{Mean} = \bar{x} = \mu_1' + 30.2 = 0.255 + 30.2 = 30.455$$
$$\mu_2 = \mu_2' - (\mu_1')^2 = 6.222 - (0.255)^2 = 6.15698$$
$$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 = 30.211 - 3(6.222)(0.255) + 2(0.255)^3 = 25.48433$$
$$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4 = 440.25 - 4(30.211)(0.255) + 6(6.222)(0.255)^2 - 3(0.255)^4 = 411.8496.$$
$$\text{Coefficient of skewness} = \beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(25.48433)^2}{(6.15698)^3} = 2.78255$$
$$\text{Kurtosis} = \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{411.8496}{(6.15698)^2} = 10.86434.$$
$\gamma_1 = \sqrt{\beta_1} = 1.6681$. This indicates considerable skewness of the distribution.
$\gamma_2 = \beta_2 - 3 = 7.86434$. This shows that the distribution is leptokurtic (because $\beta_2 > 3$).
3.4 Correlation :
If the change in one variable affects a change in the other variable the variables are said
to be correlated and the relation between them is called correlation.
If the two variables deviate in the same direction i.e. If the increase (or decrease) in one
results in a corresponding increase (or decrease) in the other, correlation is said to be direct or
positive.
e.g. The correlation between income and expenditure is positive.
If the two variables deviate in opposite directions, i.e. if the increase (or decrease) in one results in
a corresponding decrease (or increase) in the other, correlation is said to be inverse or negative.
e.g. i) The correlation between price and demand is negative.
ii) The correlation between volume and the pressure of a perfect gas is negative.
If the deviation in one variable is followed by a corresponding proportional deviation in
the other, the correlation is said to be perfect.
3.5 Karl Pearson’s Coefficient of Correlation:
(Or Product Moment Correlation Coefficient):
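The correlation coefficient between two variables x and y, usually denoted by r(x, y) or $r_{xy}$, is
a numerical measure of the relationship between them and is defined as:
$$r = \frac{\text{cov}(x, y)}{\sigma_x \sigma_y}, \qquad \text{cov}(x, y) = \frac{1}{n}\sum xy - \bar{x}\,\bar{y},$$
$$\sigma_x = \sqrt{\frac{1}{n}\sum x^2 - (\bar{x})^2}, \qquad \sigma_y = \sqrt{\frac{1}{n}\sum y^2 - (\bar{y})^2},$$
$$\bar{x} = \frac{1}{n}\sum x, \qquad \bar{y} = \frac{1}{n}\sum y, \qquad n = \text{no. of values (or entries)}$$
Correlation formulae:
i) Frequency is not given: put u = x - a, v = y - b,
$$\bar{u} = \frac{1}{n}\sum u, \qquad \bar{v} = \frac{1}{n}\sum v, \qquad n = \text{no. of values (or entries)}$$
$$\sigma_u^2 = \frac{1}{n}\sum u^2 - (\bar{u})^2, \qquad \sigma_v^2 = \frac{1}{n}\sum v^2 - (\bar{v})^2,$$
$$\text{cov}(u, v) = \frac{1}{n}\sum uv - \bar{u}\,\bar{v}, \qquad r = \frac{\text{cov}(u, v)}{\sigma_u \sigma_v}.$$
ii) Frequency is given: with $N = \sum f$,
$$\sigma_v^2 = \frac{1}{N}\sum fv^2 - (\bar{v})^2, \qquad \text{cov}(u, v) = \frac{1}{N}\sum fuv - \bar{u}\,\bar{v}, \qquad r = \frac{\text{cov}(u, v)}{\sigma_u \sigma_v}.$$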
Illustrative Examples
Example : 1
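x     y     X = x - x̄   Y = y - ȳ   X²    Y²    XY
10    18    -10         -3          100   9     30
14    12    -6          -9          36    81    54
18    24    -2          3           4     9     -6
22    6     2           -15         4     225   -30
26    30    6           9           36    81    54
30    36    10          15          100   225   150
120   126   0           0           280   630   252
$$\text{where } \bar{x} = \frac{1}{n}\sum x = \frac{120}{6} = 20, \qquad \bar{y} = \frac{1}{n}\sum y = \frac{126}{6} = 21$$
$$r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}} = \frac{252}{\sqrt{280 \times 630}} = 0.6$$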
Example : 2
x     y     X = x - x̄   Y = y - ȳ   X²    Y²    XY
$$\bar{y} = \frac{1}{n}\sum y = \frac{323}{10} = 32.3$$
$$r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}} = \frac{-126.5}{\sqrt{82.5 \times 228.1}} = -0.9221485$$
Example : 3
Calculate Karl Pearson's coefficient of correlation from the following data, taking 100
and 50 as assumed averages of x and y respectively.
x 104 111 104 114 118 117 105 108 106 100 104 105
y 57 55 47 45 45 50 64 63 66 62 69 61
Solution :
We construct a table as follows:
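x     y     X = x - 100   Y = y - 50   X²     Y²     XY
104   57    4             7            16     49     28
111   55    11            5            121    25     55
104   47    4             -3           16     9      -12
114   45    14            -5           196    25     -70
118   45    18            -5           324    25     -90
117   50    17            0            289    0      0
105   64    5             14           25     196    70
108   63    8             13           64     169    104
106   66    6             16           36     256    96
100   62    0             12           0      144    0
104   69    4             19           16     361    76
105   61    5             11           25     121    55
Total       96            84           1128   1380   312
$$\bar{X} = \frac{1}{n}\sum X = \frac{96}{12} = 8, \qquad \bar{Y} = \frac{1}{n}\sum Y = \frac{84}{12} = 7.$$
$$\text{cov}(X, Y) = \frac{1}{n}\sum XY - \bar{X}\,\bar{Y} = \frac{1}{12}(312) - (8)(7) = 26 - 56 = -30$$
$$\sigma_X = \sqrt{\frac{1}{n}\sum X^2 - (\bar{X})^2} = \sqrt{\frac{1}{12}(1128) - (8)^2} = \sqrt{94 - 64} = \sqrt{30} = 5.48$$
$$\sigma_Y = \sqrt{\frac{1}{n}\sum Y^2 - (\bar{Y})^2} = \sqrt{\frac{1}{12}(1380) - (7)^2} = \sqrt{115 - 49} = \sqrt{66} = 8.124$$
$$r = \frac{\text{cov}(X, Y)}{\sigma_X \sigma_Y} = -\frac{30}{(5.48)(8.124)} = -\frac{30}{44.52} = -0.674.$$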
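The calculation in this example can be reproduced with a short Python sketch. The function name `pearson_r` and its parameters `a` and `b` (the assumed averages) are illustrative; because r is unaffected by a change of origin, any choice of `a` and `b` gives the same value.

```python
from math import sqrt


def pearson_r(xs, ys, a=0, b=0):
    """Karl Pearson's r using deviations u = x - a, v = y - b from assumed averages."""
    n = len(xs)
    us = [x - a for x in xs]
    vs = [y - b for y in ys]
    u_bar = sum(us) / n
    v_bar = sum(vs) / n
    cov = sum(u * v for u, v in zip(us, vs)) / n - u_bar * v_bar
    sigma_u = sqrt(sum(u * u for u in us) / n - u_bar ** 2)
    sigma_v = sqrt(sum(v * v for v in vs) / n - v_bar ** 2)
    return cov / (sigma_u * sigma_v)


x = [104, 111, 104, 114, 118, 117, 105, 108, 106, 100, 104, 105]
y = [57, 55, 47, 45, 45, 50, 64, 63, 66, 62, 69, 61]
r = pearson_r(x, y, a=100, b=50)
print(round(r, 3))
```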
Example : 4
Example : 5
x     y     X = x - x̄   Y = y - ȳ   X²    Y²    XY
Example : 6
Solution :
We construct a table as follows:
x     y     f     u = x - 19   v = y - 21   fu    fv     fu²    fv²    fuv
5     7     6     -14          -14          -84   -84    1176   1176   1176
9     9     9     -10          -12          -90   -108   900    1296   1080
15    14    13    -4           -7           -52   -91    208    637    364
19    21    20    0            0            0     0      0      0      0
24    23    16    5            2            80    32     400    64     160
28    29    11    9            8            99    88     891    704    792
32    30    7     13           9            91    63     1183   567    819
Total       82                              44    -100   4758   4444   4391
$$\bar{u} = \frac{1}{N}\sum fu = \frac{44}{82} = 0.5366, \qquad \bar{v} = \frac{1}{N}\sum fv = -\frac{100}{82} = -1.2195$$
$$\sigma_u^2 = \frac{1}{N}\sum fu^2 - (\bar{u})^2 = \frac{1}{82}(4758) - (0.5366)^2 = 57.7364, \quad \therefore \sigma_u = 7.598$$
$$\sigma_v^2 = \frac{1}{N}\sum fv^2 - (\bar{v})^2 = \frac{1}{82}(4444) - (-1.2195)^2 = 52.708, \quad \therefore \sigma_v = 7.26$$
$$\text{cov}(u, v) = \frac{1}{N}\sum fuv - \bar{u}\,\bar{v} = \frac{1}{82}(4391) - (0.5366)(-1.2195) = 54.20$$
$$r = \frac{\text{cov}(u, v)}{\sigma_u \sigma_v} = \frac{54.20}{(7.598)(7.26)} = 0.9825.$$
3.6 Regression :
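Regression is the estimation or prediction of unknown values of one variable from
known values of another variable, i.e. one is interested in the nature of the relationship
between the two variables.
Lines of Regression:
Let the equation of the line of regression of y on x be
y = a + bx ……..(i), then $\bar{y} = a + b\bar{x}$ …….(ii), $\therefore y - \bar{y} = b(x - \bar{x})$ ……..(iii)
The normal equations are $\sum y = na + b\sum x$, $\sum xy = a\sum x + b\sum x^2$ ……..(iv)
Shifting the origin to $(\bar{x}, \bar{y})$, the second normal equation becomes
$$\sum (x - \bar{x})(y - \bar{y}) = a\sum (x - \bar{x}) + b\sum (x - \bar{x})^2 \quad \text{……..(v)}$$
Since $\sum (x - \bar{x})(y - \bar{y}) = n r \sigma_x \sigma_y$, $\sum (x - \bar{x}) = 0$ and $\frac{1}{n}\sum (x - \bar{x})^2 = \sigma_x^2$, from equation (v) we get
$$n r \sigma_x \sigma_y = a(0) + b\,n\,\sigma_x^2, \quad \therefore b = r\,\frac{\sigma_y}{\sigma_x}.$$
This slope is the regression coefficient of y on x, $b_{yx} = r\frac{\sigma_y}{\sigma_x}$; similarly the regression coefficient of x on y is $b_{xy} = r\frac{\sigma_x}{\sigma_y}$.
$$\text{Now } b_{yx} \cdot b_{xy} = r\,\frac{\sigma_y}{\sigma_x} \cdot r\,\frac{\sigma_x}{\sigma_y} = r^2, \text{ then } r = \pm\sqrt{b_{yx}\, b_{xy}},$$
where r takes the common sign of $b_{yx}$ and $b_{xy}$.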
Note :
i) If r = 0 the two lines of regression become $y = \bar{y}$ and $x = \bar{x}$. These are two straight lines
parallel to the X and Y axes respectively, passing through the point of means $(\bar{x}, \bar{y})$, and they are
mutually perpendicular.
ii) If r = ± 1 the two lines of regression will coincide.
Illustrative Examples
Example : 1
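If θ be the acute angle between the two regression lines in the case of two variables x and y,
show that
$$\tan\theta = \frac{1 - r^2}{r} \cdot \frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$$
where r, $\sigma_x$, $\sigma_y$ have their usual meaning. Explain the
significance when r = 0 and r = ± 1.
Solution :
The lines of regression are
$$(y - \bar{y}) = b_{yx}(x - \bar{x}) \quad \text{……….(1)}$$
$$(x - \bar{x}) = b_{xy}(y - \bar{y}), \quad \therefore (y - \bar{y}) = \frac{1}{b_{xy}}(x - \bar{x}) \quad \text{………..(2)}$$
Here the slopes are $m_1 = r\frac{\sigma_y}{\sigma_x}$ and $m_2 = \frac{1}{r}\frac{\sigma_y}{\sigma_x}$. Now $\tan\theta = \frac{m_2 - m_1}{1 + m_1 m_2}$, so
$$\tan\theta = \frac{\frac{1}{r}\frac{\sigma_y}{\sigma_x} - r\frac{\sigma_y}{\sigma_x}}{1 + r\frac{\sigma_y}{\sigma_x} \cdot \frac{1}{r}\frac{\sigma_y}{\sigma_x}} = \frac{1 - r^2}{r} \cdot \frac{\sigma_y}{\sigma_x} \cdot \frac{\sigma_x^2}{\sigma_x^2 + \sigma_y^2} = \frac{1 - r^2}{r} \cdot \frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2} \quad \text{………..(3)}$$
i) If r = 0, there is no linear relationship between the two variables, i.e. they are uncorrelated.
On putting r = 0 in (3) we get $\tan\theta = \infty$, i.e. $\theta = \frac{\pi}{2}$, so the lines (1) and (2) are
perpendicular.
ii) If r = 1 or - 1, on putting these values of r in (3) we get tan θ = 0 or θ = 0, i.e. lines (1)
and (2) coincide. The correlation between the two variables is perfect.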
Example : 2
Calculate the coefficient of correlation, obtain the least square regression line y on x for
the following data.
x 1 2 3 4 5 6 7 8 9
y 9 8 10 12 11 13 14 16 15
Also obtain an estimate of y which should correspond on the average to x = 6.2.
Solution : We construct a table as follows:
x     y     u = x - x̄   v = y - ȳ   u²    v²    uv
1 9 -4 -3 16 9 12
2 8 -3 -4 9 16 12
3 10 -2 -2 4 4 4
4 12 -1 0 1 0 0
5 11 0 -1 0 1 0
6 13 1 1 1 1 1
7 14 2 2 4 4 4
8 16 3 4 9 16 12
9 15 4 3 16 9 12
45 108 0 0 60 60 57
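$$\bar{x} = \frac{1}{n}\sum x = \frac{45}{9} = 5, \qquad \bar{y} = \frac{1}{n}\sum y = \frac{108}{9} = 12.$$
The coefficient of correlation is given by
$$r = \frac{\sum uv}{\sqrt{\sum u^2 \sum v^2}} = \frac{57}{\sqrt{(60)(60)}} = 0.95.$$
$$\sigma_u = \sqrt{\frac{1}{n}\sum u^2 - (\bar{u})^2} = \sqrt{\frac{1}{9}(60) - (0)^2} = 2.582, \qquad \sigma_v = \sqrt{\frac{1}{n}\sum v^2 - (\bar{v})^2} = \sqrt{\frac{1}{9}(60) - (0)^2} = 2.582.$$
$$b_{vu} = r\,\frac{\sigma_v}{\sigma_u} = \frac{(0.95)(2.582)}{2.582} = 0.95$$
The line of regression of v on u is $v - \bar{v} = b_{vu}(u - \bar{u})$
$\therefore [(y - \bar{y}) - 0] = 0.95\,[(x - \bar{x}) - 0]$
y - 12 = (0.95)(x - 5) ∴ y = 0.95x + 7.25
When x = 6.2, y = 0.95(6.2) + 7.25 = 13.14.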
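The least squares line of y on x obtained in this example can be verified with a minimal Python sketch. The function name `regression_y_on_x` is a hypothetical helper; it uses the slope $b = \sum uv / \sum u^2$ with deviations taken from the means, which is equivalent to $b = r\,\sigma_y/\sigma_x$.

```python
def regression_y_on_x(xs, ys):
    """Return (a, b) of the least squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_uv = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_uu = sum((x - x_bar) ** 2 for x in xs)
    b = s_uv / s_uu        # regression coefficient of y on x
    a = y_bar - b * x_bar  # the line passes through (x_bar, y_bar)
    return a, b


x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [9, 8, 10, 12, 11, 13, 14, 16, 15]
a, b = regression_y_on_x(x, y)
estimate = a + b * 6.2
print(round(a, 2), round(b, 2), round(estimate, 2))
```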
Example : 3
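$$x = \frac{18}{40}\,y + \frac{214}{40}, \quad b_{xy} = \frac{18}{40}, \text{ then } r = \sqrt{b_{yx} \times b_{xy}} = \sqrt{\frac{8}{10} \times \frac{18}{40}} = 0.6$$
Given the variance of x is $\sigma_x^2 = 9$, S.D. = $\sigma_x = 3$. The S.D. of y is
$$\sigma_y = b_{yx}\,\frac{\sigma_x}{r} = \frac{8}{10} \times \frac{3}{0.6} = 4.$$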
Example : 4
Example : 5
Example : 6
Following are the marks of students in a particular subject. Find the average marks.
Marks 10 - 12 12 - 14 14 - 16 16 - 18 18 - 20 20 - 22 22 - 24 24 - 26
Students 3 6 10 15 24 42 75 90
26 - 28 28 - 30 30 - 32 32 - 34 34 - 36 36 - 38 38 - 40 40 - 42
79 55 36 26 19 13 9 7
Solution :
First prepare the table, where a = 25, h = 2
Class   Students (f)   Mid value (x)   fx   d = x - 25   fd   u = (x - 25)/2   fu
10 - 12 3 11 33 - 14 - 42 -7 - 21
12 - 14 6 13 78 - 12 - 72 -6 - 36
14 - 16 10 15 150 - 10 - 100 -5 - 50
16 - 18 15 17 255 -8 - 120 -4 - 60
18 - 20 24 19 456 -6 - 144 -3 - 72
20 - 22 42 21 882 -4 - 168 -2 - 84
22 - 24 75 23 1725 -2 - 150 -1 - 75
24 - 26 90 25 2250 0 0 0 0
26 - 28 79 27 2133 2 158 1 79
28 - 30 55 29 1595 4 220 2 110
30 - 32 36 31 1116 6 216 3 108
32 - 34 26 33 858 8 208 4 104
34 - 36 19 35 665 10 190 5 95
36 - 38 13 37 481 12 156 6 78
38 - 40 9 39 351 14 126 7 63
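40 - 42 7 41 287 16 112 8 56
$\sum f = 509$, $\sum fx = 13315$, $\sum fd = 590$, $\sum fu = 295$
$$\text{i) Direct method: mean} = \bar{x} = \frac{\sum fx}{\sum f} = \frac{13315}{509} = 26.16$$
$$\text{ii) Short cut method: mean} = \bar{x} = a + \frac{\sum fd}{\sum f} = 25 + \frac{590}{509} = 26.16$$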
Example : 7
Following are the monthly wages of workers. Find the average monthly wages.
Wages 0 - 10 10 - 20 20 - 30 30 -40 40 - 50 50 - 60 60 – 70
Workers 5 12 30 45 50 37 21
Solution :
Wages   Workers (f)   Midpoint (x)   fx   d = x - 35   fd   u = (x - 35)/10   fu
0 - 10 5 5 25 - 30 - 150 -3 - 15
10 - 20 12 15 180 - 20 - 240 -2 - 24
20 - 30 30 25 750 - 10 - 300 -1 - 30
30 - 40 45 35 1575 0 0 0 0
40 - 50 50 45 2250 10 500 1 50
50 - 60 37 55 2035 20 740 2 74
60 - 70 21 65 1365 30 630 3 63
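$\sum f = 200$, $\sum fx = 8180$, $\sum fd = 1180$, $\sum fu = 118$
$$\text{i) Direct method: mean} = \bar{x} = \frac{\sum fx}{\sum f} = \frac{8180}{200} = 40.9$$
$$\text{ii) Short cut method: mean} = \bar{x} = a + \frac{\sum fd}{\sum f} = 35 + \frac{1180}{200} = 40.9$$
$$\text{iii) Step deviation method: mean} = \bar{x} = a + h\,\frac{\sum fu}{\sum f} = 35 + 10 \times \frac{118}{200} = 40.9$$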
Example : 8
Solution :
f = 72, fx = 2185, fd = - 515, fu = -101
Where h = 5, d = x – a = x – 35
fx 2185
i) x= = 72 = 30.34
f
fd - 515
ii) x=a+ = 35 + = 30.34
f 72
fu
iii) a + h = 35 + 72 5 = 30.34
x=
- 101
f
Example : 9
Weight of articles 0 - 10 10 - 20 20 - 30 30 - 40 40 - 50 50 - 60
No. of articles 14 17 22 26 23 16
Solution :
f = 120, fx = , fd = 210, fu = 162
x = mean by all three methods = 31.75
3.7 Example of Median, Quartiles :
Median :
If the given values of a distribution are arranged in ascending or descending order of
magnitude, then
i) Median = middle item, if the number of observations is odd
= mean of the middle two values, if the number of observations is even.
Illustrative Examples
Example : 1
Example : 2
Formulae
1. $N = \sum f$
2. $Q_1 = L + \dfrac{\frac{N}{4} - C}{f}\,h$
3. $Q_2 = L + \dfrac{\frac{N}{2} - C}{f}\,h$
4. $Q_3 = L + \dfrac{\frac{3N}{4} - C}{f}\,h$
Where L = lower limit of the class interval in which Q1 | Q2 | Q3 lies
f = frequency of that respective class interval
h = width of that respective class interval
C = cumulative frequency of the class preceding the class in which
Q1 | Q2 | Q3 lies.
Illustrative Examples
Example : 1
Calculate the median, lower and upper quartiles from the following data.
Class 10 - 20 20 - 30 30 - 40 40 - 50 50 – 60
Frequency 7 6 9 2 6
Solution :
Given frequency distribution table is as below.
Class Frequency (f) Cumulative frequency (c)
10 - 20 7 7
20 - 30 6 13
30 - 40 9 22
40 - 50 2 24
50 - 60 6 30
f = N = 30
To find Q1 :
N = 30, $\frac{N}{4} = \frac{30}{4} = 7.5$, which lies in 20 – 30
L = 20, f = 6, h = 10, c = cumulative frequency of the preceding class = 7
$$Q_1 = L + \frac{\frac{N}{4} - C}{f}\,h = 20 + \frac{7.5 - 7}{6}(10) = 20.83$$
Q1 = 20.83
To find Q2 :
$\frac{N}{2} = \frac{30}{2} = 15$, which lies in 30 – 40
L = 30, f = 9, c = 13
$$Q_2 = L + \frac{\frac{N}{2} - C}{f}\,h = 30 + \frac{15 - 13}{9}(10) = 32.22$$
Q2 = 32.22
To find Q3 :
$\frac{3N}{4} = \frac{90}{4} = 22.5$, which lies in 40 – 50
L = 40, f = 2, c = 22
$$Q_3 = L + \frac{\frac{3N}{4} - C}{f}\,h = 40 + \frac{22.5 - 22}{2}(10) = 42.5$$
Q3 = 42.5
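All three quartiles follow the same recipe, so they can be computed by one routine. The following Python sketch assumes equal class widths; the function name `grouped_quantile` and its `fraction` parameter (1/4, 1/2 or 3/4) are illustrative choices, not notation from the text.

```python
def grouped_quantile(lower_limits, h, freqs, fraction):
    """L + ((fraction*N - C) / f) * h for the class containing the quantile."""
    N = sum(freqs)
    target = fraction * N
    cum = 0  # cumulative frequency C of the preceding classes
    for L, f in zip(lower_limits, freqs):
        if cum + f > target:  # first class whose c.f. exceeds the target
            return L + (target - cum) / f * h
        cum += f
    raise ValueError("fraction must be below 1 and the data non-empty")


# Classes 10-20, 20-30, 30-40, 40-50, 50-60 with their frequencies.
lower_limits = [10, 20, 30, 40, 50]
freqs = [7, 6, 9, 2, 6]

q1 = grouped_quantile(lower_limits, 10, freqs, 1 / 4)
q2 = grouped_quantile(lower_limits, 10, freqs, 1 / 2)
q3 = grouped_quantile(lower_limits, 10, freqs, 3 / 4)
print(round(q1, 2), round(q2, 2), round(q3, 2))
```

For this data the targets are 7.5, 15 and 22.5, which fall in the classes 20 – 30, 30 – 40 and 40 – 50 respectively.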
Example : 2
Calculate the median, lower and upper quartiles from the following distribution
Class 5 - 10 10 - 15 15 - 20 20 - 25 25 - 30 30 - 35 35 - 40 40 – 45
Frequency 5 6 15 10 5 4 2 2
Solution :
Given distribution table can be written as
Class 5 - 10 10 - 15 15 - 20 20 - 25 25 - 30 30 - 35 35 - 40 40 – 45
Frequency 5 6 15 10 5 4 2 2
Cumulative
5 11 26 36 41 45 47 49
Frequency
$N = \sum f = 49$
i) $\frac{N}{4} = \frac{49}{4} = 12.25$, which lies in 15 – 20
L = 15, f = 15, c = 11
$$Q_1 = L + \frac{\frac{N}{4} - C}{f}\,h = 15 + \frac{\frac{49}{4} - 11}{15}(5) = 15.42$$
Q1 = 15.42
ii) $\frac{N}{2} = \frac{49}{2} = 24.5$, which lies in 15 - 20
L = 15, f = 15, c = 11
$$Q_2 = L + \frac{\frac{N}{2} - C}{f}\,h = 15 + \frac{24.5 - 11}{15}(5) = 19.5$$
Q2 = 19.5
iii) $\frac{3N}{4} = \frac{(3)(49)}{4} = 36.75$, which lies in 25 – 30
L = 25, f = 5, c = 36
$$Q_3 = L + \frac{\frac{3N}{4} - C}{f}\,h = 25 + \frac{36.75 - 36}{5}(5) = 25.75$$
Q3 = 25.75
Exercise No. 1
Ex. 1 Calculate the median, lower and upper quartiles from the following data
Class 10 - 20 20 - 30 30 - 40 40 - 50 50 – 60
Frequency 8 7 10 3 7
Ex. 2 Calculate the median, lower and upper quartiles from the following table
Class 3-8 8 - 13 13 - 18 18 - 23 23 - 27 27 - 32 32 - 37 37 - 42
Frequency 12 88 58 17 23 29 18 5
Ex. 3 Calculate the median, lower and upper quartiles from the following table
Weight 70-80 80-90 90-100 100-110 110-120 120-130 130-140 140-150
No. of Persons 12 18 35 42 50 45 20 08
10 8 1 8 1 8 80
11 5 2 10 4 20 55
12 4 3 12 9 36 48
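$\sum f = 48$, $\sum fd = 0$, $\sum fd^2 = 124$, $\sum fx = 432$
$$\text{i) Mean} = \bar{x} = a + \frac{\sum fd}{\sum f} = 9 + \frac{0}{48} = 9$$
$$\text{ii) Mean} = \bar{x} = \frac{\sum fx}{\sum f} = \frac{432}{48} = 9$$
$$\text{iii) S.D.} = \sigma = \sqrt{\frac{\sum fd^2}{\sum f} - \frac{(\sum fd)^2}{(\sum f)^2}} = \sqrt{\frac{124}{48} - \frac{(0)^2}{(48)^2}} = 1.607$$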
Example : 2
Class   Frequencies (f)   Midpoint (x)   u = (x - 35)/10   fu   u²   fu²
0 - 10 3 5 -3 -9 9 27
10 - 20 16 15 -2 - 32 4 64
20 - 30 26 25 -1 - 26 1 26
30 - 40 31 35 0 0 0 0
40 - 50 16 45 1 16 1 16
50 - 60 8 55 2 16 4 32
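$\sum f = 100$, $\sum fu = -35$, $\sum fu^2 = 165$
We have
$$\text{S.D.} = \sigma = h\sqrt{\frac{\sum fu^2}{\sum f} - \left(\frac{\sum fu}{\sum f}\right)^2} = 10\sqrt{\frac{165}{100} - \left(\frac{-35}{100}\right)^2} = 12.36$$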
Example : 3
0 - 10 15 5 -3 - 45 9 135
10 - 20 15 15 -2 - 30 4 60
20 - 30 23 25 -1 - 23 1 23
30 - 40 22 35 0 0 0 0
40 - 50 25 45 1 25 1 25
50 - 60 10 55 2 20 4 40
60 - 70 5 65 3 15 9 45
70 - 80 10 75 4 40 16 160
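$\sum f = 125$, $\sum fu = 2$, $\sum fu^2 = 488$
$$\sigma = h\sqrt{\frac{\sum fu^2}{\sum f} - \left(\frac{\sum fu}{\sum f}\right)^2} = 10\sqrt{\frac{488}{125} - \left(\frac{2}{125}\right)^2} = 19.76$$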
Example : 4
Here
47 + 12 + 16 + 42 + 4 + 51 + 37 + 48 + 13 + 0 270
x = = 10 = 27
10
Prepare a table
x 47 12 16 42 4 51 37 48 13 0 270
d = x - 27 20 - 15 - 11 15 - 23 24 10 21 - 14 - 27 0
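d² 400 225 121 225 529 576 100 441 196 729 3542
$$\sigma = \text{S.D.} = \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2} = \sqrt{\frac{3542}{10} - \left(\frac{0}{10}\right)^2} = 18.82$$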
Example : 5
Example : 6
Here
47 + 12 + 16 + 42 + 4 + 51 + 37 + 48 + 13 + 0 270
x = = 10 = 27
10
Prepare a table
x 47 12 16 42 4 51 37 48 13 0 270
d = x - 27 20 - 15 - 11 15 - 23 24 10 21 - 14 - 27 0
d2 400 225 121 225 529 576 100 441 196 729 3542
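$$\sigma = \text{S.D.} = \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2} = \sqrt{\frac{3542}{10} - \left(\frac{0}{10}\right)^2} = 18.82$$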
Exercise No. 2
Ex. 1 Calculate the standard deviation and mean for following data.
Marks 0 - 10 10 - 20 20 - 30 30 - 40 40 - 50 50 - 60 60 – 70
Students 5 12 30 45 50 37 21
[Hint : $N = \sum f = 200$, a = 35, $u = \frac{x - 35}{10}$, $\sum fu = 118$, $\sum fu^2 = 510$,
$\bar{x} = 40.9$, σ = S.D. = 14.839]
Ex. 2 The annual salaries of a group of employees are given below.
Salaries (1000) 45 50 55 60 65 70 75 80
Persons 3 5 8 7 9 7 4 7
[Hint: $N = \sum f = 50$, a = 60, h = 5, $u = \frac{x - 60}{5}$, $\sum fu = 36$, $\sum fu^2 = 240$,
$\sigma = h\sqrt{\frac{\sum fu^2}{\sum f} - \left(\frac{\sum fu}{\sum f}\right)^2} = 10.35$]
Ex. 3 Calculate the standard deviation for the following data
Class 100-109 110-119 120-129 130-139 140-149 150-159 160-169 170-179
= 12.35]
Ex. 5 Calculate the standard deviation for the following data
Heights 20 - 25 25 - 30 30 - 35 35 - 40 40 - 45 45 – 50
No. of Girls 170 110 80 45 40 35
[Hint: $N = \sum f = 480$, a = 32.5, h = 5, $\sum fu = -220$, $\sum fu^2 = 1310$,
σ = 7.936]
Example : 9
Employees 2 2 1 30 12 1 1 1
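[Hint: a = 60, h = 5, $u = \frac{x - 60}{5}$, $\sum f = N = 50$, $\sum fu = 10$, $\sum fu^2 = 68$,
$\sigma = h\sqrt{\frac{\sum fu^2}{\sum f} - \left(\frac{\sum fu}{\sum f}\right)^2} = 5\sqrt{\frac{68}{50} - \left(\frac{10}{50}\right)^2} = 5\sqrt{1.32} = 5.74$]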
3.10 Examples on Coefficient of Variation :
If σ and $\bar{x}$ respectively denote the standard deviation and mean of data A, then the
coefficient of variation of A is
$$\text{cov}(A) = \frac{\sigma}{\bar{x}} \times 100$$
Note :
1. If (mean of data A) > (mean of data B),
then A has the better average (e.g. team A scores more runs).
2. If cov (A) > cov (B),
then i) Group B is more consistent
ii) Group A has more variability
Illustrative Examples
Example : 1
Two brands of tyres are tested with the following results of their life in kilometers
Life in kms (000) 20 - 25 25 - 30 30 - 35 35 - 40 40 – 45
Brand A 1 22 64 10 3
Brand B 0 24 76 0 0
Determine:
(i) Which brand tyres have greater average life
(ii) Which brand tyres will be preferred for use
Solution :
First we should find the mean and S.D. for both brands
Brand A :
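$$\bar{x}_A = a + \frac{\sum fu}{\sum f}\,h = 32.5 + \frac{-8}{100} \times 5 = 32.1$$
$$\sigma_A = h\sqrt{\frac{\sum fu^2}{\sum f} - \left(\frac{\sum fu}{\sum f}\right)^2} = 5\sqrt{\frac{48}{100} - \left(\frac{-8}{100}\right)^2} = 3.441$$
$$\text{cov}(A) = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{3.441}{32.1} \times 100 = 10.72$$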
For Brand B :
Life   Frequency (f)   Midpoint (x)   u = (x - 32.5)/5   fu   u²   fu²
20 - 25 0 22.5 -2 0 4 0
25 - 30 24 27.5 -1 - 24 1 24
30 - 35 76 32.5 0 0 0 0
35 - 40 0 37.5 1 0 1 0
40 - 45 0 42.5 2 0 4 0
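$\sum f = 100$, $\sum fu = -24$, $\sum fu^2 = 24$
$$\bar{x}_B = a + \frac{\sum fu}{\sum f}\,h = 32.5 + \frac{-24}{100} \times 5 = 31.3$$
$$\sigma_B = h\sqrt{\frac{\sum fu^2}{\sum f} - \left(\frac{\sum fu}{\sum f}\right)^2} = 5\sqrt{\frac{24}{100} - \left(\frac{-24}{100}\right)^2} = 2.136$$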
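$$\text{cov}(B) = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{2.136}{31.3} \times 100 = 6.824$$
Conclusion :
i) $\bar{x}_A > \bar{x}_B$: Brand A tyres have the greater average life.
ii) cov (A) > cov (B): Brand B tyres are more consistent and will be preferred to Brand A.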
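The comparison of the two brands can be reproduced with a short Python sketch that applies the step deviation method to each frequency column and then forms the coefficient of variation. The function name `mean_sd_cv` is a hypothetical helper introduced only for this illustration.

```python
from math import sqrt


def mean_sd_cv(mids, freqs, a, h):
    """Mean, S.D. and coefficient of variation via u = (x - a) / h."""
    N = sum(freqs)
    us = [(x - a) / h for x in mids]
    sum_fu = sum(f * u for f, u in zip(freqs, us))
    sum_fu2 = sum(f * u * u for f, u in zip(freqs, us))
    mean = a + h * sum_fu / N
    sd = h * sqrt(sum_fu2 / N - (sum_fu / N) ** 2)
    return mean, sd, sd / mean * 100


mids = [22.5, 27.5, 32.5, 37.5, 42.5]  # midpoints of 20-25, ..., 40-45
brand_a = [1, 22, 64, 10, 3]
brand_b = [0, 24, 76, 0, 0]

mean_a, sd_a, cv_a = mean_sd_cv(mids, brand_a, a=32.5, h=5)
mean_b, sd_b, cv_b = mean_sd_cv(mids, brand_b, a=32.5, h=5)
print(round(mean_a, 1), round(cv_a, 2))
print(round(mean_b, 1), round(cv_b, 2))
```

A higher mean answers the question of average life, while the smaller coefficient of variation identifies the more uniform brand.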
Example : 2
Played by A 15 12 07 06 05 03
Played by B 18 12 06 03 02 01
Solution :
For team A,
15 + 12 + 7 + 6 + 5 + 3 48
xA = 6 = 6 =8
For team B,
18 + 12 + 6 + 3 + 2 + 1 42
xB = = =7
6 6
Table
A (x) d=x-8 d2 B (y) D=y-7 D2
15 7 49 18 11 121
12 4 16 12 5 25
7 -1 1 6 -1 1
6 -2 4 3 -4 16
5 -3 9 2 -5 25
3 -5 25 1 -6 36
48 0 104 42 0 224
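Team A :
$\bar{x}_A = 8$
$$\sigma_A = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = \sqrt{\frac{104}{6} - \left(\frac{0}{6}\right)^2} = \sqrt{17.333} = 4.163$$
$$\text{cov}(A) = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{4.163}{8} \times 100 = 52.04$$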
Team B :
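$\bar{x}_B = 7$
$$\sigma_B = \sqrt{\frac{\sum D^2}{N} - \left(\frac{\sum D}{N}\right)^2} = \sqrt{\frac{224}{6} - \left(\frac{0}{6}\right)^2} = \sqrt{37.333} = 6.110$$
$$\text{cov}(B) = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{6.110}{7} \times 100 = 87.29$$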
Team A Played 15 10 07 05 03 02
Team B Played 20 10 05 04 02 01
Calculate the coefficient of variation and state which team is more consistent
Solution :
Hint : $\bar{x}_A = 7$, $\bar{x}_B = 7$, $\sigma_A = 4.43$, $\sigma_B = 6.48$
Example : 4
Example : 5
Find the standard deviation and coefficient of variation for the following data.
Marks 0 - 10 10 - 20 20 - 30 30 - 40 40 - 50 50 - 60 60 - 70 70 – 80
Students 12 18 35 42 50 45 20 08
[Hint: σ = 17.26, $\bar{x} = 40.43$,
$\text{cov}(A) = \frac{\sigma}{\bar{x}} \times 100 = \frac{17.26}{40.43} \times 100 = 42.69$]
Example : 6
Scores obtained by two batsmen A and B in 10 matches are given below. State, which batsman
has good average of runs and which player is more consistent.
A 30 66 60 34 20 38 44 62 80 46
B 34 70 55 48 45 30 46 38 60 34
Solution :
Player A : $\bar{x}_A = 48$, $\sigma_A = 17.776$, cov (A) = 37.03
Player B : $\bar{x}_B = 46$, $\sigma_B = 12.107$, cov (B) = 26.32
(i) Batsman A has the better average, since $\bar{x}_A > \bar{x}_B$.
(ii) Batsman B is more consistent, since cov (B) < cov (A).
Example : 8
Following table gives the marks obtained in a paper of mathematics out of 50 by the
students of D2 divisions A and B.
Class 0 - 5 5 - 10 10 - 15 15 - 20 20 - 25 25 - 30 30 - 35 35 - 40 40 - 45 45 - 50
A 2 6 8 8 15 18 12 11 9 4
B 3 5 7 9 12 16 11 5 6 2
State which division has more variability.
Solution :
Division A:
$\bar{x}_A = 26.854$, $\sigma_A = 11.173$, cov (A) = 41.60
Division B:
$\bar{x}_B = 24.934$, $\sigma_B = 10.927$, cov (B) = 43.824
Division B has greater variability and division A is more consistent.
Example : 9
Find the mean, standard deviation and coefficient of variation for the following data.
Marks obtained up to 10 20 30 40 50 60 70 80
mean = $\bar{x} = 40.43$, $\sigma = 17.26$, cov = 42.69
3.11 Examples on Combined Mean :
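Let $\bar{x}_1$ and $\sigma_1$ be the mean and standard deviation of a sample of size $n_1$, and $\bar{x}_2$, $\sigma_2$ be the mean and standard deviation of a sample of size $n_2$.
Then
i) $\bar{x}_{12}$ = combined mean of two samples = $\frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$
ii) $\bar{x}_{123}$ = combined mean of three samples = $\frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + n_3 \bar{x}_3}{n_1 + n_2 + n_3}$
iii) $\sigma_{12}$ = combined standard deviation = $\sqrt{\frac{n_1 \sigma_1^2 + n_2 \sigma_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}}$
Where $d_1 = |\bar{x}_1 - \bar{x}_{12}|$, $d_2 = |\bar{x}_2 - \bar{x}_{12}|$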
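As an illustration of the combined mean and combined standard deviation formulas, here is a minimal sketch. The function name `combined_mean_sd` is an assumption made for this example; the inputs are the two-firm figures (500 and 600 workers, means 3000 and 3500, S.D. 88 and 120) used in the worked example of this section.

```python
import math

def combined_mean_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Combined mean and S.D. of two samples from their summary statistics."""
    total = n1 + n2
    mean12 = (n1 * mean1 + n2 * mean2) / total
    # Distances of each sample mean from the combined mean.
    d1 = abs(mean1 - mean12)
    d2 = abs(mean2 - mean12)
    variance12 = (n1 * sd1**2 + n2 * sd2**2 + n1 * d1**2 + n2 * d2**2) / total
    return mean12, math.sqrt(variance12)

mean12, sd12 = combined_mean_sd(500, 3000, 88, 600, 3500, 120)
print(round(mean12, 2), round(sd12, 2))
```

The sketch keeps the combined mean unrounded when computing d1 and d2, so its S.D. can differ in the second decimal place from a hand calculation that first rounds the mean to 3273.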
Example : 1
An analysis of monthly wages paid to workers in two firms A and B, belonging to the same
domain of production, is given below.
A B
No. of Workers 500 600
Average salary 3000 3500
S. D. of distribution of wages 88 120
Determine :
i) Which firm pays larger on salaries
ii) Which firm has more consistency
iii) Which firm has greater variability
iv) What is average of salary if A and B taken together
v) What is S. D. of individual worker if A and B taken together
Solution :
i) Total monthly payment of A = 500 × 3000 = 15,00,000
Total monthly payment of B = 600 × 3500 = 21,00,000
Firm B pays the larger amount on salaries.
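ii), iii) $\bar{x}_A = 3000$, $\bar{x}_B = 3500$, $\sigma_A = 88$, $\sigma_B = 120$
$\text{cov}(A) = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{88}{3000} \times 100 = 2.93$
$\text{cov}(B) = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{120}{3500} \times 100 = 3.43$
cov (A) < cov (B)
Firm B has greater variability and
Firm A is more consistent.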
iv) Average salary = combined mean of A and B
$\bar{x}_{12} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} = \frac{500(3000) + 600(3500)}{500 + 600} = \frac{1500000 + 2100000}{1100} = \frac{3600000}{1100} = 3272.73 \approx 3273$
v) $d_1 = |\bar{x}_1 - \bar{x}_{12}| = |3000 - 3273| = 273$
$d_2 = |\bar{x}_2 - \bar{x}_{12}| = |3500 - 3273| = 227$
Combined S.D. = $\sigma_{12} = \sqrt{\frac{n_1 \sigma_1^2 + n_2 \sigma_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}}$
$= \sqrt{\frac{(500)(88)^2 + (600)(120)^2 + (500)(273)^2 + (600)(227)^2}{500 + 600}}$
$= \sqrt{\frac{(500 \times 7744) + (600 \times 14400) + (500)(74529) + (600)(51529)}{1100}}$
$= \sqrt{\frac{80693900}{1100}}$
$= \sqrt{73358.09}$
$= 270.85$
Example : 2
ii) $\bar{x}_{12} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} = \frac{(500)(186) + (600)(175)}{500 + 600} = 180$
$d_1 = |\bar{x} - 180| = |175 - 180| = 5$
$d_2 = |\bar{y} - 180| = |186 - 180| = 6$
Combined S.D. = $\sigma_{12} = \sqrt{\frac{n_1 \sigma_1^2 + n_2 \sigma_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}}$
$= \sqrt{\frac{60000 + 40500 + 15000 + 18000}{1100}}$
$= 11.02$
Example : 3
Combined mean = $\bar{x}_{12} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} = 1422.92$
$d_1 = 27.08$, $d_2 = 22.92$
$\sigma_{12} = \sqrt{\frac{n_1 d_1^2 + n_2 d_2^2 + n_1 \sigma_1^2 + n_2 \sigma_2^2}{n_1 + n_2}}$
= 12578
Example : 4
The number of employees, the average wages per employee and the variance of the wages per
employee are given below for two factories A and B.
A B
No. of employees 100 150
Average wages of employee 3200 2800
Variance of wages 625 729
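$\mu_r = \frac{1}{N} \sum (x_i - \bar{x})^r$, for ungrouped data
$\mu_r = \frac{1}{N} \sum f_i (x_i - \bar{x})^r$, for a frequency distribution
II) rth moment about any value "a"
$\mu'_r = \frac{1}{N} \sum f_i d_i^r$, where $d_i = x_i - a$
$\mu'_r = h^r \frac{1}{N} \sum f_i u_i^r$, where $u_i = \frac{x_i - a}{h}$
Note :
1. $\mu_r = \frac{\sum f_i (x_i - \bar{x})^r}{N} = \frac{\sum f_i (x_i - \bar{x})^r}{\sum f_i}$
2. $\mu_0 = \frac{\sum f_i (x_i - \bar{x})^0}{N} = \frac{\sum f_i}{\sum f_i} = 1$, so $\mu_0 = 1$
3. $\mu_1 = \frac{\sum f_i (x_i - \bar{x})}{N} = \frac{\sum f_i x_i}{N} - \bar{x} \frac{\sum f_i}{N} = \bar{x} - \bar{x} \frac{N}{N} = \bar{x} - \bar{x} = 0$, so $\mu_1 = 0$
4. $\mu'_0 = 1$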
b) $\mu'_1 = \bar{x} - A$
c) $\mu'_2 = \mu_2 + (\mu'_1)^2$
d) $\mu'_3 = \mu_3 + 3 \mu_2 \mu'_1 + (\mu'_1)^3$
e) $\mu'_4 = \mu_4 + 4 \mu_3 \mu'_1 + 6 \mu_2 (\mu'_1)^2 + (\mu'_1)^4$
Illustrative Examples
Example : 1
The first four moments of a variable about "4" are -1.5, 17, -30, 108. Find the first four
moments about the mean and hence $\beta_1$, $\beta_2$, $\bar{x}$, $\sigma$ and the variance.
Solution :
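Given a = 4, $\mu'_1 = -1.5$, $\mu'_2 = 17$, $\mu'_3 = -30$, $\mu'_4 = 108$
$\mu_1 = 0$
$\mu_2 = \mu'_2 - (\mu'_1)^2 = 17 - (-1.5)^2 = 14.75$
$\mu_3 = \mu'_3 - 3 \mu'_2 \mu'_1 + 2 (\mu'_1)^3 = 39.75$
$\mu_4 = \mu'_4 - 4 \mu'_3 \mu'_1 + 6 \mu'_2 (\mu'_1)^2 - 3 (\mu'_1)^4 = 142.3125$
$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(39.75)^2}{(14.75)^3} = 0.4924$
$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{142.3125}{(14.75)^2} = 0.6541$
$\bar{x} = a + \mu'_1 = 4 + (-1.5) = 2.5$
$\sigma = \sqrt{\mu_2} = \sqrt{14.75} = 3.84$
Variance = $\sigma^2 = \mu_2 = 14.75$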
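The conversion from moments about an arbitrary value to moments about the mean is mechanical, so it is easy to check by program. The sketch below is only an illustration; the function name `central_from_raw` is hypothetical, and the inputs are the four moments about 4 given in this example.

```python
def central_from_raw(m1, m2, m3, m4):
    """Convert moments about an arbitrary value into moments about the mean.

    m1..m4 are the first four moments about the arbitrary value.
    Returns (mu2, mu3, mu4, beta1, beta2).
    """
    mu2 = m2 - m1**2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1**3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1**2 - 3 * m1**4
    beta1 = mu3**2 / mu2**3   # coefficient of skewness
    beta2 = mu4 / mu2**2      # coefficient of kurtosis
    return mu2, mu3, mu4, beta1, beta2

mu2, mu3, mu4, beta1, beta2 = central_from_raw(-1.5, 17, -30, 108)
mean = 4 + (-1.5)  # mean = a + first moment about a
print(mu2, mu3, mu4, round(beta1, 4), round(beta2, 4), mean)
```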
Example : 2
The first four moments about the working mean "44.5" of a distribution are -0.4, 2.99, -0.08,
27.63. Calculate the moments about the mean, $\beta_1$, $\beta_2$, the mean, S.D. and variance.
Solution :
$\mu_1 = 0$, $\mu_2 = 2.83$, $\mu_3 = 3.38$, $\mu_4 = 30.295$, $\beta_1 = 0.504$, $\beta_2 = 3.782$,
Mean $\bar{x} = a + \mu'_1 = 44.5 + (-0.4) = 44.1$, variance = $\sigma^2 = \mu_2 = 2.83$, $\sigma = 1.682$
Example : 3
Find the coefficients of skewness and kurtosis if the first four moments about "5" are 2, 20, 40, 50.
Hence find $\bar{x}$, $\sigma$ and the variance.
Solution :
$\mu_1 = 0$, $\mu_2 = 16$, $\mu_3 = -64$, $\mu_4 = 162$,
$\bar{x} = 7$, $\sigma = \sqrt{\mu_2} = 4$
Variance = $\sigma^2 = 16$, $\beta_1 = 1$, $\beta_2 = 0.6328$
Example : 4
Solution :
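$\bar{x} = a + \mu'_1 = 3$, $\mu_1 = 0$, $\mu_2 = 1.5$, $\mu_3 = 0$, $\mu_4 = 6$
$\sigma = \sqrt{\mu_2} = \sqrt{1.5} = 1.224$, $\sigma^2 = 1.5$, $\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{0}{(1.5)^3} = 0$
$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{6}{(1.5)^2} = \frac{6}{2.25} = 2.67$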
Example : 5
The first 03 moments about ‶2″ of a distribution are 1, 16, - 40. Find first 03 moments
about mean and hence find
x, , , 2
1
The first four moments about the working mean "30.5" of a distribution are 0.0375, 0.4546,
0.0609, 0.5074. Calculate the moments about the mean and hence $\beta_1$, $\beta_2$, $\bar{x}$, $\sigma$.
Solution :
$\mu_1 = 0$, $\mu_2 = 0.3139$, $\mu_3 = 0.0098$, $\mu_4 = 0.3033$, $\beta_1 = 1.1143 \times 10^{-4}$
$\beta_2 = 0.3349$, $\bar{x} = 30.5 + 0.0375 = 30.5375$, $\sigma = \sqrt{0.3139} = 0.5602$.
Example : 7
The first four moments about the working mean (44.5) of a distribution are -0.4, 2.99,
-0.08 and 27.63. Calculate the moments about the mean, $\bar{x}$, $\beta_1$, $\beta_2$.
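Find the coefficients of skewness and kurtosis for moments taken about the point "48" if d = x - 48, $\sum f = 100$,
$\sum fd = 50$, $\sum fd^2 = 1970$, $\sum fd^3 = 2948$, $\sum fd^4 = 86752$.
Solution :
We know (i) $\mu'_1 = \frac{\sum fd}{\sum f} = 0.5$
(ii) $\mu'_2 = \frac{\sum fd^2}{\sum f} = 19.7$
(iii) $\mu'_3 = \frac{\sum fd^3}{\sum f} = 29.48$
(iv) $\mu'_4 = \frac{\sum fd^4}{\sum f} = 867.52$
$\mu_1 = 0$, $\mu_2 = 19.45$, $\mu_3 = 0.18$, $\mu_4 = 837.92$,
$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(0.18)^2}{(19.45)^3} = \frac{0.0324}{7357.98} = 4.40 \times 10^{-6}$
$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{837.92}{(19.45)^2} = 2.21$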
Example : 9
Calculate the first 04 moments about the mean and also find the values of 1, 2. From the
following distribution.
Marks 0 - 10 10 - 20 20 - 30 30 - 40 40 - 50 50 - 60 60 – 70
Students 8 12 20 30 15 10 5
Solution :
First we prepare the table with assumed mean a = 35 and h = 10, so that u = (x - 35)/10.
Marks     Midpoint (x)    f     u     fu    u²   fu²    u³    fu³    u⁴    fu⁴
0 - 10         5          8    -3    -24    9    72    -27   -216    81    648
10 - 20       15         12    -2    -24    4    48     -8    -96    16    192
20 - 30       25         20    -1    -20    1    20     -1    -20     1     20
30 - 40       35         30     0      0    0     0      0      0     0      0
40 - 50       45         15     1     15    1    15      1     15     1     15
50 - 60       55         10     2     20    4    40      8     80    16    160
60 - 70       65          5     3     15    9    45     27    135    81    405
Total          -        100     -    -18    -   240      -   -102     -   1440
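$\mu'_1 = h \frac{\sum fu}{\sum f} = 10 \times \frac{-18}{100} = -1.8$
$\mu'_2 = h^2 \frac{\sum fu^2}{\sum f} = 100 \times \frac{240}{100} = 240$
$\mu'_3 = h^3 \frac{\sum fu^3}{\sum f} = 1000 \times \frac{-102}{100} = -1020$
$\mu'_4 = h^4 \frac{\sum fu^4}{\sum f} = 10000 \times \frac{1440}{100} = 144000$
$\mu_1 = 0$, $\mu_2 = 236.76$, $\mu_3 = 264.336$, $\mu_4 = 141290.11$,
$\beta_1 = 0.005$, $\beta_2 = 2.52$.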
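The step-deviation calculation above can be reproduced with a short program. This is a minimal sketch under the same assumptions as the example (assumed mean a = 35, class width h = 10); the function name `grouped_raw_moments` is illustrative only.

```python
def grouped_raw_moments(midpoints, freqs, a, h):
    """First four moments about the assumed mean a, using u = (x - a) / h."""
    n = sum(freqs)
    u = [(x - a) / h for x in midpoints]
    # r-th moment about a: h^r * sum(f * u^r) / N, for r = 1..4.
    return [h**r * sum(f * ui**r for f, ui in zip(freqs, u)) / n
            for r in range(1, 5)]

midpoints = [5, 15, 25, 35, 45, 55, 65]
freqs = [8, 12, 20, 30, 15, 10, 5]

m1, m2, m3, m4 = grouped_raw_moments(midpoints, freqs, a=35, h=10)

# Moments about the mean, from the moments about a.
mu2 = m2 - m1**2
mu3 = m3 - 3 * m2 * m1 + 2 * m1**3
mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1**2 - 3 * m1**4
beta1 = mu3**2 / mu2**3
beta2 = mu4 / mu2**2
print(m1, m2, m3, m4)
print(round(mu2, 2), round(mu3, 3), round(mu4, 2), round(beta1, 3), round(beta2, 2))
```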
Example : 10
From the following data, calculate the moments about (i) assumed mean ‶25″ (ii) actual mean.
Class 0 - 10 10 - 20 20 - 30 30 - 40
Frequency 1 3 4 2
Solution :
Prepare a table with h = 10, a = 25, $u = \frac{x - 25}{10}$, $\sum f = 10$,
f 1 3 7 3 1
Solution :
We have $\bar{x} = \frac{\sum fx}{\sum f} = \frac{2 + 9 + 28 + 15 + 6}{15} = \frac{60}{15} = 4$ = actual mean
15 f 15
Prepare a table
x f u=x-4 fu u2 fu2 u3 fu3 u4 fu4
2 1 -2 -2 4 4 -8 -8 16 16
3 3 -1 -3 1 3 -1 -3 1 3
4 7 0 0 0 0 0 0 0 0
5 3 1 3 1 3 1 3 1 3
6 1 2 2 4 4 8 8 16 16
- 15 - 0 - 14 - 0 - 38
Solution :
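We have h = 1. Since u is measured from the actual mean, these are the moments about the mean.
$\mu_1 = h \frac{\sum fu}{\sum f} = 0$
$\mu_2 = h^2 \frac{\sum fu^2}{\sum f} = \frac{14}{15} = 0.933$
$\mu_3 = h^3 \frac{\sum fu^3}{\sum f} = 0$
$\mu_4 = h^4 \frac{\sum fu^4}{\sum f} = \frac{38}{15} = 2.533$
$\sigma = \sqrt{\mu_2} = \sqrt{0.933} = 0.966$,
$\beta_1 = 0$, $\beta_2 = 2.91$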
3.13 Examples on Correlation :
1. The distribution for one variate x is called univariate distribution.
2. The distribution involving more than one variate is bivariate distribution.
3. If a change in one variate x affects the other variate y, then x and y are called
correlated.
4. If increase in x increases y then correlation is positive.
5. If decrease in x decreases y then correlation is positive.
6. If increase (decrease) in x decreases (increases) y then correlation is negative.
7. If x and y are the only two variates and if $\frac{y}{x}$ = constant, then the correlation is called linear or
perfect; otherwise it is called nonlinear correlation.
8. The formula to measure the intensity or degree of linear relationship between two
variates (variables) was developed by Karl Pearson and is called the correlation coefficient.
9. Correlation coefficient of x, y = r = r (x, y) = $\frac{\text{cov}(x, y)}{\sigma_x \sigma_y}$
Where cov (x, y) = covariance of x, y = $\frac{\sum xy}{n} - \frac{\sum x}{n} \cdot \frac{\sum y}{n}$
Correlation coefficient = r = r (x, y) = $\frac{n \sum xy - (\sum x)(\sum y)}{\sqrt{n \sum x^2 - (\sum x)^2} \sqrt{n \sum y^2 - (\sum y)^2}}$
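The last formula translates directly into code. The following is a minimal sketch; the function name `pearson_r` is illustrative, and the data are the x and y values of Example 1 below. It also shows that shifting the origin (here by 69 and 112) leaves r unchanged.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's correlation coefficient from paired observations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    numerator = n * sxy - sx * sy
    denominator = math.sqrt(n * sxx - sx**2) * math.sqrt(n * syy - sy**2)
    return numerator / denominator

x = [78, 89, 99, 60, 59, 79, 68, 61]
y = [125, 137, 156, 112, 107, 136, 123, 108]

r = pearson_r(x, y)
# Same calculation on deviations from arbitrary origins 69 and 112.
r_shifted = pearson_r([v - 69 for v in x], [v - 112 for v in y])
print(round(r, 2), round(r_shifted, 2))
```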
Illustrative Examples
Example : 1
Calculate the correlation coefficient between x and y from the following data.
x 78 89 99 60 59 79 68 61
y 125 137 156 112 107 136 123 108
Solution : First we prepare the table, with X = x - 69 and Y = y - 112.
  x    X = x - 69     X²      y    Y = y - 112     Y²      XY
 78         9          81    125        13         169     117
 89        20         400    137        25         625     500
 99        30         900    156        44        1936    1320
 60        -9          81    112         0           0       0
 59       -10         100    107        -5          25      50
 79        10         100    136        24         576     240
 68        -1           1    123        11         121     -11
 61        -8          64    108        -4          16      32
Total      41        1727      -       108        3468    2248
We have n = 8
$r = \frac{n \sum XY - (\sum X)(\sum Y)}{\sqrt{n \sum X^2 - (\sum X)^2} \sqrt{n \sum Y^2 - (\sum Y)^2}}$
$= \frac{8(2248) - (41)(108)}{\sqrt{8(1727) - (41)^2} \sqrt{8(3468) - (108)^2}}$
$= \frac{13556}{13968.95} = 0.97$
Note:
(Here 69 and 112 are chosen arbitrarily as origins to reduce the size of the variates and so
simplify the calculation; r is unaffected by a change of origin.)
Example : 2
8 16 64 256 128
7 14 49 196 98
6 13 36 169 78
5 11 25 121 55
4 12 16 144 48
3 10 9 100 30
2 8 4 64 16
1 9 1 81 9
Following data includes the production data of two items x and y year wise. Calculate
the correlation coefficient of x and y.
Year 2002 2003 2004 2005 2006 2007 2008 2009
x 100 102 104 107 105 112 103 99
y 15 12 13 11 12 12 19 26
Solution :
We have
$\bar{x} = \frac{100 + 102 + 104 + 107 + 105 + 112 + 103 + 99}{8} = \frac{832}{8} = 104$
$\bar{y} = \frac{15 + 12 + 13 + 11 + 12 + 12 + 19 + 26}{8} = \frac{120}{8} = 15$
First we prepare the table, with X = x - 104 and Y = y - 15.
  x    X = x - 104    X²     y    Y = y - 15     Y²     XY
 100       -4         16    15        0           0      0
 102       -2          4    12       -3           9      6
 104        0          0    13       -2           4      0
 107        3          9    11       -4          16    -12
 105        1          1    12       -3           9     -3
 112        8         64    12       -3           9    -24
 103       -1          1    19        4          16     -4
  99       -5         25    26       11         121    -55
Total       0        120   120        0         184    -92
$r = \frac{n \sum XY - (\sum X)(\sum Y)}{\sqrt{n \sum X^2 - (\sum X)^2} \sqrt{n \sum Y^2 - (\sum Y)^2}}$
$= \frac{8(-92) - 0}{\sqrt{8(120) - 0} \sqrt{8(184) - 0}}$
$= \frac{-92}{\sqrt{120} \sqrt{184}} = -0.619$
Example : 4
Exercise No. 3
Example : 1
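Where $\bar{x}$ = mean value of x, $\bar{y}$ = mean value of y
$b_{yx} = r \frac{\sigma_y}{\sigma_x}$ = regression coefficient of y on x
and r is the correlation coefficient of x and y
(ii) Line of regression of x on y :
$x - \bar{x} = r \frac{\sigma_x}{\sigma_y} (y - \bar{y})$
$x - \bar{x} = b_{xy} (y - \bar{y})$
$x = (b_{xy}) y + k$
Where $b_{xy} = r \frac{\sigma_x}{\sigma_y}$ = regression coefficient of x on y
Note :
1. $(b_{xy})(b_{yx}) = r \frac{\sigma_x}{\sigma_y} \cdot r \frac{\sigma_y}{\sigma_x} = r^2$, so $r^2 = (b_{xy})(b_{yx})$
2. The correlation coefficient r is the geometric mean of the two regression coefficients.
3. The point $(\bar{x}, \bar{y})$ always lies on both lines of regression.
4. $b_{yx} = r \frac{\sigma_y}{\sigma_x} = \frac{\text{cov}(x, y)}{\sigma_x \sigma_y} \cdot \frac{\sigma_y}{\sigma_x} = \frac{\text{cov}(x, y)}{(\sigma_x)^2}$
5. $b_{xy} = r \frac{\sigma_x}{\sigma_y} = \frac{\text{cov}(x, y)}{\sigma_x \sigma_y} \cdot \frac{\sigma_x}{\sigma_y} = \frac{\text{cov}(x, y)}{(\sigma_y)^2}$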
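Given the two means, the two standard deviations and r, both regression lines follow from the coefficients defined above. The sketch below is an illustration only; the function name `regression_lines` is hypothetical, and the inputs (means 36 and 85, S.D. 11 and 8, r = 0.66) are those of Example 5 of this section.

```python
def regression_lines(mean_x, mean_y, sd_x, sd_y, r):
    """Return ((b_yx, c_y), (b_xy, c_x)) for the lines
    y = b_yx * x + c_y  (y on x)  and  x = b_xy * y + c_x  (x on y)."""
    b_yx = r * sd_y / sd_x
    b_xy = r * sd_x / sd_y
    # Both lines pass through the point (mean_x, mean_y).
    c_y = mean_y - b_yx * mean_x
    c_x = mean_x - b_xy * mean_y
    return (b_yx, c_y), (b_xy, c_x)

(b_yx, c_y), (b_xy, c_x) = regression_lines(36, 85, 11, 8, 0.66)
print(f"y = {c_y:.2f} + {b_yx:.2f} x")
print(f"x = {c_x:.4f} + {b_xy:.4f} y")
# The product of the two regression coefficients equals r squared.
print(round(b_yx * b_xy, 4))
```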
Illustrative Examples
Example : 1
Example : 2
Example : 3
Find the correlation coefficient between x and y if two lines of regression are
2x – 9y + 6 = 0, x – 2y + 1 = 0
Solution :
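Suppose the line of regression of x on y is 2x - 9y + 6 = 0, so that the line of y on x is x - 2y + 1 = 0.
$x = \frac{9}{2} y - 3$, so $b_{xy} = \frac{9}{2}$
$y = \frac{1}{2} x + \frac{1}{2}$, so $b_{yx} = \frac{1}{2}$
$r^2 = (b_{xy})(b_{yx}) = \frac{9}{2} \cdot \frac{1}{2} = \frac{9}{4}$
$r = \pm \frac{3}{2}$, so $|r| > 1$, which is wrong as $-1 \le r \le 1$.
Hence the line of regression of x on y is x - 2y + 1 = 0 and the line of y on x is 2x - 9y + 6 = 0.
$x = 2y - 1$, so $b_{xy} = 2$
$y = \frac{2}{9} x + \frac{2}{3}$, so $b_{yx} = \frac{2}{9}$
$r^2 = (b_{xy})(b_{yx}) = (2) \left(\frac{2}{9}\right) = \frac{4}{9}$
$r = +\frac{2}{3}$ (positive, since both regression coefficients are positive)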
Example : 5
If $\bar{x} = 36$, $\bar{y} = 85$, $\sigma_x = 11$, $\sigma_y = 8$, r = 0.66,
then find the lines of regression of x on y and y on x.
Solution :
1. Line of x on y is $x - \bar{x} = r \frac{\sigma_x}{\sigma_y} (y - \bar{y})$
$x - 36 = (0.66) \frac{11}{8} (y - 85)$
x = -41.1375 + 0.9075 y
2. Line of y on x is $y - \bar{y} = r \frac{\sigma_y}{\sigma_x} (x - \bar{x})$
$y - 85 = (0.66) \frac{8}{11} (x - 36)$
y = 67.72 + 0.48 x
Example : 6
y 9 8 10 12 11 13 14 16 15
Solution :
$\bar{x} = \frac{1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9}{9} = \frac{45}{9} = 5$,
$\bar{y} = \frac{9 + 8 + 10 + 12 + 11 + 13 + 14 + 16 + 15}{9} = \frac{108}{9} = 12$
Construct a table with X = x - 5 and Y = y - 12.
 x    X = x - 5    X²     y    Y = y - 12    Y²    XY
 1       -4        16     9       -3          9    12
 2       -3         9     8       -4         16    12
 3       -2         4    10       -2          4     4
 4       -1         1    12        0          0     0
 5        0         0    11       -1          1     0
 6        1         1    13        1          1     1
 7        2         4    14        2          4     4
 8        3         9    16        4         16    12
 9        4        16    15        3          9    12
Total     0        60     -        0         60    57
i) We know $r = \frac{n \sum XY - (\sum X)(\sum Y)}{\sqrt{n \sum X^2 - (\sum X)^2} \sqrt{n \sum Y^2 - (\sum Y)^2}} = \frac{9(57)}{\sqrt{9(60)} \sqrt{9(60)}} = \frac{57}{60} = 0.95$
ii) $\sigma_x = \sqrt{\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2} = \sqrt{\frac{60}{9} - 0} = \sqrt{\frac{60}{9}}$
iii) $\sigma_y = \sqrt{\frac{\sum Y^2}{n} - \left(\frac{\sum Y}{n}\right)^2} = \sqrt{\frac{60}{9}}$
iv) Regression line of y on x is
$y - \bar{y} = r \frac{\sigma_y}{\sigma_x} (x - \bar{x})$
$y - 12 = (0.95) \frac{\sqrt{60/9}}{\sqrt{60/9}} (x - 5)$
y = (0.95) x + 7.25
v) Regression line of x on y is
$x - \bar{x} = r \frac{\sigma_x}{\sigma_y} (y - \bar{y})$
$x - 5 = (0.95) \frac{\sqrt{60/9}}{\sqrt{60/9}} (y - 12)$
x = (0.95) y - 6.4
vi) y(6.2) = (0.95) (6.2) + 7.25 = 13.14
Example : 7
y (20) = 1.1
x (10) = 16.4 – 13
x (10) = 3.4
Example : 8
Calculate the regression lines of x on y and y on x if following table gives the scores in
aptitude test and productivity of 10 workers selected at random.
Score 60 62 65 70 72 48 53 73 65 82
Productivity 68 60 62 80 85 40 52 62 60 81
Solution :
We have
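$\bar{x} = \frac{650}{10} = 65$, $\bar{y} = \frac{650}{10} = 65$
Let X = x - 65, Y = y - 65; then $\sum X = 0$, $\sum Y = 0$, $\sum X^2 = 894$, $\sum Y^2 = 1752$, $\sum XY = 1044$
$b_{xy} = r \frac{\sigma_x}{\sigma_y} = \frac{\sum XY}{\sum Y^2} = \frac{1044}{1752} = 0.596$
$b_{yx} = r \frac{\sigma_y}{\sigma_x} = \frac{\sum XY}{\sum X^2} = \frac{1044}{894} = 1.168$
Regression line of x on y is x = 26.26 + 0.596 y
Regression line of y on x is y = -10.92 + 1.168 x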
Example : 9
Regression line of y on x is $y - \bar{y} = b_{yx} (x - \bar{x})$
y - 51.57 = (0.942) (x - 48.29)
y = (0.942) x + 6.08
y (x = 70) = (0.942) (70) + 6.08 = 65.94 + 6.08 = 72.02
Exercise No. 4
Ex. 1 Find $\bar{x}$, $\bar{y}$ and r from the following lines of regression.
i) y = 0.516x + 33.73, x = 0.512y + 32.52
[Ans: r = 0.514, $\bar{x}$ = 67.6, $\bar{y}$ = 68.61]
[Ans:
x = - 29,
3 15 3
y = 19 , r = 9 ]
Ex. 2 If the two regression coefficients are 0.8 and 0.4, what is the correlation coefficient?
[Hint: $r^2 = (b_{xy})(b_{yx}) = (0.8)(0.4) = 0.32$, so r = +0.566, taking the positive root since both regression coefficients are positive]
3.15 Examples on least square method for curve fitting :
1. The normal equations for fitting the line y = ax + b to data having "n"
observations are
$\sum y = a \sum x + nb$
$\sum xy = a \sum x^2 + b \sum x$
2. The normal equations for fitting the parabolic curve
$y = ax^2 + bx + c$ to data having "n" observations are
$\sum y = a \sum x^2 + b \sum x + nc$
$\sum xy = a \sum x^3 + b \sum x^2 + c \sum x$
$\sum x^2 y = a \sum x^4 + b \sum x^3 + c \sum x^2$
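Both sets of normal equations are linear systems in the unknown coefficients, so one small routine can build and solve them for a line or a parabola. The sketch below is an illustration in plain Python; the names `solve_linear` and `fit_polynomial` are hypothetical, and coefficients are returned in ascending powers (constant term first), which differs from the ordering of a, b, c in the text. The two data sets are those of Example 1 and Example 2 that follow.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_polynomial(xs, ys, degree):
    """Least-squares fit y = c0 + c1 x + ... via the normal equations."""
    m = degree + 1
    # power_sums[k] = sum of x^k; power_sums[0] equals n.
    power_sums = [sum(x**k for x in xs) for k in range(2 * degree + 1)]
    A = [[power_sums[i + j] for j in range(m)] for i in range(m)]
    b = [sum((x**i) * y for x, y in zip(xs, ys)) for i in range(m)]
    return solve_linear(A, b)

# Straight line: expected y = 3 + 2x.
line = fit_polynomial([-1, 0, 1, 2, 3, 4, 6], [1, 3, 5, 7, 9, 11, 15], 1)
# Parabola: expected y = 3 + 0x - x^2.
parabola = fit_polynomial([-2, -1, 0, 1, 2, 3], [-1, 2, 3, 2, -1, -6], 2)
print([round(c, 6) for c in line])
print([round(c, 6) for c in parabola])
```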
Example : 1
  x      y     x²     xy
 -1      1      1     -1
  0      3      0      0
  1      5      1      5
  2      7      4     14
  3      9      9     27
  4     11     16     44
  6     15     36     90
 15     51     67    179
We have n = 7
Normal equations are
$\sum y = a \sum x + nb$
$\sum xy = a \sum x^2 + b \sum x$
51 = 15a + 7b
179 = 67a + 15b
Multiplying the first equation by 15 and the second by 7,
225a + 105b = 765
469a + 105b = 1253
Subtracting, 244a = 488
a = 2
Secondly 7b = 51 - 15a
= 51 - 30
= 21
b = 3
Required line is y = 2x + 3
Example : 2
$\sum x^2 y = a \sum x^2 + b \sum x^3 + c \sum x^4$
Construct a table
  x      y     x²     x³     x⁴     xy    x²y
 -2     -1      4     -8     16      2     -4
 -1      2      1     -1      1     -2      2
  0      3      0      0      0      0      0
  1      2      1      1      1      2      2
  2     -1      4      8     16     -2     -4
  3     -6      9     27     81    -18    -54
Σ = 3   -1     19     27    115    -18    -58
We get 6a + 3b + 19c = -1
3a + 19b + 27c = -18
19a + 27b + 115c = -58
Solving we get, a = 3, b = 0, c = -1
Required parabola is y = 3 - x²
Exercise No. 5
Example : 1
x 0 1 3 6 8
y 1 3 2 5 4
Ans. : y = 1.6 + (0.38)x
Example : 4
Example : 6
6 5 36 30
7 5 49 35
7 4 49 28
8 5 64 40
8 4 64 32
8 3 64 24
9 4 81 36
9 3 81 27
10 3 100 30
Example : 4
Where $\bar{x} = 2.5$, h = 0.5
$X = \frac{x - 2.5}{0.5}$
So the parabola of fit y = a + bx + cx² becomes
y = a + bX + cX² …(i)
The normal equations are $\sum y_i = na + b \sum X_i + c \sum X_i^2$
$\sum X_i y_i = a \sum X_i + b \sum X_i^2 + c \sum X_i^3$
$\sum X_i^2 y_i = a \sum X_i^2 + b \sum X_i^3 + c \sum X_i^4$ …(ii)
The values of $\sum X_i$, $\sum y_i$ etc. are calculated below :
x    y    X = (x - 2.5)/0.5    X²    X³    X⁴    Xy    X²y
$y = a + b \left(\frac{x - 2.5}{0.5}\right) + c \left(\frac{x - 2.5}{0.5}\right)^2$
y = a + b (2x - 5) + c (2x - 5)²
y = 2.07 + 0.511 (2x - 5) + 0.061 (2x - 5)²
y = 2.07 + 1.022x - 2.555 + 0.244x² - 1.22x + 1.525
y = 1.04 - 0.198x + 0.244x²
which is the required equation of the parabola.
Example : 5
Where $\bar{x} = 2$, h = 1
X = x - 2
So the parabola of fit y = a + bx + cx² becomes
y = a + bX + cX² …(i)
The normal equations are
$\sum y_i = na + b \sum X_i + c \sum X_i^2$
$\sum X_i y_i = a \sum X_i + b \sum X_i^2 + c \sum X_i^3$
$\sum X_i^2 y_i = a \sum X_i^2 + b \sum X_i^3 + c \sum X_i^4$ …(ii)
The values of $\sum X_i$, $\sum y_i$ etc. are calculated below.
 x      y     X = x - 2    X²    X³    X⁴     Xy     X²y
 0      1        -2         4    -8    16    -2       4
 1     1.8       -1         1    -1     1    -1.8     1.8
 2     1.3        0         0     0     0     0       0
 3     2.5        1         1     1     1     2.5     2.5
 4     6.3        2         4     8    16    12.6    25.2
Σy = 12.9, ΣX = 0, ΣX² = 10, ΣX³ = 0, ΣX⁴ = 34, ΣXy = 11.3, ΣX²y = 33.5
The equations (ii) become 12.9 = 5a + 10c …(iii)
11.3 = 10b …(iv)
33.5 = 10a + 34c …(v)
From (iv), b = 1.13
Multiplying (iii) by 2 and subtracting from (v)
we get, a = 1.48, c = 0.55
Equation (i) becomes,
y = a + b (x - 2) + c (x - 2)²
y = 1.48 + 1.13 (x - 2) + 0.55 (x - 2)²
y = 1.48 + 1.13x - 2.26 + 0.55x² - 2.2x + 2.2
y = 1.42 - 1.07x + 0.55x²
which is the required equation of the parabola.
Example : 6
-2 4 4 -8 16 -8 16
-1 1 1 -1 1 -1 1
0 2 0 0 0 0 0
1 7 1 1 1 7 7
2 15 4 8 16 30 60
3 30 9 27 81 90 270
Σx = 0, Σy = 71, Σx² = 28, Σx³ = 0, Σx⁴ = 196, Σxy = 82, Σx²y = 462
The equations (ii) become 71 = 7c + 28a …(iii)
82 = 28b …(iv)
462 = 28c + 196a …(v)
From (iv), b = 2.93
Multiplying (iii) by 4 and subtracting from (v), 178 = 84a
we get, a = 2.12, c = 1.67
a = 2.12, b = 2.93, c = 1.67
Equation (i) becomes,
y = 2.12x² + 2.93x + 1.67
which is the required equation of the parabola.
Example : 7
3 11 9 33
Σx = 6, Σy = 26, Σx² = 14, Σxy = 54
The equations (i) become 26 = 4a + 6b
and 54 = 6a + 14b
i.e. 13 = 2a + 3b …(ii)
27 = 3a + 7b …(iii)
Multiplying (ii) by 3 and (iii) by 2, then subtracting (ii) from (iii)
we get 5b = 15, so b = 3 and a = 2
Hence the required line of best fit is
y = 2 + 3x
Example : 8
Fit a straight line of the type y = a + bx to the following data. Nov – 2017
x: 0 5 10 15 20 25
y: 12 15 17 22 24 30
Solution :
Let the straight line be y = a + bx
Then the normal equations are $\sum y_i = 6a + b \sum x_i$
$\sum x_i y_i = a \sum x_i + b \sum x_i^2$ …(i)
The values of $\sum x_i$, $\sum y_i$ etc. are calculated below :
  x      y      x²      xy
  0     12       0       0
  5     15      25      75
 10     17     100     170
 15     22     225     330
 20     24     400     480
 25     30     625     750
Σx = 75, Σy = 120, Σx² = 1375, Σxy = 1805
The equations (i) become 120 = 6a + 75b
and 1805 = 75a + 1375b
i.e. 40 = 2a + 25b …(ii)
361 = 15a + 275b …(iii)
Where
x = 2, h = 1
Example : 10
Where $\bar{x} = 2$, h = 1
X = x - 2
So the parabola of fit y = a + bx + cx² becomes
y = a + bX + cX² …(i)
The normal equations are
$\sum y_i = na + b \sum X_i + c \sum X_i^2$
$\sum X_i y_i = a \sum X_i + b \sum X_i^2 + c \sum X_i^3$
$\sum X_i^2 y_i = a \sum X_i^2 + b \sum X_i^3 + c \sum X_i^4$ …(ii)
The values of $\sum X_i$, $\sum y_i$ etc. are calculated below.
 x     y    X = x - 2    X²    X³    X⁴    Xy    X²y
 0     1       -2         4    -8    16    -2      4
 1     2       -1         1    -1     1    -2      2
 2     4        0         0     0     0     0      0
 3     8        1         1     1     1     8      8
 4    16        2         4     8    16    32     64
Σy = 31, ΣX = 0, ΣX² = 10, ΣX³ = 0, ΣX⁴ = 34, ΣXy = 36, ΣX²y = 78
The equations (ii) become 31 = 5a + 10c …(iii)
36 = 10b …(iv)
78 = 10a + 34c …(v)
From (iv), b = 3.6
X = x - 2, (h = 1)
Let X = x - 2, so that the parabola of fit y = a + bx + cx²
becomes y = a + bX + cX² …(i)
The normal equations are
$\sum y_i = na + b \sum X_i + c \sum X_i^2$
$\sum X_i y_i = a \sum X_i + b \sum X_i^2 + c \sum X_i^3$
$\sum X_i^2 y_i = a \sum X_i^2 + b \sum X_i^3 + c \sum X_i^4$ …(ii)
The values of $\sum X_i$, $\sum y_i$ etc. are calculated below.
 x     y    X = x - 2    X²    X³    X⁴    Xy    X²y
 0     1       -2         4    -8    16    -2      4
 1     0       -1         1    -1     1     0      0
 2     3        0         0     0     0     0      0
 3    10        1         1     1     1    10     10
 4    21        2         4     8    16    42     84
Σy = 35, ΣX = 0, ΣX² = 10, ΣX³ = 0, ΣX⁴ = 34, ΣXy = 50, ΣX²y = 98
The equations (ii) become 35 = 5a + 10c …(iii)
50 = 10b …(iv)
98 = 10a + 34c …(v)
From (iv), b = 5
Multiplying (iii) by 2 and subtracting from (v)
we get, a = 3, c = 2
Equation (i) becomes,
y = a + b (x - 2) + c (x - 2)²
y = 3 + 5 (x - 2) + 2 (x - 2)²
y = 3 + 5x - 10 + 2x² - 8x + 8
y = 1 - 3x + 2x²
which is the required equation of the parabola.
Multiplying (iii) by 2 and subtracting from (v)
we get, c = 1.142, a = 3.916
Equation (i) becomes,
y = a + b (x - 2) + c (x - 2)²
y = 3.916 + 3.6 (x - 2) + 1.142 (x - 2)²
y = 3.916 + 3.6x - 7.2 + 1.142x² - 4.568x + 4.568
y = 1.284 - 0.968x + 1.142x²
which is the required equation of the parabola.
Exercise No. 6
1. Fit a straight line to the following data
x: 0 1 2 3 4
y: 1 1.8 3.3 4.5 6.3
Ans. : y = 0.72 + 1.33x
2. Find the best values of a, b assuming that the following values of x, y are connected by
the relation y = a + bx
x: 0 1 2 3 4
y: 1 2.9 4.8 6.7 8.6
Ans. : a = 1, b = 1.9
3. Show that the line of fit to the following data is given by y = 3.9 + 1.5x.
x: 1 2 3 4 5
y: 5 7 9 10 11
4. Obtain the least squares line fit to the following data.
x: 0 6 8 10 14 16 18 20
y: 3 12 15 18 24 27 30 33
Ans. : y = 1.5x + 3
8. By the method of least square, find the straight line that best fits the following data.
May – 2017
x: 1 2 3 4 5
y: 14 27 40 55 68
Ans. : y = 13.6x
9. Fit a straight line to the following data by least square method.
x: 1 2 3 4 5 6
y: 6 4 3 5 4 2
Ans. : y = 5.7999 – 0.514x
10. Fit a second degree parabola for the following data.
x: 1 2 3 4
y: 1.7 1.8 2.3 3.2
Ans. : y = 2 – 0.5x + 0.2x2
11. Fit a least square straight line to the following data.
x: 2 7 9 1 5 12
y: 13 21 23 14 15 21
Ans. : y = 12.45 + 0.8977x
12. Fit a least square straight line to the following data.
x: 2 4 6 8 10 12
y: 1.8 1.5 1.4 1.1 1.1 0.9
Ans. : y = 1.9 – 0.086x
13. Fit a least square straight line to the following data.
x: 20 60 100 140 180 220 260 300 340 380
y: 0.18 0.37 0.35 0.78 0.56 0.75 1.18 1.36 1.17 1.65
Ans. : y = 0.069 + 0.0038x
14. Fit a straight line to the following data by least square method.
x: 1 3 4 6 8 9 11 14
y: 1 2 4 4 5 7 8 9
Ans. : y = 0.5454 + 0.6363x
15. In a study between the amount of rainfall and the quantity of air pollution removed the
following data were collected.
Daily Rainfall in 0.01 cm(x) 4.3 4.5 5.9 5.6 6.1 5.2 3.8 2.1
Pollution Removed (mg/m3)(y) 12.6 12.1 11.6 11.8 11.4 11.8 13.2 14.1
Obtain by the method of least square, a relation of the form y = a + bx which best fits to
these observations.
Ans. : y = 15.49 – 0.675x
16. If a curve of the form x = ay2 + by + c satisfies the data.
x: -6 -8 -4 6 22 44 72
y: 0 1 2 3 4 5 6
Find the best values of a, b, c.
Ans. : a = 3, b = - 5, c = - 6
17. Fit a parabola of the form y = ax² + bx + c to the following data using the least square
criterion.
x: 1 2 3 4 5 6 7
y: -5 -2 5 16 31 50 73
Ans. : y = 2x2 – 3x - 4
18. Values of x and y are tabulated as under.
x: 1 1.5 2.0 2.5
y: 25 56.2 100 156
Find the law of the form $x = a y^n$ to satisfy the given data.
[Hint : $x = a y^n$. Taking logarithms we get
log x = log a + n log y
X = C + nY, where log x = X, log a = C, log y = Y
which is a straight line equation].
Ans. : n = 0.5, C = -0.6988, a = 0.2, $x = 0.2 \, y^{0.5}$
19. Find the best values of a, b, c assuming that the following values of x, y are connected
by the relation y = ax2 + bx + c
x: 1 2 3 4 5
y: 3.38 8.25 16.6 28.5 44
Ans. : a = 1.772, b = - 0.383, c = 2.103
20. Fit a parabola y = a + bx + cx2 to the following data.
x: 1 2 3 4 5 6 7 8 9
y: 2 6 7 8 10 11 11 10 9
Ans. : y = 7.4 + 0.85x + 0.1232x2
Illustrative Examples
Example : 1
x y xy x2
0 1 0 0
1 1.8 1.8 1
2 3.3 6.6 4
3 4.5 13.5 9
4 6.3 25.2 16
10 16.9 47.1 30
Here n = 5
16.9 = 5 a + 10 b …(i)
47.1 = 10 a + 30 b …(ii)
Solving (i) and (ii) we get a = 0.72 b = 1.33
Hence the equation of the line of best fit is
y = 0.72 + 1.33x.
Example : 2
141 = 42a + 376b, with $a = \frac{42}{5} b - 3$
$141 = 42 \left(\frac{42}{5} b - 3\right) + 376 b$
$141 = \frac{1764}{5} b - 126 + 376 b$
$267 = \frac{3644}{5} b$, so $b = \frac{267 \times 5}{3644}$
$b = \frac{1335}{3644} = 0.3663556 \approx 0.37$
a = (8.4) (0.37) - 3 = 0.108 ≈ 0.11
x = 0.11 + 0.37 y.
Example : 3
$\sum x = 45$, $\bar{x} = 5$, $\sum y = 74$, $\bar{y} = 8.22$
Solution :
Let X = x - 5 and Y = y - 7, and let the curve of best fit be
Y = a + bX + cX². The normal equations are
$\sum Y = na + b \sum X + c \sum X^2$
$\sum XY = a \sum X + b \sum X^2 + c \sum X^3$
$\sum X^2 Y = a \sum X^2 + b \sum X^3 + c \sum X^4$
 x     y     X     Y     XY    X²     X³     X⁴
 1     2    -4    -5     20    16    -64    256
 2     6    -3    -1      3     9    -27     81
 3     7    -2     0      0     4     -8     16
 4     8    -1     1     -1     1     -1      1
 5    10     0     3      0     0      0      0
 6    11     1     4      4     1      1      1
 7    11     2     4      8     4      8     16
 8    10     3     3      9     9     27     81
 9     9     4     2      8    16     64    256
Σ = 45  74   0    11     51    60      0    708
Also $\sum X^2 Y = -9$.
11 = 9a + 60c
51 = 60b
-9 = 60a + 708c
a = 3, b = 0.85
c = -0.27
Hence the curve of best fit is
Y = 3 + 0.85X - 0.27X²
y - 7 = 3 + 0.85 (x - 5) - 0.27 (x - 5)²
y = 10 + 0.85x - 4.25 - 0.27x² + 2.7x - 6.75
y = -1 + 3.55x - 0.27x²
Example : 4
m(15) + c (6) = -3
Solving the above equations simultaneously.
We get m = 3 and c = -8
Example : 5
Illustrative Examples
Example : 1
Fit an equation of the type y = abx to the following data by the method of least square
technique.
x: 2 3 4 5 6
y: 144 172.8 207.4 248.8 298.5
Solution : Taking natural logarithms, the equation $y = a b^x$ becomes
Log y = Log a + (Log b) x
i.e. $Y = c_0 + c_1 x$
where Y = Log y, $c_0$ = Log a, $c_1$ = Log b.
We have the following table
 x     Y = log y     x²      xY
 2       4.97         4      9.94
 3       5.15         9     15.45
 4       5.33        16     21.32
 5       5.52        25     27.60
 6       5.70        36     34.20
20      26.67        90    108.51
The normal equations are
$\sum Y = n c_0 + c_1 \sum x$
$\sum xY = c_0 \sum x + c_1 \sum x^2$
$26.67 = 5 c_0 + 20 c_1$
$108.51 = 20 c_0 + 90 c_1$
Solving the above equations by Cramer's rule we get
$\frac{c_0}{230.1} = \frac{-c_1}{-9.15} = \frac{1}{50}$
$c_0 = 4.602$, $c_1 = 0.183$
Now, Log a = $c_0$, so $a = e^{c_0} = e^{4.602} = 99.68$
Log b = $c_1$, so $b = e^{c_1} = e^{0.183} = 1.2008$
$y = (99.68) (1.2008)^x$.
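The same log-linear fit can be sketched in code. This is only an illustration of the method above, assuming natural logarithms as in the table; the function name `fit_exponential` is hypothetical. Because the program keeps full precision in log y, its results differ slightly from the hand calculation, which rounds log y to two decimals.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * b**x by least squares on Y = ln y = c0 + c1 * x."""
    n = len(xs)
    Y = [math.log(y) for y in ys]
    sx, sY = sum(xs), sum(Y)
    sxx = sum(x * x for x in xs)
    sxY = sum(x * v for x, v in zip(xs, Y))
    # Cramer's rule on the two normal equations.
    det = n * sxx - sx * sx
    c0 = (sY * sxx - sx * sxY) / det
    c1 = (n * sxY - sx * sY) / det
    return math.exp(c0), math.exp(c1)

a, b = fit_exponential([2, 3, 4, 5, 6], [144, 172.8, 207.4, 248.8, 298.5])
print(round(a, 2), round(b, 4))
```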
Exercise No. 6
Example : 1
x: 1 2 3 4
f: 60 30 20 15
Ans.: a = 84.8, b = - 0.456
Example : 2
1. The arithmetic mean (x ) of following distribution
x 0 1 2 4
f 1 4 3 2
x 1 2 3 4 5
f 2 4 5 3 1
Explanation :
$\sigma = \sqrt{\frac{\sum x^2}{n} - (\bar{x})^2} = \sqrt{\frac{38}{3} - (3.33)^2} = 1.25$
4. For a given distribution, if $\sum fx^2 = 188$, $N = \sum f = 10$ and $\bar{x} = 3.5$, then the value of the
standard deviation ($\sigma$) is given by
(a) 2.56 (b) 4.12 (c) 4.88 (d) 5.13
Ans.: (a)
Explanation :
$\sigma = \sqrt{\frac{\sum fx^2}{N} - (\bar{x})^2} = \sqrt{18.8 - (3.5)^2} = 2.56$
5. For a given distribution, if $\sum fx^2 = 122$, $N = \sum f = 5$ and $\bar{x} = 4$, then the value of the
standard deviation ($\sigma$) is given by
(a) 2 (b) 3.88 (c) 5.13 (d) 2.90
Ans.: (d)
Explanation :
$\sigma = \sqrt{\frac{\sum fx^2}{N} - (\bar{x})^2} = \sqrt{24.4 - 16} = 2.90$
6. The standard deviation of the following frequency distribution is
Wages in rupees 0 – 10 10 – 20 20 – 30
earned per day
No. of labours 5 9 15
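S.D. = $\sigma = h \sqrt{\frac{\sum f_i u_i^2}{\sum f_i} - \left(\frac{\sum f_i u_i}{\sum f_i}\right)^2}$
$= 10 \sqrt{\frac{20}{29} - \left(\frac{10}{29}\right)^2} = 7.55$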
7. The following table gives the marks obtained in a paper of statistics out of 25
Class interval 0–5 5 – 10 10 – 15 15 – 20 20 – 25
No. of students 2 6 8 8 15
The standard deviation (S.D) is ________
(a) 7.29 (b) 7.35 (c) 6.30 (d) 5.75
Ans.: (c)
Explanation :
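S.D. = $\sigma = h \sqrt{\frac{\sum f_i u_i^2}{\sum f_i} - \left(\frac{\sum f_i u_i}{\sum f_i}\right)^2}$, with A = 12.5 and h = 10
Class interval   Middle value x   No. of students f   u = (x - A)/h    fu     fu²
0 - 5                 2.5                2                 -1           -2      2
5 - 10                7.5                6                -0.5          -3     1.5
10 - 15              12.5                8                  0            0      0
15 - 20              17.5                8                 0.5           4      2
20 - 25              22.5               15                  1           15     15
Total                  -                39                  -           14    20.5
∴ S.D. = $\sigma = 10 \sqrt{\frac{20.5}{39} - \left(\frac{14}{39}\right)^2} = 6.30$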
8. From the following data
Team S.D () A.M‒
x
A 2 5
B 2.5 4
The more consistent team is
(a) Team A (b) Team B
(c) Team equally consistent (d) can’t say
Ans.: (a)
Explanation :
We have coefficient of variation (C.V.) = $\frac{\sigma}{\bar{x}} \times 100$
For team A, $(C.V.)_A = \frac{2}{5} \times 100 = 40$
For team B, $(C.V.)_B = \frac{2.5}{4} \times 100 = 62.5$
(C.V.)A = 25
For team B, $(C.V.)_B = \frac{4}{16} \times 100 = 25$
For Saurav, $(C.V.)_C = \frac{11}{42} \times 100 = 26.19$
S.D. = $\sigma = \sqrt{\mu_2} = \sqrt{4} = 2$
Variance = $\sigma^2 = (2)^2 = 4$
18. The second and third moments about the mean of a distribution are 2.83 and 3.38
respectively. The coefficient of skewness $\beta_1$ is equal to
(a) 0.10 (b) 0.50 (c) 0.20 (d) 0.30
Ans.: (b)
Explanation :
$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(3.38)^2}{(2.83)^3} = 0.5040$
19. The first four moments about the working mean 44.5 of a distribution are -0.4, 2.99,
-0.08 and 27.63; then the value of the moment about the mean $\mu_2$ is ________
(a) 2.83 (b) 3.99 (c) 2.2 (d) 5.9
Ans.: (a)
Explanation :
Given A = 44.5
$\mu'_1 = -0.4$, $\mu'_2 = 2.99$, $\mu'_3 = -0.08$, $\mu'_4 = 27.63$
$\mu_2 = \mu'_2 - (\mu'_1)^2 = 2.83$
20. The first three moments about the working mean 44.5 of a distribution are -0.4, 2.99,
-0.08; then the value of the moment about the mean $\mu_3$ is
(a) 2.83 (b) 4.30 (c) 3.38 (d) 30.3
Ans.: (c)
Explanation : Given A = 44.5
$\mu'_1 = -0.4$, $\mu'_2 = 2.99$, $\mu'_3 = -0.08$
$\mu_3 = \mu'_3 - 3 \mu'_2 \mu'_1 + 2 (\mu'_1)^3 = 3.38$
21. The first four moments about the working mean 44.5 of a distribution are -0.4, 2.99,
-0.08 and 27.63; then the value of the moment about the mean $\mu_4$ is ________
(a) 15.19 (b) 30.30 (c) -29.20 (d) 37.99
Ans.: (b)
Explanation : Given A = 44.5
$\mu'_1 = -0.4$, $\mu'_2 = 2.99$, $\mu'_3 = -0.08$, $\mu'_4 = 27.63$
$\mu_4 = \mu'_4 - 4 \mu'_3 \mu'_1 + 6 \mu'_2 (\mu'_1)^2 - 3 (\mu'_1)^4 = 30.30$
22. The first four moments about the working mean 44.5 of a distribution are -0.4, 2.99,
-0.04 and 27.63; then the distribution is ________
(a) Platykurtic (b) Mesokurtic (c) Leptokurtic (d) Equal distribution
Ans.: (c)
29. For the data of a distribution : n = 100, $\sum fd = 50$, $\sum fd^2 = 1970$, $\sum fd^3 = 2948$,
$\sum fd^4 = 86752$ where d = X - 48, the value of the coefficient of skewness $\beta_1$ is ________
(a) 0.23 (b) 0.0000044 (c) -0.20 (d) 0.35
Ans.: (b)
Explanation :
Given : n = 100 and A = 48
$\mu'_1 = \frac{\sum fd}{n} = \frac{50}{100} = 0.5$ ; $\mu'_2 = \frac{1970}{100} = 19.7$
$\mu'_3 = \frac{2948}{100} = 29.48$ ; $\mu'_4 = \frac{86752}{100} = 867.52$
$\mu_2 = \mu'_2 - (\mu'_1)^2 = 19.45$
$\mu_3 = \mu'_3 - 3 \mu'_2 \mu'_1 + 2 (\mu'_1)^3 = 0.18$
$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = 0.0000044$ or $\beta_1 = 4.40 \times 10^{-6}$
30. For the data of a distribution : n = 100, $\sum fd = 50$, $\sum fd^2 = 1970$, $\sum fd^3 = 2948$,
$\sum fd^4 = 86752$ where d = X - 48, the value of the coefficient of kurtosis $\beta_2$ is ________
(a) 3.31 (b) 2.21 (c) 2.71 (d) 3.94
Ans.: (b)
Explanation :
Given : n = 100 and A = 48
$\mu'_1 = \frac{\sum fd}{n} = \frac{50}{100} = 0.5$ ; $\mu'_2 = \frac{1970}{100} = 19.7$
$\mu'_3 = \frac{2948}{100} = 29.48$ ; $\mu'_4 = \frac{86752}{100} = 867.52$
$\mu_2 = \mu'_2 - (\mu'_1)^2 = 19.45$
$\mu_4 = \mu'_4 - 4 \mu'_3 \mu'_1 + 6 \mu'_2 (\mu'_1)^2 - 3 (\mu'_1)^4 = 837.92$
$\beta_2 = \frac{\mu_4}{\mu_2^2} = 2.21$
31. The first four moments about the working mean 30.2 of a distribution are 0.255, 6.222,
30.211 and 400.25; then the central moment $\mu_2$ is ________
(a) 7.32 (b) 1.92 (c) 6.16 (d) 3.62
Ans.: (c)
Explanation :
Given : A = 30.2
$\mu'_1 = 0.255$ ; $\mu'_2 = 6.222$
$\mu'_3 = 30.211$ ; $\mu'_4 = 400.25$
$\mu_2 = \mu'_2 - (\mu'_1)^2 = 6.16$
32. The first four moments about the working mean 30.2 of a distribution are 0.255, 6.222,
30.211 and 400.25; then the central moment $\mu_3$ is ________
(a) 25.48 (b) 11.35 (c) 32.29 (d) 17.32
Ans.: (a)
Explanation:
Given : A = 30.2
$\mu'_1 = 0.255$ ; $\mu'_2 = 6.222$
$\mu'_3 = 30.211$ ; $\mu'_4 = 400.25$
$\mu_3 = \mu'_3 - 3 \mu'_2 \mu'_1 + 2 (\mu'_1)^3 = 25.48$
33. The first four moments about the working mean 30.2 of a distribution are 0.255, 6.222,
30.211 and 400.25; then the central moment $\mu_4$ is ___
(a) 371.85 (b) 341.57 (c) 270.71 (d) 291.53
Ans.: (a)
Explanation:
Given : A = 30.2
$\mu'_1 = 0.255$ ; $\mu'_2 = 6.222$
$\mu'_3 = 30.211$ ; $\mu'_4 = 400.25$
$\mu_4 = \mu'_4 - 4 \mu'_3 \mu'_1 + 6 \mu'_2 (\mu'_1)^2 - 3 (\mu'_1)^4 = 371.85$
34. The first four moments about the working mean 30.2 of a distribution are 0.255, 6.222,
30.211 and 400.25; then the value of the coefficient of skewness $\beta_1$ is ____
(a) 3.31 (b) 0.07 (c) 2.78 (d) 0.0
Ans.: (c)
Explanation :
Given : A = 30.2
$\mu'_1 = 0.255$ ; $\mu'_2 = 6.222$
$\mu'_3 = 30.211$ ; $\mu'_4 = 400.25$
$\mu_2 = \mu'_2 - (\mu'_1)^2 = 6.16$ ;
$\mu_3 = \mu'_3 - 3 \mu'_2 \mu'_1 + 2 (\mu'_1)^3 = 25.48$
$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = 2.78$
35. The first four moments about the working mean 30.2 of a distribution are 0.255, 6.222,
30.211 and 400.25; then the value of the coefficient of kurtosis $\beta_2$ is ________
(a) 2.99 (b) 6.20 (c) 3.51 (d) 9.80
Ans.: (d)
Explanation : Given : A = 30.2
$\mu'_1 = 0.255$ ; $\mu'_2 = 6.222$
$\mu'_3 = 30.211$ ; $\mu'_4 = 400.25$
$\mu_2 = \mu'_2 - (\mu'_1)^2 = 6.16$ ;
$\mu_4 = \mu'_4 - 4 \mu'_3 \mu'_1 + 6 \mu'_2 (\mu'_1)^2 - 3 (\mu'_1)^4 = 371.85$
$\beta_2 = \frac{\mu_4}{\mu_2^2} = 9.80$
36. The first four moments about the working mean of a distribution are 0, 2.5, 0.7 and
18.75; then the moment about the mean $\mu_2$ is ________
(a) 11.5 (b) 12.5 (c) 2.5 (d) 7.5
Ans.: (c)
Explanation :
$\mu'_1 = 0$ ; $\mu'_2 = 2.5$
$\mu'_3 = 0.7$ ; $\mu'_4 = 18.75$
$\mu_2 = \mu'_2 - (\mu'_1)^2 = 2.5$
37. The first four moments about the working mean of a distribution are 0, 2.5, 0.7 and
18.75; then the coefficient of kurtosis $\beta_2$ is ________
(a) 2.87 (b) 3 (c) 0.37 (d) 3.87
Ans.: (b)
Explanation :
$\mu'_1 = 0$ ; $\mu'_2 = 2.5$
$\mu'_3 = 0.7$ ; $\mu'_4 = 18.75$
$\mu_2 = \mu'_2 - (\mu'_1)^2 = 2.5$ ;
$\mu_4 = \mu'_4 - 4 \mu'_3 \mu'_1 + 6 \mu'_2 (\mu'_1)^2 - 3 (\mu'_1)^4 = 18.75$
$\beta_2 = \frac{\mu_4}{\mu_2^2} = 3$
38. The first four moments of a distribution about the value 5 are 2, 20, 40 and 50; then
the value of the central moment $\mu_3$ is ________
(a) 57 (b) 25 (c) -64 (d) 40
Ans.: (c)
Explanation :
$\mu'_1 = 2$ ; $\mu'_2 = 20$
$\mu'_3 = 40$ ; $\mu'_4 = 50$
$\mu_3 = \mu'_3 - 3 \mu'_2 \mu'_1 + 2 (\mu'_1)^3 = 40 - 120 + 16 = -64$
39. The first four moments of a distribution about the value 5 are 2, 20, 40 and 50; then
the value of the central moment $\mu_4$ is ________
(a) 50 (b) 157 (c) 22.39 (d) 162
Ans.: (d)
Explanation :
A = 5
$\mu'_1 = 2$ ; $\mu'_2 = 20$
$\mu'_3 = 40$ ; $\mu'_4 = 50$
$\mu_4 = \mu'_4 - 4 \mu'_3 \mu'_1 + 6 \mu'_2 (\mu'_1)^2 - 3 (\mu'_1)^4 = 162$
40. The first four moments about working mean of the distribution are 0, 2.5, 0.7 and
18.75 then moment about mean 3 is ________
(a) 0.7 (b) 5.7 (c) 0.32 (d) 1.32
Ans.: (a)
Explanation :
ʹ4 = 0 ; ʹ2 = 2.5
ʹ3 = 0.7 ; ʹ4 = 18.75
3 = ʹ3– 3ʹ2ʹ1 + 2 ʹ13 = 0.7
41. The first four moments about working mean of the distribution are 0, 2.5, 0.7
and 18.75 then moment about mean $\mu_4$ is ________
(a) 18.75 (b) 22.35 (c) 17.32 (d) 91.40
Ans.: (a)
Explanation :
$\mu'_1 = 0$ ; $\mu'_2 = 2.5$
$\mu'_3 = 0.7$ ; $\mu'_4 = 18.75$
$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4 = 18.75$
42. The first four moments about working mean of the distribution are 0, 2.5, 0.7 and
18.75 then coefficient of skewness $\beta_1$ is ________
(a) 0.0 (b) 0.51 (c) 0.03 (d) – 0.5
Ans.: (c)
Explanation :
$\mu'_1 = 0$ ; $\mu'_2 = 2.5$
$\mu'_3 = 0.7$ ; $\mu'_4 = 18.75$
$\mu_2 = \mu'_2 - (\mu'_1)^2 = 2.5$ ;
$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3 = 0.7$
$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = 0.03$
43. The first two moments about the working mean 30.2 of a distribution are 0.255,
6.222 then value of S.D. ($\sigma$) is ________
(a) 3 (b) 3.22 (c) 2.48 (d) 5.7
Ans.: (c)
Explanation :
A = 30.2
$\mu'_1 = 0.255$ ; $\mu'_2 = 6.222$
S.D. ($\sigma$) = $\sqrt{\mu_2}$ and $\mu_2 = \mu'_2 - (\mu'_1)^2 = 6.16$
S.D. = 2.48
44. The first two moments about working mean 44.5 of a distribution are – 0.4, 2.99
then the value of the arithmetic mean (A.M.) is ________
(a) 44.1 (b) 42.5 (c) 1.35 (d) 44.5
Ans.: (a)
Explanation :
A = 44.5
$\mu'_1 = -0.4$ ; $\mu'_2 = 2.99$
A.M. = A + $\mu'_1$ = 44.1
45. The first two moments about the working mean 44.5 of a distribution are – 0.4, 2.99
then the value of standard deviation (S.D.) ($\sigma$) is ________
(a) 2.32 (b) 2.87 (c) – 0.32 (d) 1.68
Ans.: (d)
Explanation : A = 44.5
$\mu'_1 = -0.4$ ; $\mu'_2 = 2.99$
S.D. ($\sigma$) = $\sqrt{\mu_2}$ and $\mu_2 = \mu'_2 - (\mu'_1)^2 = 2.83$
S.D. = 1.68
46. The first two moments of a distribution about the value 5 are 2, 20 then A.M. is _____
(a) 16 (b) 7 (c) 4 (d) 13
Ans.: (b)
Explanation :
A = 5
$\mu'_1 = 2$ ; $\mu'_2 = 20$
A.M. = A + $\mu'_1$ = 7
47. The first two moments about the working mean 5 of a distribution are 2, 20 then
standard deviation ($\sigma$) is ________
(a) 7 (b) 9 (c) 4 (d) 16
Ans.: (c)
Explanation :
A = 5;
$\mu'_1 = 2$ ; $\mu'_2 = 20$
$\mu_2 = \mu'_2 - (\mu'_1)^2 = 16$
S.D. ($\sigma$) = $\sqrt{\mu_2}$ = 4
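Questions 43 to 47 all rest on the same two relations, A.M. = A + $\mu'_1$ and $\sigma = \sqrt{\mu'_2 - (\mu'_1)^2}$. As a rough check, the sketch below evaluates both for the three data sets used above; the helper name `mean_and_sd` is a hypothetical choice made for this illustration.

```python
from math import sqrt

def mean_and_sd(A, m1, m2):
    """Return (arithmetic mean, standard deviation) from the first two
    moments m1, m2 taken about a working mean A."""
    mean = A + m1
    variance = m2 - m1**2      # mu_2 = mu'_2 - (mu'_1)^2
    return mean, sqrt(variance)

# (A, mu'_1, mu'_2) as given in Questions 43, 44-45 and 46-47
for A, m1, m2 in [(30.2, 0.255, 6.222), (44.5, -0.4, 2.99), (5, 2, 20)]:
    mean, sd = mean_and_sd(A, m1, m2)
    print(A, round(mean, 3), round(sd, 2))
```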
48. The value of central moment $\mu_2$ of the following distribution is
x 1 2 3 4 5
f 6 15 23 42 62
(a) 2.75 (b) 3.01 (c) – 1.72 (d) 1.34
Ans.: (d)
Explanation : $\mu_2 = \mu'_2 - (\mu'_1)^2$, where
$\mu'_1 = \frac{\sum f_i d_i}{\sum f_i}$ and $\mu'_2 = \frac{\sum f_i d_i^2}{\sum f_i}$, with $d = x - A$ and A = 3.
x      f     d = x – A   $f_i d_i$   $f_i d_i^2$
1      6     –2          –12         24
2      15    –1          –15         15
3      23    0           0           0
4      42    1           42          42
5      62    2           124         248
Total  148               139         329
$\mu'_1 = 0.939$ ; $\mu'_2 = 2.22$
$\mu_2 = \mu'_2 - (\mu'_1)^2 = 1.34$
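The tabular working for Question 48 can be reproduced in a few lines. This is a minimal sketch, assuming the working mean A = 3 used in the explanation; the variable names are illustrative only.

```python
x = [1, 2, 3, 4, 5]
f = [6, 15, 23, 42, 62]
A = 3  # working mean; any other choice of A leads to the same mu2

N = sum(f)
d = [xi - A for xi in x]                              # deviations d = x - A
m1 = sum(fi * di for fi, di in zip(f, d)) / N         # first moment about A
m2 = sum(fi * di**2 for fi, di in zip(f, d)) / N      # second moment about A
mu2 = m2 - m1**2                                      # second central moment
print(N, round(m1, 3), round(m2, 3), round(mu2, 2))
```

Note that the column total $\sum f_i d_i$ is positive (139), which is why $\mu'_1$ is positive.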
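49. For the data : n = 10; $\bar{u} = -5.1$, $\bar{v} = -10$, $\sum u_i v_i = 1242$, $\sum u_i^2 = 1169$, $\sum v_i^2 = 1694$,
the value of coefficient of correlation r is ____
(a) 0.74 (b) 0.92 (c) 0.65 (d) 0.89
Ans.: (b)
Explanation :
n = 10, $\bar{u} = -5.1$, $\bar{v} = -10$, $\sum u_i v_i = 1242$, $\sum u_i^2 = 1169$, $\sum v_i^2 = 1694$
$r = \frac{\operatorname{cov}(u, v)}{\sigma_u \sigma_v}$ and $\operatorname{cov}(u, v) = \frac{1}{n}\sum u_i v_i - \bar{u}\,\bar{v} = 73.2$
$\sigma_u^2 = \frac{1}{n}\sum u_i^2 - \bar{u}^2 = 90.89$ ; $\sigma_u = 9.53$
$\sigma_v^2 = \frac{1}{n}\sum v_i^2 - \bar{v}^2 = 69.4$ ; $\sigma_v = 8.33$
⸫ $r = \frac{73.2}{(9.53)(8.33)} = 0.92$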
50. For a given distribution, if value of cov (x, y) = – 5.2, $\sigma_x$ = 2.82 the value of
regression coefficient $b_{yx}$ is ________
(a) – 0.55 (b) – 0.85 (c) – 0.75 (d) – 0.65
Ans.: (d)
Explanation :
We have $b_{yx} = \frac{\operatorname{cov}(x, y)}{\sigma_x^2} = \frac{-5.2}{(2.82)^2} = -0.65$
51. For a given distribution cov (x, y) = 35.25, $\sigma_y$ = 5.82 the value of regression
coefficient $b_{xy}$ is ____
(a) 1.80 (b) 2.8 (c) 1.04 (d) 2
Ans.: (c)
Explanation :
We have $b_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_y^2} = \frac{35.25}{(5.82)^2} = 1.04$
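52. Given n = 25, $\sum x = 75$, $\sum y = 100$, $\sum x^2 = 250$, $\sum y^2 = 500$, $\sum xy = 325$ then value of
coefficient of correlation r is ___
(a) – 0.7 (b) 0.8 (c) 0.5 (d) 0.9
Ans.: (c)
Explanation :
Given n = 25, $\sum x = 75$, $\sum y = 100$, $\sum x^2 = 250$, $\sum y^2 = 500$, $\sum xy = 325$
$r = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y}$ ;
$\bar{x} = \frac{\sum x}{n} = 3$ ; $\bar{y} = \frac{\sum y}{n} = 4$
$\operatorname{cov}(x, y) = \frac{1}{n}\sum xy - \bar{x}\,\bar{y} = 1$
$\sigma_x^2 = \frac{1}{n}\sum x^2 - \bar{x}^2 = 1$ ; $\sigma_x = 1$
$\sigma_y^2 = \frac{1}{n}\sum y^2 - \bar{y}^2 = 4$ ; $\sigma_y = 2$
⸫ $r = \frac{1}{(1)(2)} = 0.5$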
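Questions 49 and 52 both compute r from summary totals rather than from raw observations. A minimal Python sketch of that calculation follows; `corr_from_summary` is a hypothetical helper name, and it assumes the population (divide by n) definitions of covariance and variance used in the explanations.

```python
from math import sqrt

def corr_from_summary(n, mean_u, mean_v, sum_uv, sum_u2, sum_v2):
    """Correlation coefficient from n, the two means, and the sums
    of products and squares (population formulas, dividing by n)."""
    cov = sum_uv / n - mean_u * mean_v
    sd_u = sqrt(sum_u2 / n - mean_u**2)
    sd_v = sqrt(sum_v2 / n - mean_v**2)
    return cov / (sd_u * sd_v)

# Question 49: the means are given directly
r49 = corr_from_summary(10, -5.1, -10, 1242, 1169, 1694)

# Question 52: the means are first obtained from the sums of x and y
r52 = corr_from_summary(25, 75 / 25, 100 / 25, 325, 250, 500)
print(round(r49, 2), round(r52, 2))
```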
53. The two regression equations of the variables x and y are x = 19.3 – 0.87 y, y = 11.64
– 0.50 x then values of $\bar{x}$ and $\bar{y}$ are ________
(a) $\bar{x}$ = 11.64 ; $\bar{y}$ = 19.3 (b) $\bar{x}$ = 0.50 ; $\bar{y}$ = 0.87
(c) $\bar{x}$ = 16.23 ; $\bar{y}$ = 3.52 (d) $\bar{x}$ = 17.3 ; $\bar{y}$ = 4.50
Ans.: (c)
Explanation : Two regression lines are
x = 19.3 – 0.87 y
y = 11.64 – 0.50 x
⸫ x + 0.87y = 19.3 ; 0.50x + y = 11.64
Since both regression lines pass through ($\bar{x}$, $\bar{y}$), the means satisfy these equations:
$\bar{x} + 0.87\,\bar{y} = 19.3$
$0.50\,\bar{x} + \bar{y} = 11.64$
⸫ $\bar{x} = 16.23$, $\bar{y} = 3.52$
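Because the two regression lines intersect at the point of means, Question 53 reduces to solving two linear equations. The sketch below does this by substitution; the function name `regression_means` and its argument order are assumptions made for illustration.

```python
def regression_means(a1, b1, a2, b2):
    """Solve x = a1 + b1*y and y = a2 + b2*x simultaneously.
    The intersection of the two regression lines is (x_bar, y_bar)."""
    # Substitute y into the first equation: x = a1 + b1*(a2 + b2*x)
    x_bar = (a1 + b1 * a2) / (1 - b1 * b2)
    y_bar = a2 + b2 * x_bar
    return x_bar, y_bar

# x = 19.3 - 0.87 y  and  y = 11.64 - 0.50 x
x_bar, y_bar = regression_means(19.3, -0.87, 11.64, -0.50)
print(round(x_bar, 3), round(y_bar, 3))
```

The computed values agree with option (c) up to rounding in the second decimal place.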
54. The two regression equations of the variables x and y are x = 19.3 – 0.87 y ; y = 11.64
– 0.50 x then correlation coefficient between x and y is________
(a) 0.8 (b) 0.7 (c) – 0.66 (d) + 0.66
Ans.: (c)
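A short derivation of this answer, assuming the standard result that r is the geometric mean of the two regression coefficients and carries their common sign (and reading the first line as x = 19.3 – 0.87 y, as in Question 53):

```latex
b_{xy} = -0.87, \qquad b_{yx} = -0.50
r = -\sqrt{b_{xy}\, b_{yx}} = -\sqrt{(0.87)(0.50)} = -\sqrt{0.435} \approx -0.66
```

The negative root is taken because both regression coefficients are negative.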
⸫ $b_{yx} = 0.8$
And the line of regression of x on y is, from equation (2),
$x = \frac{9}{20}\,y + \frac{107}{20} = 0.45\,y + \frac{107}{20}$
⸫ $b_{xy} = 0.45$
⸫ Correlation coefficient r(x, y):
$r(x, y) = \sqrt{b_{yx}\, b_{xy}} = \sqrt{0.8\,(0.45)}$
r = 0.6
59. The equations of two regression lines obtained in a correlation analysis are 4x – 5y +
33 = 0; 20x – 9y – 107 = 0 and variance of y is 16 then variance of x is ________
(a) 9 (b) 8 (c) 3 (d) 4
Ans.: (a)
Explanation:
Since $\sigma_y^2 = 16$, $\sigma_y = 4$
$\sigma_x = \frac{b_{xy}\,\sigma_y}{r(x, y)} = \frac{(0.45)(4)}{0.6} = 3$
⸫ $\sigma_x^2 = 9$
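The chain of steps above (regression coefficients, then r, then $\sigma_x$) can be sketched in Python as follows. The helper name `sd_x_from_regression` is hypothetical; it assumes the usual identities $r^2 = b_{yx} b_{xy}$ and $b_{xy} = r\,\sigma_x / \sigma_y$, with both coefficients sharing the same sign.

```python
from math import sqrt

def sd_x_from_regression(b_yx, b_xy, sd_y):
    """Given both regression coefficients (same sign) and sigma_y,
    return (r, sigma_x) using r^2 = b_yx*b_xy and b_xy = r*sigma_x/sigma_y."""
    r = sqrt(b_yx * b_xy)
    if b_yx < 0:          # r takes the common sign of the coefficients
        r = -r
    sd_x = b_xy * sd_y / r
    return r, sd_x

# 4x - 5y + 33 = 0   gives y on x with b_yx = 4/5
# 20x - 9y - 107 = 0 gives x on y with b_xy = 9/20
r, sd_x = sd_x_from_regression(4 / 5, 9 / 20, sd_y=4)
var_x = sd_x**2
print(round(r, 2), round(sd_x, 2), round(var_x, 2))
```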
11. The first two moments about the working mean 30.2 of a distribution are 0.255,
6.222 then value of S.D. ($\sigma$) is ________
(a) 3 (b) 3.22 (c) 2.48 (d) 5.7
12. The value of central moment $\mu_2$ of the following distribution is
x 1 2 3 4 5
f 6 15 23 42 62
(a) 2.75 (b) 3.01 (c) – 1.72 (d) 1.34
13. Given n = 25, $\sum x = 75$, $\sum y = 100$, $\sum x^2 = 250$, $\sum y^2 = 500$, $\sum xy = 325$ then value of
coefficient of correlation r is ________
(a) – 0.7 (b) 0.8 (c) 0.5 (d) 0.9
14. Two regression lines are 5y – 8x + 17 = 0 and 2y – 5x + 14 = 0 also $\sigma_y^2 = 16$ then
variance of x is ________
(a) 8 (b) 2 (c) 16 (d) 4
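15. If two lines of regression are 9x + y – $\lambda$ = 0 and 4x + y = $\mu$ and means of x and y are
respectively $\bar{x} = 2$, $\bar{y} = -3$ then values of $\lambda$ and $\mu$ are
(a) $\lambda$ = 15 ; $\mu$ = 5 (b) $\lambda$ = 2 ; $\mu$ = – 3 (c) $\lambda$ = 19 ; $\mu$ = 17 (d) $\lambda$ = 12 ; $\mu$ = 15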
16. The probability of drawing an Ace from a well shuffled pack of cards is ________
(a) 1/52 (b) 3/13 (c) 3/52 (d) 1/13
17. Two dice are thrown. The probability of getting a double is
(a) 1/6 (b) 1/36 (c) 4/36 (d) 1/3
18. If probability of success p = 0.7 then probability of failure q = ________
(a) 0.7 (b) 1.7 (c) – 0.7 (d) 0.3
19. Two dice are thrown at a time. What is the probability of getting 10 points?
(a) 2/3 (b) 1/4 (c) 1/6 (d) 1/12
20. 20% of the bolts produced by a machine are defective. The mean and standard deviation
of the number of defective bolts in a total of 900 bolts are, respectively,
(a) 180 and 12 (b) 12 and 100 (c) 12 and 180 (d) 9 and 0.8
21. Slope of regression line of x on y is
(a) $r\,\frac{\sigma_x}{\sigma_y}$ (b) r(x, y) (c) $\frac{\sigma_x}{\sigma_y}$ (d) $\frac{\sigma_y}{\sigma_x}$
22. In regression line y on x, $b_{yx}$ is given by
(a) cov (x, y) (b) r(x, y) (c) $\frac{\operatorname{cov}(x, y)}{\sigma_x^2}$ (d) $\frac{\operatorname{cov}(x, y)}{\sigma_y^2}$
23. If the two regression coefficients are 0.16 and 4 then the correlation coefficient is
(a) 0.08 (b) –0.8 (c) 0.8 (d) 0.64
24. If the covariance between x and y is 10 and the variances of x and y are 16 and 9
respectively then coefficient of correlation r(x, y) is
(a) 0.833 (b) 0.633 (c) 0.527 (d) 0.745
25. Given the following data r = 0.5, $\sum xy = 350$, $\sigma_x = 1$, $\sigma_y = 4$, $\bar{x} = 68$, $\bar{y} = 62.125$. The
value of n (number of observations) is
(a) 5 (b) 7 (c) 8 (d) 10
Answers Key
1. (c) 2. (a) 3. (a) 4. (a) 5. (c) 6. (a) 7. (a) 8. (b) 9. (a) 10. (b)
11. (c) 12. (d) 13. (c) 14. (d) 15. (a) 16. (d) 17. (a) 18. (d) 19. (d) 20. (a)
21. (a) 22. (c) 23. (c) 24. (a) 25. (a)
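Several entries in the key can be confirmed by direct enumeration or a one-line formula. The sketch below is a rough check only, written for this purpose; it assumes a standard 52-card pack with 4 aces, two fair six-sided dice, and (for Question 20) that each bolt is an independent trial, so the count of defectives is binomial with mean np and standard deviation $\sqrt{npq}$.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# Question 16: 4 aces in a 52-card pack
p_ace = Fraction(4, 52)

# Questions 17 and 19: enumerate all 36 outcomes of two dice
rolls = list(product(range(1, 7), repeat=2))
p_double = Fraction(sum(a == b for a, b in rolls), len(rolls))
p_ten = Fraction(sum(a + b == 10 for a, b in rolls), len(rolls))

# Question 20: binomial model (an assumption), n = 900 and p = 0.2
n, p = 900, 0.2
mean = n * p
sd = sqrt(n * p * (1 - p))

# Question 23: r is the square root of the product of the regression coefficients
r23 = sqrt(0.16 * 4)

# Question 24: r = cov(x, y) / (sigma_x * sigma_y)
r24 = 10 / (sqrt(16) * sqrt(9))

print(p_ace, p_double, p_ten, mean, sd, round(r23, 2), round(r24, 3))
```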