Statistics Concepts and Methods Guide

Uploaded by

2005luciapeter
Copyright
© © All Rights Reserved

Unit

3
Statistics

Syllabus :
Measures of central tendency, Standard deviation, Coefficient of variation, Moments,
Skewness and Kurtosis, Curve fitting : fitting of straight line, parabola and related
curves, Correlation and Regression, Reliability of Regression Estimates.

 Definition :
Statistics is the science which deals with the methods of collecting, classifying,
presenting and comparing numerical data collected to throw light on any sphere of enquiry.
 Variable (or Variate) :
A quantity which can vary from one individual to another is called a variable or variate.
e.g. Heights, weights, ages, wages of persons, rain fall records of cities, Income.
Quantities which can take any numerical value within a certain range are called continuous
variables.
e.g. Height, weight, temperature, time. As a child grows, his/her height takes all
possible values from 50 cm to 100 cm.
Quantities which are incapable of taking all possible values, i.e. which can assume only
particular values, are called discrete or discontinuous variables.
e.g. No. of children in a family, no. of rooms in a house, no. of workers in a factory,
no. of defective products in a lot, the no. of telephone calls on different dates.
Ungrouped data: Data in the form in which they are collected do not give any useful
information and are rather confusing to the mind. Such data are called raw data or ungrouped data.
Gigatech Publishing House
Igniting Minds
Engineering Mathematics - III 3.2 Statistics

Grouped data: Arranging the data in ascending or descending order of magnitude does not
reduce the bulk of the data. To condense the data we divide them into classes or groups;
the result is called grouped data.
Range: Difference between the largest and smallest numbers occurring in the data.
Frequency distribution: It is a tabular arrangement by which a large mass of raw data is
summarized by forming a number of groups or categories.
Exclusive and Inclusive class-intervals: Class-intervals of the type {x : a ≤ x < b} = [a, b)
are called ‘exclusive’ since they exclude the upper limit of the class.
The following data are classified on this basis.
Income (Rs.)      50 - 100   100 - 150   150 - 200   200 - 250   250 - 300
No. of persons       88          70          52          30          23
In this method, the upper limit of one class is the lower limit of the next class.
In this example there are 88 persons whose income is from Rs.50 up to, but not including, Rs.100 (e.g. Rs.99.99).
A person whose income is Rs.100 is included in the class Rs.100 to Rs.150.
Class-intervals of the type {x : a ≤ x ≤ b} = [a, b] are called ‘inclusive’ since they
include the upper limit of the class. The following data are classified on this basis.
Income (Rs.)      50 – 99   100 – 149   150 – 199   200 – 249   250 – 299
No. of persons       60         38          22          16           7
However, to ensure continuity and to get correct class limits, the exclusive method of
classification should be adopted. To convert inclusive class-intervals into exclusive ones we
make an adjustment: half of the gap between the upper limit of one class and the lower limit
of the next is subtracted from every lower limit and added to every upper limit
(e.g. 50 – 99 becomes 49.5 – 99.5).
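The adjustment from inclusive to exclusive class-intervals can be sketched in Python as below. This is only an illustration: the function name `to_exclusive` is hypothetical, and the sketch assumes that the gap between adjacent classes is the same throughout the table.

```python
def to_exclusive(classes):
    """Convert inclusive class limits [(lower, upper), ...] to exclusive class boundaries."""
    # Gap between the upper limit of the first class and the lower limit of
    # the second class; assumed to be the same for all adjacent classes.
    gap = classes[1][0] - classes[0][1]
    half = gap / 2
    # Subtract half the gap from each lower limit, add it to each upper limit.
    return [(lower - half, upper + half) for lower, upper in classes]


# Inclusive income classes from the table above.
inclusive = [(50, 99), (100, 149), (150, 199), (200, 249), (250, 299)]
exclusive = to_exclusive(inclusive)
print(exclusive)
```

After the adjustment the upper boundary of each class coincides with the lower boundary of the next, which is the continuity the exclusive method requires.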
3.1 Measure of Central Tendency :
A figure which is used to represent a whole series should neither have the lowest value
nor the highest in the series, but somewhere between these two limits, possibly in the center,
where most of the items of the series cluster, such figures are called Measures of central
tendency (or Averages).
 There are five types of averages in common use:
1. Arithmetic (Average or) Mean. 2. Median
3. Mode 4. Geometric Mean
5. Harmonic mean

1. Arithmetic Mean:
a) In case of individual observations: (i.e. where frequency is not given)
i) Direct Method: If the variable ‘x’ takes the values x1, x2, x3, …, xn then the A.M.
$\bar{x}$ is given by
$$\bar{x} = \frac{1}{n}(x_1 + x_2 + x_3 + \dots + x_n) = \frac{1}{n}\sum x$$

ii) Short cut method (or shift of origin) : Shifting the origin to an arbitrary point ‘A’,
the A.M. $\bar{x}$ is given by
$$\bar{x} = A + \frac{1}{n}\sum d$$
where deviation d = x - A, n = no. of observations, A = assumed mean.
b) In the case of discrete series: (i.e. where frequency is given)
i) Direct method: The frequency distribution is
x x1 x2 x3 … xn
f f1 f2 f3 … fn

then
$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \dots + f_n x_n}{f_1 + f_2 + f_3 + \dots + f_n} = \frac{1}{N}\sum fx, \quad \text{where } N = f_1 + f_2 + f_3 + \dots + f_n$$
ii) Short cut method (or shift of origin): Shifting the origin to an arbitrary point ‘A’,
the A.M. $\bar{x}$ is given by
$$\bar{x} = A + \frac{1}{N}\sum fd$$
where deviation d = x - A, N = f1 + f2 + f3 + ⋯ + fn, A = assumed mean.


Note : If the frequencies are given in terms of class- intervals the mid values of
class-intervals are considered as ‘x’.
iii) In the case of continuous series having equal class-intervals, say of width ‘h’, we
use a different formula (i.e. shift of origin and change of scale, or step deviation method).
Let $u = \frac{x - A}{h}$; then the A.M. $\bar{x}$ is given by
$$\bar{x} = A + \frac{h}{N}\sum fu$$
where N = f1 + f2 + f3 + ⋯ + fn, A = assumed mean.
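The three ways of computing the arithmetic mean of a frequency distribution (direct, shift of origin, step deviation) can be compared with a short Python sketch. The function names are illustrative only; the mid-values and frequencies are those of Example 2 of this unit, with A = 55 and h = 10.

```python
def mean_direct(x, f):
    """Direct method: (1/N) * sum(f * x)."""
    N = sum(f)
    return sum(fi * xi for xi, fi in zip(x, f)) / N


def mean_shortcut(x, f, A):
    """Shift of origin: A + (1/N) * sum(f * d), with d = x - A."""
    N = sum(f)
    return A + sum(fi * (xi - A) for xi, fi in zip(x, f)) / N


def mean_step_deviation(x, f, A, h):
    """Step deviation: A + (h/N) * sum(f * u), with u = (x - A) / h."""
    N = sum(f)
    return A + h * sum(fi * (xi - A) / h for xi, fi in zip(x, f)) / N


mid_values = [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]
frequencies = [5, 4, 8, 12, 16, 15, 10, 8, 5, 3]

m_direct = mean_direct(mid_values, frequencies)
m_short = mean_shortcut(mid_values, frequencies, A=55)
m_step = mean_step_deviation(mid_values, frequencies, A=55, h=10)
print(round(m_direct, 2), round(m_short, 2), round(m_step, 2))
```

All three methods give the same mean; the choice of A and h only changes how much arithmetic is needed when the calculation is done by hand.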


Weighted Arithmetic mean: If the variate values are not of equal importance, we
may attach to them ‘weight’ w1, w2, w3, … wn as measures of their importance.
The weighted mean $\bar{x}_W$ is defined as
$$\bar{x}_W = \frac{w_1 x_1 + w_2 x_2 + w_3 x_3 + \dots + w_n x_n}{w_1 + w_2 + w_3 + \dots + w_n} = \frac{\sum wx}{\sum w}.$$

Property: (Mean of composite series) : If $\bar{x}_i$ (i = 1, 2, …, k) are the arithmetic means
of ‘k’ distributions with respective frequencies $n_i$ (i = 1, 2, 3, …, k), then the mean $\bar{x}$ of
the whole distribution is given by
$$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \dots + n_k \bar{x}_k}{n_1 + n_2 + \dots + n_k} = \frac{\sum n_i \bar{x}_i}{\sum n_i}$$
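A minimal Python sketch of the composite-mean property follows. The function name `combined_mean` is hypothetical; the figures are those of Example 1 of this unit (80% of employees with mean salary Rs.20,800 and 20% with mean salary Rs.16,800).

```python
def combined_mean(means, sizes):
    """Mean of the whole distribution from group means and group sizes."""
    return sum(n * m for m, n in zip(means, sizes)) / sum(sizes)


# Group means (Rs.) and group sizes expressed as percentages.
overall = combined_mean([20800, 16800], [80, 20])
print(overall)
```

The group sizes act as weights, so the same function also evaluates a weighted arithmetic mean when the weights are passed in place of the sizes.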


2. Median:
i) Median is the measure of central value of the variable when the values are arranged
in ascending or descending order of magnitude.
(Median divides the distribution into two equal parts)
e.g. a) 3, 4, 4, 5, 6, 7, 8: Median = 5.
b) 3, 4, 4, 5, 7, 9, 11, 13, 15, 17: Median = $\frac{7 + 9}{2}$ = 8.
ii) For an ungrouped frequency distribution, let the ‘n’ values of the variate be
arranged in ascending or descending order of magnitude.
a) When n is odd, the middle value, i.e. the $\left(\frac{n+1}{2}\right)^{th}$ value, gives the median.
b) When n is even, there are two middle values, the $\left(\frac{n}{2}\right)^{th}$ and the $\left(\frac{n}{2} + 1\right)^{th}$. The
arithmetic mean of these two values gives the median.
iii) For a grouped frequency distribution the median is given by the formula:
$$\text{Median} = L + \frac{h}{f}\left(\frac{N}{2} - C\right)$$
where L = lower limit of the median class, the median class being the class corresponding
to the cumulative frequency just greater than $\frac{N}{2}$,
h = the width of the median class, f = the frequency of the median class,
C = cumulative frequency of the class preceding the median class, N = ∑f.
iv) For a discrete frequency distribution the median is obtained by considering cumulative
frequencies.
Find $\frac{N+1}{2}$ where N = ∑f, and find the cumulative frequency just greater than $\frac{N+1}{2}$; the
corresponding value of ‘x’ is the median.
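The two median rules for frequency distributions can be sketched in Python as follows. The function names are illustrative; the data are those of Examples 3 and 4 of this unit. One assumption is made explicit in a comment: the discrete rule uses "greater than or equal to" so that it also covers the case where a cumulative frequency equals (N + 1)/2 exactly.

```python
from itertools import accumulate


def median_discrete(x, f):
    """Value of x whose cumulative frequency first reaches (N + 1) / 2."""
    target = (sum(f) + 1) / 2
    for xi, cf in zip(x, accumulate(f)):
        # Assumption: ">=" is used so that a cumulative frequency exactly
        # equal to (N + 1) / 2 also selects the class containing that item.
        if cf >= target:
            return xi


def median_grouped(lower_limits, f, h):
    """Median = L + (h / f) * (N / 2 - C) for classes of equal width h."""
    N = sum(f)
    C = 0  # cumulative frequency of the classes before the current one
    for L, fi in zip(lower_limits, f):
        if C + fi > N / 2:  # first cumulative frequency just greater than N/2
            return L + (h / fi) * (N / 2 - C)
        C += fi


# Example 3: discrete distribution.
md_discrete = median_discrete(range(1, 10), [8, 10, 11, 16, 20, 25, 15, 9, 6])
# Example 4: classes 10-20, 20-30, ..., 70-80.
md_grouped = median_grouped([10, 20, 30, 40, 50, 60, 70],
                            [24, 18, 14, 10, 42, 22, 24], h=10)
print(md_discrete, round(md_grouped, 2))
```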
3. Mode:
i) Mode is the value which occurs most frequently in a set of observations. [The mode
or modal value of the distribution is that value of the variate for which frequency is
maximum].
e.g. i) 2, 3, 3, 3, 5, 7, 7, 9 mode = 3.
ii) 2, 3, 2, 4, 2, 5, 7, 5, 6, 8, 9 mode = 2.
iii) 1, 3, 5, 7, 8, 10 no mode.
iv) 7, 4, 3, 5, 6, 3, 3, 2, 4, 3, 4, 3, 3, 4, 4, 3, 2, 2, 4, 3, 5, 4, 3, 4, 3, 4, 3, 1, 2, 3
mode = 3, as the frequency table shows:
x           1   2   3    4   5   6   7
frequency   1   4   12   9   2   1   1


ii) In case of continuous frequency distribution mode is given by the formula:
$$\text{Mode} = L + h\,\frac{f_m - f_1}{2f_m - f_1 - f_2}$$
where fm = frequency of the modal class, f1 and f2 are the frequencies of the classes
preceding and succeeding the modal class respectively, L = lower limit of the modal
class, h = length of the interval.
iii) Where the mode is ill-defined, i.e. where the method of grouping also fails, its value can
be ascertained by the formula Mode = 3 Median – 2 Mean.
This measure is called the empirical mode. Equivalently,
Mean – Mode = 3 [Mean – Median].
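The grouped-mode formula and the empirical relation can be sketched in Python as below. The function names are hypothetical; the data are those of Example 6 of this unit, using the actual class boundaries 0.5 – 5.5, 5.5 – 10.5, and so on. As an assumption of this sketch, a missing neighbouring class (modal class first or last) is treated as having frequency 0.

```python
def mode_grouped(lower_boundaries, f, h):
    """Mode = L + h * (fm - f1) / (2*fm - f1 - f2) for equal class width h."""
    m = max(range(len(f)), key=f.__getitem__)  # index of the modal class
    fm = f[m]
    f1 = f[m - 1] if m > 0 else 0           # class preceding the modal class
    f2 = f[m + 1] if m < len(f) - 1 else 0  # class succeeding the modal class
    return lower_boundaries[m] + h * (fm - f1) / (2 * fm - f1 - f2)


def empirical_mode(mean, median):
    """Empirical mode: 3 * median - 2 * mean."""
    return 3 * median - 2 * mean


boundaries = [0.5, 5.5, 10.5, 15.5, 20.5, 25.5, 30.5, 35.5, 40.5]
freq = [7, 10, 16, 32, 24, 18, 10, 5, 1]
mode_value = mode_grouped(boundaries, freq, h=5)
print(round(mode_value, 2))
```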
Harmonic Mean: Harmonic mean of a number of observations is the reciprocal of
the arithmetic mean of the reciprocals of the given values. Thus the harmonic mean
H of ‘n’ observations x1, x2, x3, …, xn is
$$H = \frac{1}{\frac{1}{n}\sum \frac{1}{x}} = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \dots + \frac{1}{x_n}}$$
If x1, x2, x3, …, xn (none of them being zero) have the frequencies f1, f2, f3, …, fn
respectively, the harmonic mean is given by
$$H = \frac{1}{\frac{1}{N}\sum \frac{f}{x}} = \frac{N}{\frac{f_1}{x_1} + \frac{f_2}{x_2} + \frac{f_3}{x_3} + \dots + \frac{f_n}{x_n}}, \quad N = \sum f.$$
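As a quick check of the harmonic mean formula, the following Python sketch evaluates it with and without frequencies. The function name is illustrative; the frequency data are those of Example 7 of this unit.

```python
def harmonic_mean(x, f=None):
    """H = N / sum(f / x); every observation has frequency 1 if f is omitted."""
    if f is None:
        f = [1] * len(x)
    return sum(f) / sum(fi / xi for xi, fi in zip(x, f))


h_freq = harmonic_mean([10, 20, 40, 60, 120], [2, 3, 6, 5, 4])
h_plain = harmonic_mean([1, 2, 4])
print(round(h_freq, 1), round(h_plain, 4))
```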
3.2 Measures of Dispersion :
 Dispersion:
The variation or scattering or deviation of the different values of a variable from their
average is known as dispersion. Dispersion indicates the extent to which the values vary
among themselves.
Distribution A    75   85   95   105   115   125
Distribution B    10   20   30    70   180   290
Arithmetic Mean of each distribution is $\frac{600}{6}$ = 100.
In distribution A the values of the variate differ from 100, but the differences are small. In
distribution B the values (or items) are widely scattered and lie far from the mean. Although the
A.M. is the same, the two distributions differ widely from each other in their formation.
The following are the Measures of Dispersion:
i) Range
ii) Quartile deviation or semi inter quartile deviation
iii) Average (or Mean) deviation
iv) Standard deviation

i) Range : Range is the difference between the extreme values of the variate.
Range = L – S where L = largest value, S = smallest value.
$$\text{Co-efficient of Range} = \frac{L - S}{L + S}$$
ii) Average deviation or Mean deviation: If x1, x2, x3, …, xn occur f1, f2, f3, …, fn times
respectively and N = ∑f, the mean deviation from the average A (usually Mean or
Median) is given by
$$\text{Mean Deviation} = \frac{1}{N}\sum f\,|x - A|$$
where | x – A | represents the modulus or the absolute value of the deviation (x - A).
$$\text{Co-efficient of mean deviation} = \frac{\text{Mean Deviation}}{\text{Average from which it is calculated}}$$
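The range and mean-deviation measures can be sketched in Python as follows. The function names are hypothetical; the values are Distributions A and B given above and the data of Example 8 of this unit (median 28).

```python
def coefficient_of_range(values):
    """(L - S) / (L + S), with L the largest and S the smallest value."""
    L, S = max(values), min(values)
    return (L - S) / (L + S)


def mean_deviation(x, f, A):
    """(1/N) * sum(f * |x - A|) about the average A (mean or median)."""
    N = sum(f)
    return sum(fi * abs(xi - A) for xi, fi in zip(x, f)) / N


cr_a = coefficient_of_range([75, 85, 95, 105, 115, 125])
cr_b = coefficient_of_range([10, 20, 30, 70, 180, 290])
md = mean_deviation([5, 15, 25, 35, 45], [5, 8, 15, 16, 6], A=28)
print(cr_a, round(cr_b, 3), md)
```

Both distributions have the same mean of 100, yet the coefficient of range of B is much larger than that of A, which reflects the wider scatter described above.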

iii) Standard deviation (S.D.):


Root Mean Square Deviation (R.M.S.): The r.m.s. deviation, denoted by S, is
defined as the positive square root of the mean of the squares of the deviations from
an arbitrary origin A. Thus
$$S = +\sqrt{\frac{1}{N}\sum f(x - A)^2}$$
When the deviations are taken from the mean $\bar{x}$, the r.m.s. deviation is called the
standard deviation and is denoted by σ. Thus
$$\text{S.D.} = \sigma = \sqrt{\frac{1}{N}\sum f(x - \bar{x})^2}$$
Note: The square of the S.D. (i.e. σ²) is called the Variance.
Variance = (S.D.)² = σ².
 Short- cut methods for calculating standard deviation:
i) Direct Method:
$$\sigma = \sqrt{\frac{1}{N}\sum fx^2 - \left(\frac{1}{N}\sum fx\right)^2}$$
ii) Change of origin:
Let the origin be shifted to an arbitrary point ‘A’ and d = x – A; then
$$\sigma = \sqrt{\frac{1}{N}\sum fd^2 - \left(\frac{1}{N}\sum fd\right)^2}$$
iii) Shift of origin and change of scale (or step deviation method):
Let the origin be shifted to an arbitrary point ‘A’ and the new scale be $\frac{1}{h}$ times the
original scale, i.e. let $u = \frac{x - A}{h}$; then
$$\sigma = h\sqrt{\frac{1}{N}\sum fu^2 - \left(\frac{1}{N}\sum fu\right)^2}$$

 Relation between measures of dispersion:


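i) Quartile deviation = $\frac{2}{3}$ S.D.
ii) Mean deviation = $\frac{4}{5}$ S.D.
These relations hold only approximately, for symmetrical or moderately skewed distributions.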

 Coefficient of Variation:

The ratio of the S.D. to the mean, i.e. $\frac{\sigma}{\bar{x}}$, is known as the coefficient of variation. As this is a ratio
having no dimension, it is used for comparing the variations between two groups with
different means.
$$\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100$$
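A minimal Python sketch of a consistency comparison by the coefficient of variation is given below. The function names are hypothetical; the scores are those of the two batsmen in Example 16 of this unit.

```python
from math import sqrt


def mean_and_sd(values):
    """Mean and standard deviation of individual observations."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum(v * v for v in values) / n - mean ** 2)
    return mean, sd


def coefficient_of_variation(values):
    """C.V. = (sigma / mean) * 100."""
    mean, sd = mean_and_sd(values)
    return 100 * sd / mean


scores_a = [12, 115, 6, 73, 7, 19, 119, 36, 84, 29]
scores_b = [47, 12, 16, 42, 4, 51, 37, 48, 13, 0]
cv_a = coefficient_of_variation(scores_a)
cv_b = coefficient_of_variation(scores_b)
print(round(cv_a, 1), round(cv_b, 1))
```

The group with the smaller coefficient of variation is the more consistent one, even if its mean is lower.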

Illustrative Examples
Example : 1

The mean yearly salary of employees of a company was Rs.20,000 the mean yearly
salaries of male and female employees were Rs.20,800 and Rs.16, 800 respectively. Find out
the percentage of males and females employed by the company.
Solution :
Let P1 and P2 represent the percentages of males and females respectively;
then P1 + P2 = 100.
Mean annual salary of all employees $\bar{x}$ = Rs.20,000
Mean annual salary of male employees $\bar{x}_1$ = Rs.20,800
Mean annual salary of female employees $\bar{x}_2$ = Rs.16,800
Now $\bar{x} = \frac{\bar{x}_1 P_1 + \bar{x}_2 P_2}{100}$
$20{,}000 = \frac{20{,}800\,P_1 + 16{,}800\,P_2}{100}$
208 P1 + 168 P2 = 20,000
26 P1 + 21 P2 = 2500
26 P1 + 21(100 - P1) = 2500
5 P1 = 400
P1 = 80, P2 = 20
Hence the percentages of males and females are 80 and 20 respectively.


Example : 2

Find the mean of the following data:
Marks below        10   20   30   40   50   60   70   80   90   100
No. of students     5    9   17   29   45   60   70   78   83    86
Solution : The given figures are cumulative frequencies; the class frequencies are obtained
by taking successive differences. Taking A = 55 and h = 10, the frequency distribution table
can be written as:
Marks       Mid value (x)    f     u = (x - A)/h     fu
0 – 10            5          5          -5          -25
10 – 20          15          4          -4          -16
20 – 30          25          8          -3          -24
30 – 40          35         12          -2          -24
40 – 50          45         16          -1          -16
50 – 60          55         15           0            0
60 – 70          65         10           1           10
70 – 80          75          8           2           16
80 – 90          85          5           3           15
90 – 100         95          3           4           12
                         ∑f = 86                ∑fu = -52
Mean $\bar{x} = A + \frac{h}{N}\sum fu = 55 + \frac{10}{86}(-52)$ = 48.95 marks.

Example : 3

Obtain the median of the following frequency distribution:


x 1 2 3 4 5 6 7 8 9
f 8 10 11 16 20 25 15 9 6
c.f. 8 18 29 45 65 90 105 114 120
Here N = 120, so $\frac{N+1}{2}$ = 60.5.
The cumulative frequency just greater than $\frac{N+1}{2}$ is 65, and the corresponding value of x
is 5. Hence the median is 5.


Example : 4

Find the median from the following data:


Marks 10-20 20-30 30-40 40-50 50-60 60-70 70-80
No. of students (f) 24 18 14 10 42 22 24
c.f. 24 42 56 66 108 130 154
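N = ∑f = 154, $\frac{N}{2} = \frac{154}{2}$ = 77.
The cumulative frequency just greater than 77 is 108, so the median class is 50 – 60 and
L = 50, h = 10, f = 42, C = 66.
Here Median = $L + \frac{h}{f}\left(\frac{N}{2} - C\right) = 50 + \frac{10}{42}(77 - 66)$ = 52.62.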
Example : 5

Find the median from the following data:


Marks 0-10 10-20 20-30 30-40 40-50 50-60 60-70 70-80
No. of students (f) 15 20 25 24 10 33 71 51
c.f. 15 35 60 84 94 127 198 249
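N = ∑f = 249, $\frac{N}{2} = \frac{249}{2}$ = 124.5.
The cumulative frequency just greater than 124.5 is 127, so the median class is 50 – 60 and
L = 50, h = 10, f = 33, C = 94.
Here Median = $L + \frac{h}{f}\left(\frac{N}{2} - C\right) = 50 + \frac{10}{33}(124.5 - 94)$ = 59.24 marks.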
Example : 6

Find the mode of the following data:


Marks                    1-5   6-10   11-15   16-20   21-25   26-30   31-35   36-40   41-45
No. of candidates (f)     7     10     16      32      24      18      10       5       1
Solution : The greatest frequency, 32, lies in the class interval 16 – 20. Hence the modal class
is 16 – 20. But the actual limits of this class are 15.5 – 20.5.
L = 15.5, fm = 32, f1 = 16, f2 = 24, h = 5
$$\text{Mode} = L + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times h = 15.5 + \frac{32 - 16}{64 - 16 - 24} \times 5 = 18.83$$


Example : 7

Find the Harmonic Mean of the following data:


Marks (x)     f      1/x       f/x
10            2     0.1000    0.2000
20            3     0.0500    0.1500
40            6     0.0250    0.1500
60            5     0.0167    0.0833
120           4     0.0083    0.0333
          N = 20           ∑(f/x) ≈ 0.6167
$$\text{Harmonic Mean} = \frac{N}{\sum \frac{f}{x}} = \frac{20}{0.6167} = 32.4$$

Example : 8

Find the Mean deviation from the median of the following frequency Distribution,
Marks      Mid value (x)     f     c.f.    | x – Md |    f | x – Md |
0-10            5            5       5         23           115
10-20          15            8      13         13           104
20-30          25           15      28          3            45
30-40          35           16      44          7           112
40-50          45            6      50         17           102
                         ∑f = 50                    ∑f | x – Md | = 478
Here $\frac{N}{2} = \frac{50}{2}$ = 25. The cumulative frequency just greater than 25 is 28, i.e.
the median class is 20 – 30.
Median Md = $L + \frac{h}{f}\left(\frac{N}{2} - C\right) = 20 + \frac{10}{15}(25 - 13)$ = 28.
Mean Deviation from Median = $\frac{1}{N}\sum f\,|x - Md| = \frac{478}{50}$ = 9.56.


Example : 9

Find the Mean and Standard Deviation of the following series.


Marks obtained    No. of candidates (f)    Mid-values    u = (x - 47.5)/5     fu      fu²
15 – 20                    2                  17.5             –6            – 12      72
20 – 25                    5                  22.5             –5            – 25     125
25 – 30                    8                  27.5             –4            – 32     128
30 – 35                   11                  32.5             –3            – 33      99
35 – 40                   15                  37.5             –2            – 30      60
40 – 45                   20                  42.5             –1            – 20      20
45 – 50                   20                  47.5              0               0       0
50 – 55                   17                  52.5              1              17      17
55 – 60                   16                  57.5              2              32      64
60 – 65                   13                  62.5              3              39     117
65 – 70                   11                  67.5              4              44     176
70 – 75                    5                  72.5              5              25     125
                       N = 143                                            ∑fu = 5   ∑fu² = 1003
Mean $\bar{x} = A + \frac{h}{N}\sum fu = 47.5 + \frac{5}{143} \times 5$ = 47.7
S.D. = $\sigma_x = h\,\sigma_u = h\sqrt{\frac{1}{N}\sum fu^2 - \left(\frac{1}{N}\sum fu\right)^2} = 5\sqrt{\frac{1003}{143} - \left(\frac{5}{143}\right)^2}$ = 13.2

Example : 10

The following data relate to the ages of a group of government employees.
Calculate the Mean and Standard Deviation of the following series.
Age        No. of Employees (f)    Mid-values    u = (x - 37.5)/5      fu       fu²
20 – 25            170                22.5             -3            – 510     1530
25 – 30            110                27.5             -2            – 220      440
30 – 35             80                32.5             -1             – 80       80
35 – 40             45                37.5              0                0        0
40 – 45             40                42.5              1               40       40
45 – 50             30                47.5              2               60      120
50 – 55             25                52.5              3               75      225
                 N = 500                                         ∑fu = – 635   ∑fu² = 2435
Mean $\bar{x} = A + \frac{h}{N}\sum fu = 37.5 + \frac{5}{500} \times (-635)$ = 31.15
S.D. = $\sigma_x = h\,\sigma_u = h\sqrt{\frac{1}{N}\sum fu^2 - \left(\frac{1}{N}\sum fu\right)^2} = 5\sqrt{\frac{2435}{500} - \left(\frac{-635}{500}\right)^2}$ = 9.0237

Example : 11

Calculate the Mean and Standard Deviation of the following data.


Marks obtained    No. of students (f)    Mid-values    u = (x - 25)/10     fu     fu²
0 – 10                    5                   5              –2           – 10     20
10 – 20                   8                  15              –1            – 8      8
20 – 30                  15                  25               0              0      0
30 – 40                  16                  35               1             16     16
40 – 50                   6                  45               2             12     24
                       N = 50                                          ∑fu = 10   ∑fu² = 68
Mean $\bar{x} = A + \frac{h}{N}\sum fu = 25 + \frac{10}{50} \times 10$ = 27
S.D. = $\sigma_x = h\,\sigma_u = h\sqrt{\frac{1}{N}\sum fu^2 - \left(\frac{1}{N}\sum fu\right)^2} = 10\sqrt{\frac{68}{50} - \left(\frac{10}{50}\right)^2} = 10\sqrt{1.32}$ = 11.489


Example : 12

Calculate the Mean and Standard Deviation of the following data.


Marks obtained    No. of students (f)    Mid-values    u = (x - 25)/10     fu     fu²
0 – 10                   12                   5              –2           – 24     48
10 – 20                  15                  15              –1           – 15     15
20 – 30                  40                  25               0              0      0
30 – 40                  22                  35               1             22     22
40 – 50                  11                  45               2             22     44
                      N = 100                                          ∑fu = 5   ∑fu² = 129
Mean $\bar{x} = A + \frac{h}{N}\sum fu = 25 + \frac{10}{100} \times 5$ = 25.5
S.D. = $\sigma_x = h\,\sigma_u = h\sqrt{\frac{1}{N}\sum fu^2 - \left(\frac{1}{N}\sum fu\right)^2} = 10\sqrt{\frac{129}{100} - \left(\frac{5}{100}\right)^2}$ = 11.346806
Example : 13

Calculate the Mean and Standard Deviation of the following data giving the age
distribution of 542 members.
Age in years    No. of members (f)    Mid-values    u = (x - 55)/10      fu      fu²
20 – 30                 3                 25              –3            – 9       27
30 – 40                61                 35              –2          – 122      244
40 – 50               132                 45              –1          – 132      132
50 – 60               153                 55               0              0        0
60 – 70               140                 65               1            140      140
70 – 80                51                 75               2            102      204
80 – 90                 2                 85               3              6       18
                    N = 542                                       ∑fu = – 15   ∑fu² = 765
Mean $\bar{x} = A + \frac{h}{N}\sum fu = 55 + \frac{10}{542} \times (-15)$ = 54.723247
S.D. = $\sigma_x = h\,\sigma_u = h\sqrt{\frac{1}{N}\sum fu^2 - \left(\frac{1}{N}\sum fu\right)^2} = 10\sqrt{\frac{765}{542} - \left(\frac{-15}{542}\right)^2}$ = 11.877176
Example : 14

Calculate standard deviation for the following frequency distribution.
Decide whether the arithmetic mean is a good average.
Wages in Rs. per day    No. of Labourers (f)    Mid-values    u = (x - 25)/10     fu     fu²
0 – 10                           5                   5              –2           – 10     20
10 – 20                          9                  15              –1            – 9      9
20 – 30                         15                  25               0              0      0
30 – 40                         12                  35               1             12     12
40 – 50                         10                  45               2             20     40
50 – 60                          3                  55               3              9     27
                              N = 54                                          ∑fu = 22   ∑fu² = 108
Mean $\bar{x} = A + \frac{h}{N}\sum fu = 25 + \frac{10}{54} \times 22$ = 29.074
S.D. = $\sigma_x = h\,\sigma_u = h\sqrt{\frac{1}{N}\sum fu^2 - \left(\frac{1}{N}\sum fu\right)^2} = 10\sqrt{\frac{108}{54} - \left(\frac{22}{54}\right)^2}$ = 13.54
S.D. = 13.54 is quite a large value compared with the A.M. = 29.074 (C.V. ≈ 46.6%), i.e. the
wages are widely scattered about the mean. (The mean is distorted by the unusually high wages
of a few labourers compared to the other labourers.) Therefore the A.M. is not a
good average.
Example : 15

Goals scored by two teams A and B in football season are:


No. of goals in a match          0    1    2    3    4
No. of matches, Team A          27    9    8    5    4
No. of matches, Team B          17    9    6    5    3
Find out which team is the more consistent scorer.
Solution : Frequency distribution tables for Team-A and Team-B, taking A = 2:
Team-A
No. of goals (x)    Matches (f)    d = x – 2     fd      fd²
0                       27            –2        – 54     108
1                        9            –1         – 9       9
2                        8             0           0       0
3                        5             1           5       5
4                        4             2           8      16
                     N = 53                  ∑fd = – 50   ∑fd² = 138
Team-B
No. of goals (x)    Matches (f)    d = x – 2     fd      fd²
0                       17            –2        – 34      68
1                        9            –1         – 9       9
2                        6             0           0       0
3                        5             1           5       5
4                        3             2           6      12
                     N = 40                  ∑fd = – 32   ∑fd² = 94
For Team-A:
$\bar{x} = A + \frac{1}{N}\sum fd = 2 + \frac{-50}{53}$ = 1.06
$\sigma_A = \sqrt{\frac{1}{N}\sum fd^2 - \left(\frac{1}{N}\sum fd\right)^2} = \sqrt{\frac{138}{53} - \left(\frac{-50}{53}\right)^2}$ = 1.31
C.V. = $\frac{\sigma_A}{\bar{x}} \times 100 = \frac{1.31}{1.06} \times 100$ = 123.6%
For Team-B:
$\bar{x} = A + \frac{1}{N}\sum fd = 2 + \frac{-32}{40}$ = 1.2
$\sigma_B = \sqrt{\frac{1}{N}\sum fd^2 - \left(\frac{1}{N}\sum fd\right)^2} = \sqrt{\frac{94}{40} - \left(\frac{-32}{40}\right)^2}$ = 1.3
C.V. = $\frac{\sigma_B}{\bar{x}} \times 100 = \frac{1.3}{1.2} \times 100$ = 108.3%
Since (C.V.)B < (C.V.)A, Team-B is more consistent.
Example : 16

The following are scores of two batsmen A and B in a series of innings:


A(x)    B(y)    dA = x - 51    dB = y - 51     dA²      dB²
12       47        -39             -4          1521       16
115      12         64            -39          4096     1521
6        16        -45            -35          2025     1225
73       42         22             -9           484       81
7         4        -44            -47          1936     2209
19       51        -32              0          1024        0
119      37         68            -14          4624      196
36       48        -15             -3           225        9
84       13         33            -38          1089     1444
29        0        -22            -51           484     2601
∑x = 500   ∑y = 270   ∑dA = - 10   ∑dB = - 240   ∑dA² = 17508   ∑dB² = 9302
Find out which batsman is more consistent.
Solution : Taking the assumed mean A = 51 for both batsmen and n = 10,
$\bar{x} = A + \frac{1}{n}\sum d_A = 51 - \frac{10}{10}$ = 50
$\bar{y} = A + \frac{1}{n}\sum d_B = 51 - \frac{240}{10}$ = 27
$\sigma_A = \sqrt{\frac{1}{n}\sum d_A^2 - \left(\frac{1}{n}\sum d_A\right)^2} = \sqrt{\frac{1}{10}(17508) - \left(\frac{-10}{10}\right)^2}$ = 41.83, C.V. = $\frac{41.83}{50} \times 100$ = 83.6%
$\sigma_B = \sqrt{\frac{1}{n}\sum d_B^2 - \left(\frac{1}{n}\sum d_B\right)^2} = \sqrt{\frac{1}{10}(9302) - \left(\frac{-240}{10}\right)^2}$ = 18.82, C.V. = $\frac{18.82}{27} \times 100$ = 69.7%
A.M. of A > A.M. of B, but (C.V.)B < (C.V.)A; hence B is more consistent.
3.3 Moments, Skewness, Kurtosis :
 Moments:
The rth moment about any point A is denoted by $\mu_r'$ and is defined as
$$\mu_r' = \frac{1}{N}\sum f(x - A)^r, \quad \text{where } N = \sum f.$$
Putting r = 0, 1, 2, 3, 4, … etc. we get
$$\mu_0' = 1, \qquad \mu_1' = \frac{1}{N}\sum f(x - A) = \frac{1}{N}\sum fx - A, \ \text{or} \ \mu_1' = \bar{x} - A,$$
$$\mu_2' = \frac{1}{N}\sum f(x - A)^2 = S^2, \ \text{the mean square deviation,}$$
$$\mu_3' = \frac{1}{N}\sum f(x - A)^3, \ \text{and so on.}$$
The rth moment about the mean $\bar{x}$ of a distribution is denoted by $\mu_r$ and is given by
$$\mu_r = \frac{1}{N}\sum f(x - \bar{x})^r, \quad \text{where } \bar{x} \text{ is the arithmetic mean of the distribution.}$$
Putting r = 0, 1, 2, 3, 4, … etc. we get
$$\mu_0 = \frac{1}{N}\sum f = 1, \qquad \mu_1 = \frac{1}{N}\sum f(x - \bar{x}) = 0,$$
$$\mu_2 = \frac{1}{N}\sum f(x - \bar{x})^2, \ \text{which gives the variance of the distribution,}$$
$$\mu_3 = \frac{1}{N}\sum f(x - \bar{x})^3, \ \text{which gives the third moment of the distribution about the}$$
mean and so on.
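The definitions of the moments about an arbitrary point and about the mean can be sketched in Python as follows. The function names are illustrative; the data are the symmetric distribution of Example 3 of this section, with A = 4.

```python
def moment(x, f, r, about):
    """r-th moment about a point: (1/N) * sum(f * (x - about)^r)."""
    N = sum(f)
    return sum(fi * (xi - about) ** r for xi, fi in zip(x, f)) / N


def central_moment(x, f, r):
    """r-th moment about the mean of the distribution."""
    mean = moment(x, f, 1, 0)  # first moment about the origin is the mean
    return moment(x, f, r, mean)


x = list(range(9))
f = [1, 8, 28, 56, 70, 56, 28, 8, 1]
raw = [moment(x, f, r, about=4) for r in (1, 2, 3, 4)]
central = [central_moment(x, f, r) for r in (1, 2, 3, 4)]
print(raw, central)
```

Here the mean of the distribution is 4, so the moments about A = 4 coincide with the moments about the mean.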


 Relation between r and r′ :
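The value of $\mu_r'$ can be calculated with much less calculation than $\mu_r$ by
selecting an appropriate A; we have seen this in the calculation of the Arithmetic Mean (A.M.) and
Standard Deviation (S.D.). Hence we express $\mu_r$ in terms of $\mu_r'$.
By definition,
$$\mu_r = \frac{1}{N}\sum f(x - \bar{x})^r = \frac{1}{N}\sum f(x - A + A - \bar{x})^r.$$
Let d = x – A; then
$$\bar{d} = \frac{1}{N}\sum fd = \frac{1}{N}\sum fx - \frac{A}{N}\sum f, \ \text{or} \ \bar{d} = \bar{x} - A = \mu_1'.$$
Thus $\mu_r = \frac{1}{N}\sum f(d - \bar{d})^r$. Expanding $(d - \bar{d})^r$ binomially we get
$$\mu_r = \frac{1}{N}\sum f\left(d^r - {}^{r}C_1\, d^{r-1}\bar{d} + {}^{r}C_2\, d^{r-2}(\bar{d})^2 - \dots + (-1)^r (\bar{d})^r\right)$$
$$\mu_r = \mu_r' - {}^{r}C_1\, \mu_{r-1}'\, \mu_1' + {}^{r}C_2\, \mu_{r-2}'\, (\mu_1')^2 - \dots + (-1)^r (\mu_1')^r$$
where $\frac{1}{N}\sum fd^r = \mu_r'$ and $\frac{1}{N}\sum fd = \bar{d} = \mu_1'$. We have seen that $\mu_0 = 1$, $\mu_1 = 0$;
by putting r = 2, 3, 4, … etc. we get
$$\mu_2 = \mu_2' - (\mu_1')^2$$
$$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3$$
$$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4.$$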
 Skewness:
Skewness signifies departure from symmetry. We study skewness to have an idea about
the shape of the curve which we draw with the given data.

If the frequency curve stretches to the right as in fig.(a), i.e. the mean is to the right of the
mode, then the distribution is right skewed or is said to have positive skewness.
If the curve stretches to the left, i.e. the mode is to the right of the mean, then the distribution is
said to have negative skewness.
The different measures of skewness are:
i) Pearson’s coefficient of skewness = $\frac{3(\text{mean} - \text{median})}{\text{standard deviation}}$
ii) Coefficient of skewness $\beta_1 = \frac{\mu_3^2}{\mu_2^3}$
 Kurtosis:
To get a complete idea of the distribution, in addition to the knowledge of mean, dispersion
and skewness, we should have an idea of the flatness or peakedness of the curve. It is
measured by the coefficient $\beta_2$, given by $\beta_2 = \frac{\mu_4}{\mu_2^2}$.
The curve of fig.(a), which is neither flat nor peaked, is called the normal curve or
Mesokurtic curve. γ2 = β2 - 3 gives the excess of kurtosis. For a normal distribution β2 = 3 and
the excess is zero. The curve of fig.(c), which is flatter than the normal curve, is called
Platykurtic and that of fig.(b), which is more peaked, is called Leptokurtic. For Platykurtic
curves β2 < 3. For Leptokurtic curves β2 > 3.
(Skewness: Measures the degree of asymmetric or the departure from symmetry.
Kurtosis: Measures the degree of Peakedness of a distribution.)
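The conversion from moments about an arbitrary point to central moments, followed by the coefficients β1 and β2, can be sketched in Python as below. The function names are hypothetical; the input values are the moments about the value 5 used in Example 1 that follows (-4, 22, -117 and 560).

```python
def central_from_raw(m1, m2, m3, m4):
    """Central moments (mu2, mu3, mu4) from the moments m1..m4 about a point A."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4


def beta_coefficients(mu2, mu3, mu4):
    """beta1 = mu3^2 / mu2^3 (skewness), beta2 = mu4 / mu2^2 (kurtosis)."""
    return mu3 ** 2 / mu2 ** 3, mu4 / mu2 ** 2


mu2, mu3, mu4 = central_from_raw(-4, 22, -117, 560)
beta1, beta2 = beta_coefficients(mu2, mu3, mu4)
print(mu2, mu3, mu4, round(beta1, 4), round(beta2, 4))
```

Since β2 - 3 is the excess of kurtosis, a value of β2 below 3, as obtained here, corresponds to a platykurtic curve.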

Illustrative Examples

Example : 1

If the first four moments of a distribution about the value 5 are equal to -4, 22, -117 and
560, determine the central moments, β1 and β2.
Solution :
The first four moments about the arbitrary origin 5 are
$\mu_1' = -4, \ \mu_2' = 22, \ \mu_3' = -117, \ \mu_4' = 560$
$\mu_1' = \frac{1}{N}\sum f(x - 5) = \frac{1}{N}\sum fx - 5 = \bar{x} - 5$
∴ Mean = $\bar{x} = \mu_1' + 5$ = - 4 + 5 = 1
$\mu_2 = \mu_2' - (\mu_1')^2 = 22 - (-4)^2 = 6$
$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 = -117 - 3(22)(-4) + 2(-4)^3 = 19$
$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4 = 560 - 4(-117)(-4) + 6(22)(-4)^2 - 3(-4)^4 = 32$
Coefficient of skewness = $\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(19)^2}{(6)^3}$ = 1.6713
Kurtosis = $\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{32}{(6)^2}$ = 0.8889.
2

Example : 2

Calculate the first four moments about the mean of the given distribution. Also find
skewness (β1) and kurtosis (β2).
x    2.0   2.5   3.0   3.5   4.0   4.5   5.0
f     4    36    60    90    70    40    10
Solution :
Taking A = 3.5 and $u = \frac{x - A}{h} = \frac{x - 3.5}{0.5}$,
we prepare the table for calculating $\mu_1', \mu_2', \mu_3', \mu_4'$, β1, β2.
x        f     u = (x - 3.5)/0.5     fu      fu²      fu³      fu⁴
2.0      4           –3            – 12      36     – 108      324
2.5     36           –2            – 72     144     – 288      576
3.0     60           –1            – 60      60      – 60       60
3.5     90            0               0       0         0        0
4.0     70            1              70      70        70       70
4.5     40            2              80     160       320      640
5.0     10            3              30      90       270      810
N = ∑f = 310                  ∑fu = 36   ∑fu² = 560   ∑fu³ = 204   ∑fu⁴ = 2480
Here $\mu_r' = h^r\,\frac{\sum fu^r}{\sum f} = \frac{h^r}{N}\sum fu^r$.
Now, $\mu_1' = \frac{h}{N}\sum fu = \frac{0.5}{310} \times 36$ = 0.058064
$\mu_2' = \frac{h^2}{N}\sum fu^2 = \frac{(0.5)^2}{310} \times 560$ = 0.451612
$\mu_3' = \frac{h^3}{N}\sum fu^3 = \frac{(0.5)^3}{310} \times 204$ = 0.082258
$\mu_4' = \frac{h^4}{N}\sum fu^4 = \frac{(0.5)^4}{310} \times 2480$ = 0.5
∴ $\mu_1 = 0$,
$\mu_2 = \mu_2' - (\mu_1')^2 = 0.451612 - (0.058064)^2$ = 0.44824
$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 = 0.082258 - 3(0.451612)(0.058064) + 2(0.058064)^3$
= 0.0039826
$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4$
$= 0.5 - 4(0.082258)(0.058064) + 6(0.451612)(0.058064)^2 - 3(0.058064)^4$
$\mu_4$ = 0.48999.
Coefficient of skewness = $\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(0.0039826)^2}{(0.44824)^3} \approx 1.76 \times 10^{-4}$.
Kurtosis = $\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{0.48999}{(0.44824)^2}$ = 2.43874.

Example : 3

Calculate the first four moments of the following distribution about the mean and hence
find skewness (β1) and kurtosis (β2).
x    0   1    2    3    4    5    6   7   8
f    1   8   28   56   70   56   28   8   1
Solution : First we calculate the moments about the assumed mean A = 4.
x     f     d = x - 4     fd      fd²     fd³     fd⁴
0     1        -4         -4      16      -64     256
1     8        -3        -24      72     -216     648
2    28        -2        -56     112     -224     448
3    56        -1        -56      56      -56      56
4    70         0          0       0        0       0
5    56         1         56      56       56      56
6    28         2         56     112      224     448
7     8         3         24      72      216     648
8     1         4          4      16       64     256
N = 256               ∑fd = 0  ∑fd² = 512  ∑fd³ = 0  ∑fd⁴ = 2816
We know that $\mu_r' = \frac{1}{N}\sum f(x - A)^r = \frac{1}{N}\sum fd^r$.
$\mu_1' = \frac{1}{N}\sum fd = 0$
$\mu_2' = \frac{1}{N}\sum fd^2 = \frac{512}{256} = 2$
$\mu_3' = \frac{1}{N}\sum fd^3 = 0$
$\mu_4' = \frac{1}{N}\sum fd^4 = \frac{2816}{256} = 11$.
By using the relation between $\mu_r$ and $\mu_r'$ we find that the four moments about the mean are
$\mu_1 = 0$,
$\mu_2 = \mu_2' - (\mu_1')^2 = 2 - 0 = 2$
$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 = 0 - (3)(2)(0) + 2(0) = 0$
$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4 = 11 - (4)(0)(0) + (6)(2)(0) - (3)(0) = 11$
Coefficient of skewness = $\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(0)^2}{(2)^3} = 0$
Kurtosis = $\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{11}{(2)^2} = 2.75$
Example : 4

The first three moments of a distribution about the value 2 are 1, 16 and
-40. Find the mean, standard deviation and skewness of the distribution.
Solution : The first three moments about the arbitrary origin 2 are
$\mu_1' = 1, \ \mu_2' = 16, \ \mu_3' = -40$.
$\mu_1' = \frac{1}{N}\sum f(x - 2) = \frac{1}{N}\sum fx - 2 = \bar{x} - 2$
∴ Mean = $\bar{x} = \mu_1' + 2$ = 1 + 2 = 3
$\mu_2 = \mu_2' - (\mu_1')^2 = 16 - (1)^2 = 15$
∴ S.D. = σ = $\sqrt{15}$ = 3.873
$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 = -40 - 3(16)(1) + 2(1)^3 = -86$
Coefficient of skewness = $\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(-86)^2}{(15)^3}$ = 2.19.


Example : 5

The first four moments about the working mean 30.2 of a distribution are 0.255, 6.222,
30.211 and 440.25. Calculate the moments about the mean.
Also evaluate skewness (β1) and kurtosis (β2), and comment upon the skewness and
kurtosis of the distribution.
Solution : The first four moments about the arbitrary origin 30.2 are
$\mu_1' = 0.255, \ \mu_2' = 6.222, \ \mu_3' = 30.211, \ \mu_4' = 440.25$
$\mu_1' = \frac{1}{N}\sum f(x - 30.2) = \frac{1}{N}\sum fx - 30.2 = \bar{x} - 30.2$
∴ Mean = $\bar{x} = \mu_1' + 30.2$ = 0.255 + 30.2 = 30.455
$\mu_2 = \mu_2' - (\mu_1')^2 = 6.222 - (0.255)^2$ = 6.15698
$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 = 30.211 - 3(6.222)(0.255) + 2(0.255)^3$ = 25.48433
$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4$
$= 440.25 - 4(30.211)(0.255) + 6(6.222)(0.255)^2 - 3(0.255)^4$
$\mu_4$ = 411.8496.
Coefficient of skewness = $\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(25.48433)^2}{(6.15698)^3}$ = 2.78255
Kurtosis = $\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{411.8496}{(6.15698)^2}$ = 10.86434.
$\gamma_1 = \sqrt{\beta_1}$ = 1.6681
This indicates considerable skewness of the distribution. γ2 = β2 – 3 = 7.86434
This shows that the distribution is leptokurtic (because β2 > 3).
3.4 Correlation :
If the change in one variable affects a change in the other variable the variables are said
to be correlated and the relation between them is called correlation.
If the two variables deviate in the same direction i.e. If the increase (or decrease) in one
results in a corresponding increase (or decrease) in the other, correlation is said to be direct or
positive.
e.g. The correlation between income and expenditure is positive.
If the two variables deviate in opposite directions, i.e. if the increase (or decrease) in one results in
a corresponding decrease (or increase) in the other, correlation is said to be inverse or negative.
e.g. i) The correlation between price and demand is negative.
ii) The correlation between volume and the pressure of a perfect gas is negative.
If the deviation in one variable is followed by a corresponding proportional deviation in
the other, the correlation is said to be perfect.
3.5 Karl Pearson’s Coefficient of Correlation:
(Or Product Moment Correlation Coefficient):
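Correlation coefficient between two variables x and y, usually denoted by r(x, y) or $r_{xy}$, is
a numerical measure of the relationship between them and is defined as:
$$r_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}} = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}}, \quad X = x - \bar{x}, \ Y = y - \bar{y}$$
$$r_{xy} = \frac{\text{cov}(x, y)}{\sigma_x \sigma_y}$$
where
$$\text{cov}(x, y) = \frac{1}{n}\sum xy - \bar{x}\,\bar{y}, \qquad \sigma_x = \sqrt{\frac{1}{n}\sum x^2 - (\bar{x})^2}, \qquad \sigma_y = \sqrt{\frac{1}{n}\sum y^2 - (\bar{y})^2},$$
$$\bar{x} = \frac{1}{n}\sum x, \qquad \bar{y} = \frac{1}{n}\sum y, \qquad n = \text{no. of values (or entries).}$$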
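The product moment formula can be sketched in Python as follows, using deviations from the means. The function name `pearson_r` is illustrative; the data are those of Example 1 of this section.

```python
from math import sqrt


def pearson_r(x, y):
    """r = sum(XY) / sqrt(sum(X^2) * sum(Y^2)), X and Y being deviations from the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sum_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sum_xx = sum((a - x_bar) ** 2 for a in x)
    sum_yy = sum((b - y_bar) ** 2 for b in y)
    return sum_xy / sqrt(sum_xx * sum_yy)


r = pearson_r([10, 14, 18, 22, 26, 30], [18, 12, 24, 6, 30, 36])
print(round(r, 4))
```

A value of r lies between -1 and +1; values near the ends of this interval indicate a strong linear relationship and the sign gives its direction.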
 Correlation formulae:
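i) Frequency is not given: put u = x - a, v = y - b, n = no. of values (or entries); then
$$\bar{u} = \frac{1}{n}\sum u, \qquad \bar{v} = \frac{1}{n}\sum v,$$
$$\sigma_u^2 = \frac{1}{n}\sum u^2 - (\bar{u})^2, \qquad \sigma_v^2 = \frac{1}{n}\sum v^2 - (\bar{v})^2,$$
$$\text{cov}(u, v) = \frac{1}{n}\sum uv - \bar{u}\,\bar{v}, \qquad r = \frac{\text{cov}(u, v)}{\sigma_u \sigma_v}.$$
ii) Frequency is given: sum of the frequencies = N = ∑f. Put u = x - a, v = y – b; then
$$\bar{u} = \frac{1}{N}\sum fu, \qquad \bar{v} = \frac{1}{N}\sum fv,$$
$$\sigma_u^2 = \frac{1}{N}\sum fu^2 - (\bar{u})^2, \qquad \sigma_v^2 = \frac{1}{N}\sum fv^2 - (\bar{v})^2,$$
$$\text{cov}(u, v) = \frac{1}{N}\sum fuv - \bar{u}\,\bar{v}, \qquad r = \frac{\text{cov}(u, v)}{\sigma_u \sigma_v}.$$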
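When only the sums of the deviations from assumed means are available, formula (i) can be evaluated directly, as in the Python sketch below. The function name `r_from_sums` is hypothetical; the two calls use the sums of Example 3 (assumed means 100 and 50) and Example 4 of this section.

```python
from math import sqrt


def r_from_sums(n, sum_u, sum_v, sum_uu, sum_vv, sum_uv):
    """Correlation coefficient from sums of u = x - a and v = y - b."""
    u_bar = sum_u / n
    v_bar = sum_v / n
    cov = sum_uv / n - u_bar * v_bar
    sigma_u = sqrt(sum_uu / n - u_bar ** 2)
    sigma_v = sqrt(sum_vv / n - v_bar ** 2)
    return cov / (sigma_u * sigma_v)


r3 = r_from_sums(12, 96, 84, 1128, 1380, 312)   # sums from Example 3
r4 = r_from_sums(6, -3, 20, 19, 850, -120)      # sums from Example 4
print(round(r3, 3), round(r4, 4))
```

Because r is unaffected by a change of origin, the assumed means a and b can be chosen purely for arithmetic convenience.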


Illustrative Examples

Example : 1

Find the coefficient of correlation for the following data:


x   10   14   18   22   26   30
y   18   12   24    6   30   36
Solution : We construct a table as follows :
x       y      $X = x - \bar{x}$    $Y = y - \bar{y}$     X²      Y²      XY
10      18         -10                 -3             100       9      30
14      12          -6                 -9              36      81      54
18      24          -2                  3               4       9      -6
22       6           2                -15               4     225     -30
26      30           6                  9              36      81      54
30      36          10                 15             100     225     150
∑x = 120  ∑y = 126   ∑X = 0           ∑Y = 0       ∑X² = 280  ∑Y² = 630  ∑XY = 252
where $\bar{x} = \frac{1}{n}\sum x = \frac{120}{6} = 20$ and $\bar{y} = \frac{1}{n}\sum y = \frac{126}{6} = 21$.
$$r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}} = \frac{252}{\sqrt{280 \times 630}} = 0.6$$

Example : 2

Calculate Karl Pearson’s coefficient of correlation between price and supply of a
commodity from the following data.
Price (Rs.) (x)    17   18   19   20   21   22   23   24   25   26
Supply (kg) (y)    38   37   38   33   32   33   34   29   26   23
Solution : We construct a table as follows:
x       y      $X = x - \bar{x}$    $Y = y - \bar{y}$      X²        Y²        XY
17      38        -4.5                5.7             20.25     32.49    -25.65
18      37        -3.5                4.7             12.25     22.09    -16.45
19      38        -2.5                5.7              6.25     32.49    -14.25
20      33        -1.5                0.7              2.25      0.49     -1.05
21      32        -0.5               -0.3              0.25      0.09      0.15
22      33         0.5                0.7              0.25      0.49      0.35
23      34         1.5                1.7              2.25      2.89      2.55
24      29         2.5               -3.3              6.25     10.89     -8.25
25      26         3.5               -6.3             12.25     39.69    -22.05
26      23         4.5               -9.3             20.25     86.49    -41.85
∑x = 215  ∑y = 323  ∑X = 0          ∑Y = 0         ∑X² = 82.5  ∑Y² = 228.1  ∑XY = -126.5
where $\bar{x} = \frac{1}{n}\sum x = \frac{215}{10} = 21.5$ and $\bar{y} = \frac{1}{n}\sum y = \frac{323}{10} = 32.3$.
$$r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}} = \frac{-126.5}{\sqrt{82.5 \times 228.1}} = -0.9221$$

Example : 3

Calculate karl pearson’s coefficient of correlation from the following data, taking 100
and 50 as assumed averages of x and y respectively.
x 104 111 104 114 118 117 105 108 106 100 104 105
y 57 55 47 45 45 50 64 63 66 62 69 61
Solution :
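We construct a table as follows:

| x | y | X = x - 100 | Y = y - 50 | $X^2$ | $Y^2$ | XY |
|---|---|---|---|---|---|---|
| 104 | 57 | 4 | 7 | 16 | 49 | 28 |
| 111 | 55 | 11 | 5 | 121 | 25 | 55 |
| 104 | 47 | 4 | -3 | 16 | 9 | -12 |
| 114 | 45 | 14 | -5 | 196 | 25 | -70 |
| 118 | 45 | 18 | -5 | 324 | 25 | -90 |
| 117 | 50 | 17 | 0 | 289 | 0 | 0 |
| 105 | 64 | 5 | 14 | 25 | 196 | 70 |
| 108 | 63 | 8 | 13 | 64 | 169 | 104 |
| 106 | 66 | 6 | 16 | 36 | 256 | 96 |
| 100 | 62 | 0 | 12 | 0 | 144 | 0 |
| 104 | 69 | 4 | 19 | 16 | 361 | 76 |
| 105 | 61 | 5 | 11 | 25 | 121 | 55 |
| 1296 | 684 | 96 | 84 | 1128 | 1380 | 312 |

Karl Pearson's coefficient of correlation is computed from the deviations X and Y:
$$\bar{X} = \frac{1}{n}\sum X = \frac{96}{12} = 8, \qquad \bar{Y} = \frac{1}{n}\sum Y = \frac{84}{12} = 7,$$
$$\operatorname{cov}(X, Y) = \frac{1}{n}\sum XY - \bar{X}\,\bar{Y} = \frac{1}{12}(312) - (8)(7) = 26 - 56 = -30,$$
$$\sigma_X = \sqrt{\frac{1}{n}\sum X^2 - (\bar{X})^2} = \sqrt{\frac{1}{12}(1128) - (8)^2} = \sqrt{94 - 64} = \sqrt{30} = 5.48,$$
$$\sigma_Y = \sqrt{\frac{1}{n}\sum Y^2 - (\bar{Y})^2} = \sqrt{\frac{1}{12}(1380) - (7)^2} = \sqrt{115 - 49} = \sqrt{66} = 8.124,$$
$$r = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{-30}{(5.48)(8.124)} = \frac{-30}{44.52} = -0.674.$$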

Example : 4

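Given n = 6, ∑(x - 18.5) = -3, ∑(y - 50) = 20, ∑(x - 18.5)² = 19, ∑(y - 50)² = 850, ∑(x - 18.5)(y - 50) = -120.
Calculate the coefficient of correlation.
Solution : We have u = x - 18.5, v = y - 50, so that $\bar{u} = -3/6 = -0.5$ and $\bar{v} = 20/6 = 3.33$. Then
$$r = \frac{\frac{1}{n}\sum uv - \bar{u}\,\bar{v}}{\sqrt{\frac{1}{n}\sum u^2 - (\bar{u})^2}\,\sqrt{\frac{1}{n}\sum v^2 - (\bar{v})^2}} = \frac{-20 + 1.667}{\sqrt{2.9167 \times 130.556}} = -0.9395$$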


Example : 5

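From a group of 10 students, the marks obtained by each in papers of mathematics and applied mechanics are given below. Find the coefficient of correlation.

| Marks in Mathematics (x) | 23 | 28 | 42 | 17 | 26 | 35 | 29 | 37 | 16 | 46 |
|---|---|---|---|---|---|---|---|---|---|---|
| Marks in Applied Mechanics (y) | 25 | 22 | 38 | 21 | 27 | 39 | 24 | 32 | 18 | 44 |

Solution : We construct a table as follows:

| x | y | $X = x - \bar{x}$ | $Y = y - \bar{y}$ | $X^2$ | $Y^2$ | XY |
|---|---|---|---|---|---|---|
| 23 | 25 | -6.9 | -4 | 47.61 | 16 | 27.6 |
| 28 | 22 | -1.9 | -7 | 3.61 | 49 | 13.3 |
| 42 | 38 | 12.1 | 9 | 146.41 | 81 | 108.9 |
| 17 | 21 | -12.9 | -8 | 166.41 | 64 | 103.2 |
| 26 | 27 | -3.9 | -2 | 15.21 | 4 | 7.8 |
| 35 | 39 | 5.1 | 10 | 26.01 | 100 | 51.0 |
| 29 | 24 | -0.9 | -5 | 0.81 | 25 | 4.5 |
| 37 | 32 | 7.1 | 3 | 50.41 | 9 | 21.3 |
| 16 | 18 | -13.9 | -11 | 193.21 | 121 | 152.9 |
| 46 | 44 | 16.1 | 15 | 259.21 | 225 | 241.5 |
| 299 | 290 | 0 | 0 | 908.9 | 694 | 732 |

Karl Pearson's coefficient of correlation is given by
$$\bar{x} = \frac{1}{n}\sum x = \frac{299}{10} = 29.9, \qquad \bar{y} = \frac{1}{n}\sum y = \frac{290}{10} = 29,$$
$$r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}} = \frac{732}{\sqrt{(908.9)(694)}} = \frac{732}{794.2} = 0.9217.$$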

Example : 6

Calculate the coefficient of correlation from the following data:


x 5 9 15 19 24 28 32
y 7 9 14 21 23 29 30
f 6 9 13 20 16 11 7


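Solution :
We construct a table as follows:

| x | y | f | u = x - 19 | v = y - 21 | fu | fv | fu² | fv² | fuv |
|---|---|---|---|---|---|---|---|---|---|
| 5 | 7 | 6 | -14 | -14 | -84 | -84 | 1176 | 1176 | 1176 |
| 9 | 9 | 9 | -10 | -12 | -90 | -108 | 900 | 1296 | 1080 |
| 15 | 14 | 13 | -4 | -7 | -52 | -91 | 208 | 637 | 364 |
| 19 | 21 | 20 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 24 | 23 | 16 | 5 | 2 | 80 | 32 | 400 | 64 | 160 |
| 28 | 29 | 11 | 9 | 8 | 99 | 88 | 891 | 704 | 792 |
| 32 | 30 | 7 | 13 | 9 | 91 | 63 | 1183 | 567 | 819 |
| Total | | 82 | | | 44 | -100 | 4758 | 4444 | 4391 |

Put u = x - 19, v = y - 21, ∑f = N = 82. Then
$$\bar{u} = \frac{1}{N}\sum fu = \frac{44}{82} = 0.5366, \qquad \bar{v} = \frac{1}{N}\sum fv = -\frac{100}{82} = -1.2195,$$
$$\sigma_u^2 = \frac{1}{N}\sum fu^2 - (\bar{u})^2 = \frac{1}{82}(4758) - (0.5366)^2 = 57.7364, \quad \therefore \sigma_u = 7.598,$$
$$\sigma_v^2 = \frac{1}{N}\sum fv^2 - (\bar{v})^2 = \frac{1}{82}(4444) - (-1.2195)^2 = 52.708, \quad \therefore \sigma_v = 7.26,$$
$$\operatorname{cov}(u, v) = \frac{1}{N}\sum fuv - \bar{u}\,\bar{v} = \frac{1}{82}(4391) - (0.5366)(-1.2195) = 54.20,$$
$$r = \frac{\operatorname{cov}(u, v)}{\sigma_u \sigma_v} = \frac{54.20}{(7.598)(7.26)} = 0.9825.$$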
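The frequency-weighted formulae used in Example 6 can be sketched in Python as follows. This is an illustrative sketch only: the function name `weighted_r` and its parameters are assumptions, not part of the text. Each pair (x, y) is weighted by its frequency f, and N = ∑f replaces n.

```python
from math import sqrt

def weighted_r(xs, ys, fs, a=0.0, b=0.0):
    """Correlation coefficient when the pair (x, y) occurs with frequency f."""
    N = sum(fs)
    us = [x - a for x in xs]
    vs = [y - b for y in ys]
    u_bar = sum(f * u for f, u in zip(fs, us)) / N
    v_bar = sum(f * v for f, v in zip(fs, vs)) / N
    var_u = sum(f * u * u for f, u in zip(fs, us)) / N - u_bar ** 2
    var_v = sum(f * v * v for f, v in zip(fs, vs)) / N - v_bar ** 2
    cov_uv = sum(f * u * v for f, u, v in zip(fs, us, vs)) / N - u_bar * v_bar
    return cov_uv / sqrt(var_u * var_v)

x = [5, 9, 15, 19, 24, 28, 32]
y = [7, 9, 14, 21, 23, 29, 30]
f = [6, 9, 13, 20, 16, 11, 7]
r = weighted_r(x, y, f, a=19, b=21)   # same assumed means as the worked table
print(round(r, 4))
```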

3.6 Regression :
Regression is the estimation or prediction of unknown values of one variable from known values of another variable, i.e. one is interested in the nature of the relationship between the two variables.
Lines of Regression:
Let the equation of the line of regression of y on x be
y = a + bx ……..(i)
Since the line passes through the means, $\bar{y} = a + b\bar{x}$ ……..(ii)
∴ $y - \bar{y} = b(x - \bar{x})$ ……..(iii)
The normal equations are ∑y = na + b∑x, ∑xy = a∑x + b∑x² ……..(iv)
Shifting the origin to $(\bar{x}, \bar{y})$, the second equation of (iv) becomes
$$\sum (x - \bar{x})(y - \bar{y}) = a\sum (x - \bar{x}) + b\sum (x - \bar{x})^2 \quad ……….(v)$$
Since
$$\frac{\sum (x - \bar{x})(y - \bar{y})}{n\sigma_x \sigma_y} = r, \qquad \sum (x - \bar{x}) = 0, \qquad \frac{1}{n}\sum (x - \bar{x})^2 = \sigma_x^2,$$
from equation (v) we get
$$n r \sigma_x \sigma_y = a(0) + b\,n\,\sigma_x^2, \quad \therefore b = r\frac{\sigma_y}{\sigma_x}.$$
Hence from (iii) the line of regression of y on x is
$$y - \bar{y} = b_{yx}(x - \bar{x}),$$
where $b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$ is called the regression coefficient of y on x.
Similarly the line of regression of x on y is
$$x - \bar{x} = b_{xy}(y - \bar{y}),$$
where $b_{xy} = r\dfrac{\sigma_x}{\sigma_y}$ is called the regression coefficient of x on y.
Now $b_{yx} \cdot b_{xy} = r\dfrac{\sigma_y}{\sigma_x} \cdot r\dfrac{\sigma_x}{\sigma_y} = r^2$, then $r = \pm\sqrt{b_{yx}\, b_{xy}}$, where r takes the common sign of $b_{yx}$ and $b_{xy}$.
Note :
i) If r = 0 the two lines of regression become $y = \bar{y}$ and $x = \bar{x}$. These are two straight lines parallel to the X and Y axes respectively, passing through the means $\bar{x}$ and $\bar{y}$, and they are mutually perpendicular.
ii) If r = ± 1 the two lines of regression will coincide.
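The regression coefficients can also be computed directly from data, since $b_{yx} = r\sigma_y/\sigma_x = \sum(x-\bar{x})(y-\bar{y}) / \sum(x-\bar{x})^2$. The sketch below is illustrative only; the names `regression_coefficients` and `predict_y` are assumptions introduced here. It uses the data of Example 2 that follows.

```python
def regression_coefficients(xs, ys):
    """Return (x_bar, y_bar, b_yx, b_xy) for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return x_bar, y_bar, sxy / sxx, sxy / syy

def predict_y(x, x_bar, y_bar, b_yx):
    """Estimate y from the line of regression of y on x."""
    return y_bar + b_yx * (x - x_bar)

xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [9, 8, 10, 12, 11, 13, 14, 16, 15]
x_bar, y_bar, b_yx, b_xy = regression_coefficients(xs, ys)
r_squared = b_yx * b_xy              # product of the two coefficients is r^2
y_at_6_2 = predict_y(6.2, x_bar, y_bar, b_yx)
print(b_yx, b_xy, r_squared, y_at_6_2)
```

The line of y on x is used to estimate y from x, and the line of x on y to estimate x from y; the two are different lines unless r = ± 1.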

Illustrative Examples

Example : 1

If θ is the acute angle between the two regression lines in the case of two variables x and y, show that
$$\tan\theta = \frac{1 - r^2}{r} \cdot \frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2},$$
where r, σx, σy have their usual meaning. Explain the significance when r = 0 and r = ± 1.
Solution :
The lines of regression are
$$(y - \bar{y}) = b_{yx}(x - \bar{x}) \quad ……….(1)$$
$$(x - \bar{x}) = b_{xy}(y - \bar{y}), \quad \therefore (y - \bar{y}) = \frac{1}{b_{xy}}(x - \bar{x}) \quad ………..(2)$$
Here the slopes are $m_1 = r\dfrac{\sigma_y}{\sigma_x}$ and $m_2 = \dfrac{1}{r}\dfrac{\sigma_y}{\sigma_x}$. Now $\tan\theta = \dfrac{m_2 - m_1}{1 + m_1 m_2}$, so
$$\tan\theta = \frac{\dfrac{1}{r}\dfrac{\sigma_y}{\sigma_x} - r\dfrac{\sigma_y}{\sigma_x}}{1 + r\dfrac{\sigma_y}{\sigma_x} \cdot \dfrac{1}{r}\dfrac{\sigma_y}{\sigma_x}} = \frac{1 - r^2}{r} \cdot \frac{\sigma_y}{\sigma_x} \cdot \frac{\sigma_x^2}{\sigma_x^2 + \sigma_y^2} = \frac{1 - r^2}{r} \cdot \frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2} \quad ………..(3)$$
i) If r = 0, there is no linear relationship between the two variables (they are uncorrelated). On putting r = 0 in (3), tan θ → ∞, so θ = π/2 and the lines (1) and (2) are perpendicular.
ii) If r = 1 or - 1, on putting these values of r in (3) we get tan θ = 0 or θ = 0, i.e. lines (1) and (2) coincide. The correlation between the two variables is perfect.

Example : 2

Calculate the coefficient of correlation, obtain the least square regression line y on x for
the following data.
x 1 2 3 4 5 6 7 8 9
y 9 8 10 12 11 13 14 16 15
Also obtain an estimate of y which should correspond on the average to x = 6.2.
Solution : We construct a table as follows:

| x | y | $u = x - \bar{x}$ | $v = y - \bar{y}$ | u² | v² | uv |
|---|---|---|---|---|---|---|
| 1 | 9 | -4 | -3 | 16 | 9 | 12 |
| 2 | 8 | -3 | -4 | 9 | 16 | 12 |
| 3 | 10 | -2 | -2 | 4 | 4 | 4 |
| 4 | 12 | -1 | 0 | 1 | 0 | 0 |
| 5 | 11 | 0 | -1 | 0 | 1 | 0 |
| 6 | 13 | 1 | 1 | 1 | 1 | 1 |
| 7 | 14 | 2 | 2 | 4 | 4 | 4 |
| 8 | 16 | 3 | 4 | 9 | 16 | 12 |
| 9 | 15 | 4 | 3 | 16 | 9 | 12 |
| 45 | 108 | 0 | 0 | 60 | 60 | 57 |

The coefficient of correlation is given by
$$\bar{x} = \frac{1}{n}\sum x = \frac{45}{9} = 5, \qquad \bar{y} = \frac{1}{n}\sum y = \frac{108}{9} = 12,$$
$$r = \frac{\sum uv}{\sqrt{\sum u^2 \sum v^2}} = \frac{57}{\sqrt{(60)(60)}} = 0.95.$$
$$\sigma_u = \sqrt{\frac{1}{n}\sum u^2 - (\bar{u})^2} = \sqrt{\frac{1}{9}(60) - (0)^2} = 2.582, \qquad \sigma_v = \sqrt{\frac{1}{n}\sum v^2 - (\bar{v})^2} = \sqrt{\frac{1}{9}(60) - (0)^2} = 2.582,$$
$$b_{vu} = r\frac{\sigma_v}{\sigma_u} = \frac{(0.95)(2.582)}{2.582} = 0.95.$$
The line of regression of v on u is $v - \bar{v} = b_{vu}(u - \bar{u})$
∴ $[(y - \bar{y}) - 0] = 0.95\,[(x - \bar{x}) - 0]$
y - 12 = (0.95)(x - 5) ∴ y = 0.95x + 7.25
When x = 6.2, y = 0.95(6.2) + 7.25 = 13.14.

Example : 3

Given 8x - 10y + 66 = 0, 40x - 18y = 214, and variance of x = 9.


Find i) Average value of x and y.
ii) The correlation coefficient between two variables.
iii) The standard deviation of y
Solution :
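Since both lines of regression pass through the point $(\bar{x}, \bar{y})$ we have
$8\bar{x} - 10\bar{y} = -66$
$40\bar{x} - 18\bar{y} = 214$
Solving these two equations we get the means of x and y as $\bar{x} = 13$, $\bar{y} = 17$.
The equations of the lines of regression can be written as
$$y = \frac{8}{10}x + \frac{66}{10}, \quad b_{yx} = \frac{8}{10}, \qquad x = \frac{18}{40}y + \frac{214}{40}, \quad b_{xy} = \frac{18}{40},$$
then
$$r = \sqrt{b_{yx} \times b_{xy}} = \sqrt{\frac{8}{10} \times \frac{18}{40}} = 0.6$$
Given variance of x is $\sigma_x^2 = 9$, S.D. = σx = 3. Since $b_{yx} = r\sigma_y/\sigma_x$, the S.D. of y is
$$\sigma_y = b_{yx}\frac{\sigma_x}{r} = \frac{8}{10} \times \frac{3}{0.6} = 4.$$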
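The steps of this example (solve the two lines for the means, read off the regression coefficients, then recover r and σy) can be sketched in Python. The helper `solve_2x2` is a hypothetical name introduced here for illustration; it applies Cramer's rule. The choice of which equation is the line of y on x is an assumption that is checked by requiring $b_{yx} b_{xy} \le 1$.

```python
from math import sqrt

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

# 8x - 10y = -66 (taken as y on x) and 40x - 18y = 214 (taken as x on y)
x_bar, y_bar = solve_2x2(8, -10, -66, 40, -18, 214)

b_yx = 8 / 10     # slope of y = (8/10)x + 66/10
b_xy = 18 / 40    # slope of x = (18/40)y + 214/40
if b_yx * b_xy > 1:
    raise ValueError("wrong assignment of the lines: swap their roles")

r = sqrt(b_yx * b_xy)          # positive, since both coefficients are positive
sigma_x = sqrt(9)              # variance of x is given as 9
sigma_y = b_yx * sigma_x / r   # from b_yx = r * sigma_y / sigma_x
print(x_bar, y_bar, round(r, 4), round(sigma_y, 4))
```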

Example : 4

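Given the following data: the regression equations are 4x - 5y + 33 = 0 and 20x - 9y - 107 = 0, and variance of x = 9. Find
i) Average value of x and y.
ii) The correlation coefficient between two variables.
iii) The standard deviation of y.
Solution :
Both lines pass through $(\bar{x}, \bar{y})$. Solving the two equations gives $\bar{x} = 13$, $\bar{y} = 17$.
Taking the first equation as the line of y on x and the second as the line of x on y,
$$y = \frac{4}{5}x + \frac{33}{5}, \quad b_{yx} = \frac{4}{5}, \qquad x = \frac{9}{20}y + \frac{107}{20}, \quad b_{xy} = \frac{9}{20},$$
$$r = \sqrt{b_{yx} \times b_{xy}} = \sqrt{\frac{4}{5} \times \frac{9}{20}} = 0.6$$
Since σx = 3,
$$\sigma_y = b_{yx}\frac{\sigma_x}{r} = \frac{4}{5} \times \frac{3}{0.6} = 4.$$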

Example : 5

Given the following data:

| | Variable (x) | Variable (y) |
|---|---|---|
| A.M. | 8.2 | 12.4 |
| S.D. | 6.2 | 20 |

and the coefficient of correlation between x and y is 0.9. Find the line of regression of x on y and estimate the value of x given y = 10.
Solution :
Since $\bar{x} = 8.2$, $\bar{y} = 12.4$, σx = 6.2, σy = 20, r = 0.9,
$$b_{xy} = r\frac{\sigma_x}{\sigma_y} = (0.9) \times \frac{6.2}{20} = 0.279.$$
The line of regression of x on y is $x - \bar{x} = b_{xy}(y - \bar{y})$
(x - 8.2) = (0.279)(y - 12.4)
∴ x = 0.279y + 4.7404.
When y = 10, x = 7.5304.

Example : 6

Following are the marks of students in a particular subject. Find the average marks.
Marks 10 - 12 12 - 14 14 - 16 16 - 18 18 - 20 20 - 22 22 - 24 24 - 26
Students 3 6 10 15 24 42 75 90

26 - 28 28 - 30 30 - 32 32 - 34 34 - 36 36 - 38 38 - 40 40 - 42
79 55 36 26 19 13 9 7


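Solution :
First prepare the table, where a = 25, h = 2

| Class | Students (f) | Mid value (x) | fx | d = x - 25 | fd | u = (x - 25)/2 | fu |
|---|---|---|---|---|---|---|---|
| 10 - 12 | 3 | 11 | 33 | -14 | -42 | -7 | -21 |
| 12 - 14 | 6 | 13 | 78 | -12 | -72 | -6 | -36 |
| 14 - 16 | 10 | 15 | 150 | -10 | -100 | -5 | -50 |
| 16 - 18 | 15 | 17 | 255 | -8 | -120 | -4 | -60 |
| 18 - 20 | 24 | 19 | 456 | -6 | -144 | -3 | -72 |
| 20 - 22 | 42 | 21 | 882 | -4 | -168 | -2 | -84 |
| 22 - 24 | 75 | 23 | 1725 | -2 | -150 | -1 | -75 |
| 24 - 26 | 90 | 25 | 2250 | 0 | 0 | 0 | 0 |
| 26 - 28 | 79 | 27 | 2133 | 2 | 158 | 1 | 79 |
| 28 - 30 | 55 | 29 | 1595 | 4 | 220 | 2 | 110 |
| 30 - 32 | 36 | 31 | 1116 | 6 | 216 | 3 | 108 |
| 32 - 34 | 26 | 33 | 858 | 8 | 208 | 4 | 104 |
| 34 - 36 | 19 | 35 | 665 | 10 | 190 | 5 | 95 |
| 36 - 38 | 13 | 37 | 481 | 12 | 156 | 6 | 78 |
| 38 - 40 | 9 | 39 | 351 | 14 | 126 | 7 | 63 |
| 40 - 42 | 7 | 41 | 287 | 16 | 112 | 8 | 56 |
| ∑ | 509 | - | 13315 | - | 590 | - | 295 |

i) Direct method: mean = $\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{13315}{509} = 26.16$
ii) Shortcut method: mean = $\bar{x} = a + \dfrac{\sum fd}{\sum f} = 25 + \dfrac{590}{509} = 26.16$
iii) Step deviation method: mean = $\bar{x} = a + h\left(\dfrac{\sum fu}{\sum f}\right) = 25 + 2\left(\dfrac{295}{509}\right) = 26.16$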


Example : 7

Following are the monthly wages of workers. Find the average monthly wages.
Wages 0 - 10 10 - 20 20 - 30 30 -40 40 - 50 50 - 60 60 – 70
Workers 5 12 30 45 50 37 21
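Solution :

| Wages | Workers (f) | Midpoint (x) | fx | d = x - 35 | fd | u = (x - 35)/10 | fu |
|---|---|---|---|---|---|---|---|
| 0 - 10 | 5 | 5 | 25 | -30 | -150 | -3 | -15 |
| 10 - 20 | 12 | 15 | 180 | -20 | -240 | -2 | -24 |
| 20 - 30 | 30 | 25 | 750 | -10 | -300 | -1 | -30 |
| 30 - 40 | 45 | 35 | 1575 | 0 | 0 | 0 | 0 |
| 40 - 50 | 50 | 45 | 2250 | 10 | 500 | 1 | 50 |
| 50 - 60 | 37 | 55 | 2035 | 20 | 740 | 2 | 74 |
| 60 - 70 | 21 | 65 | 1365 | 30 | 630 | 3 | 63 |
| ∑ | 200 | - | 8180 | - | 1180 | - | 118 |

i) Direct method: mean = $\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{8180}{200} = 40.9$
ii) Shortcut method: mean = $\bar{x} = a + \dfrac{\sum fd}{\sum f} = 35 + \dfrac{1180}{200} = 40.9$
iii) Step deviation method: mean = $\bar{x} = a + h\left(\dfrac{\sum fu}{\sum f}\right) = 35 + 10\left(\dfrac{118}{200}\right) = 40.9$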
Example : 8

Calculate the mean for following data by


i) Direct method
ii) Shortcut method
iii) Step deviation method
Weights of students (x) 15 20 25 30 35 40 45 50 55
Students (f) 2 22 19 14 3 4 6 1 1

Solution :
∑f = 72, ∑fx = 2005, ∑fd = -515, ∑fu = -103
where h = 5, d = x - a = x - 35, u = d/h
i) $\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{2005}{72} = 27.85$
ii) $\bar{x} = a + \dfrac{\sum fd}{\sum f} = 35 + \dfrac{-515}{72} = 27.85$
iii) $\bar{x} = a + h\left(\dfrac{\sum fu}{\sum f}\right) = 35 + 5\left(\dfrac{-103}{72}\right) = 27.85$
Example : 9

Calculate the mean for the following data.

| Weight of articles | 0 - 10 | 10 - 20 | 20 - 30 | 30 - 40 | 40 - 50 | 50 - 60 |
|---|---|---|---|---|---|---|
| No. of articles | 14 | 17 | 22 | 26 | 23 | 18 |

Solution :
With a = 30 and h = 10: ∑f = 120, ∑fx = 3810, ∑fd = 210, ∑fu = 21
$\bar{x}$ = mean by all three methods = 31.75
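The three methods of computing the mean used in these examples are algebraically the same, which a short computation makes visible. The following Python sketch is illustrative only (the function names are assumptions); it uses the midpoints and frequencies of Example 7 (monthly wages).

```python
def mean_direct(xs, fs):
    """Direct method: sum(f*x) / sum(f)."""
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)

def mean_shortcut(xs, fs, a):
    """Shortcut method: a + sum(f*d) / sum(f), with d = x - a."""
    return a + sum(f * (x - a) for x, f in zip(xs, fs)) / sum(fs)

def mean_step_deviation(xs, fs, a, h):
    """Step deviation method: a + h * sum(f*u) / sum(f), with u = (x - a) / h."""
    return a + h * sum(f * (x - a) / h for x, f in zip(xs, fs)) / sum(fs)

midpoints = [5, 15, 25, 35, 45, 55, 65]
workers = [5, 12, 30, 45, 50, 37, 21]

m1 = mean_direct(midpoints, workers)
m2 = mean_shortcut(midpoints, workers, a=35)
m3 = mean_step_deviation(midpoints, workers, a=35, h=10)
print(m1, m2, m3)
```

The assumed mean a and the class width h affect only the size of the intermediate numbers, not the result.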
3.7 Example of Median, Quartiles :
 Median :
If the given values of a distribution are arranged in ascending or descending order of magnitude, then
i) Median = middle item, if the number of observations is odd
= mean of the two middle values, if the number of observations is even.

Illustrative Examples

Example : 1

Find the median value for the following distribution


2, 9, 17, 6, 5, 27, 8, 35, 20, 22, 40.
Solution :
Arrange the numbers in ascending order
2, 5, 6, 8, 9, 17, 20, 22, 27, 35, 40.
 Median = 17

Example : 2

Find the median value for the following distribution


2, 9, 17, 6, 5, 27, 8, 38, 20, 35, 20, 22, 40, 24.
Solution :
Arrange the numbers in ascending order
2, 5, 6, 8, 9, 17, 20, 20, 22, 24, 27, 35, 38, 40.
This distribution has 14 observations, an even number, so the median is the mean of the 7th and 8th values.
∴ Median = $\dfrac{20 + 20}{2} = 20$
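The rule for the median of raw data can be written as a few lines of Python. This is a minimal sketch (the function name `median` is chosen here for illustration); it is applied to the data of the two examples above.

```python
def median(values):
    """Middle item for an odd count, mean of the two middle items for an even count."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median([2, 9, 17, 6, 5, 27, 8, 35, 20, 22, 40])                 # 11 observations
even_case = median([2, 9, 17, 6, 5, 27, 8, 38, 20, 35, 20, 22, 40, 24])    # 14 observations
print(odd_case, even_case)
```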
3.8 Quartiles :
Quartiles divide the frequency distribution into four equal parts after the values/observations are arranged in ascending order.
Q1 = Lower quartile (between the lowest value and Q2)
Q2 = Median
Q3 = Upper quartile (between Q2 and the highest value)

Formulae

1. N = ∑f
2. $Q_1 = L + \left(\dfrac{\frac{N}{4} - C}{f}\right) h$
3. $Q_2 = L + \left(\dfrac{\frac{N}{2} - C}{f}\right) h$
4. $Q_3 = L + \left(\dfrac{\frac{3N}{4} - C}{f}\right) h$

where L = lower limit of the class interval in which Q1 | Q2 | Q3 lies
f = frequency of that class interval
h = width of that class interval
C = cumulative frequency of the class preceding the class in which Q1 | Q2 | Q3 lies.
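The three quartile formulae differ only in the fraction of N that is located in the cumulative frequencies, so one routine covers all of them. The sketch below is illustrative: the function name `grouped_quantile` and its parameter `p` (1/4, 1/2 or 3/4) are assumptions, and equal class widths are assumed. It uses the classes 10 - 20, ..., 50 - 60 with frequencies 7, 6, 9, 2, 6.

```python
def grouped_quantile(lower_limits, width, freqs, p):
    """Return L + ((p*N - C) / f) * h for the class in which the p-th fraction lies."""
    total = sum(freqs)
    target = p * total
    cumulative = 0  # C: cumulative frequency of the preceding classes
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("p must lie between 0 and 1")

lowers = [10, 20, 30, 40, 50]
freqs = [7, 6, 9, 2, 6]
q1 = grouped_quantile(lowers, 10, freqs, 0.25)
q2 = grouped_quantile(lowers, 10, freqs, 0.50)
q3 = grouped_quantile(lowers, 10, freqs, 0.75)
print(round(q1, 2), round(q2, 2), round(q3, 2))
```

A quick sanity check on any hand computation is that each quartile must fall inside the class that was identified for it.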


Illustrative Examples

Example : 1

Calculate the median, lower and upper quartiles from the following data.
Class 10 - 20 20 - 30 30 - 40 40 - 50 50 – 60
Frequency 7 6 9 2 6
Solution :
The given frequency distribution table is as below.

| Class | Frequency (f) | Cumulative frequency (c) |
|---|---|---|
| 10 - 20 | 7 | 7 |
| 20 - 30 | 6 | 13 |
| 30 - 40 | 9 | 22 |
| 40 - 50 | 2 | 24 |
| 50 - 60 | 6 | 30 |

∑f = N = 30
To find Q1 :
N/4 = 30/4 = 7.5, which lies in 20 - 30
∴ L = 20, f = 6, C = cumulative frequency of the preceding class = 7
$$Q_1 = L + \left(\frac{\frac{N}{4} - C}{f}\right) h = 20 + \left(\frac{7.5 - 7}{6}\right)(10) = 20.83$$
∴ Q1 = 20.83
To find Q2 :
N/2 = 30/2 = 15, which lies in 30 - 40
∴ L = 30, f = 9, C = 13
$$Q_2 = L + \left(\frac{\frac{N}{2} - C}{f}\right) h = 30 + \left(\frac{15 - 13}{9}\right)(10) = 32.22$$
∴ Q2 = 32.22
To find Q3 :
3N/4 = 90/4 = 22.5, which lies in 40 - 50
∴ L = 40, f = 2, C = 22
$$Q_3 = L + \left(\frac{\frac{3N}{4} - C}{f}\right) h = 40 + \left(\frac{22.5 - 22}{2}\right)(10) = 42.5$$
∴ Q3 = 42.5

Example : 2

Calculate the median, lower and upper quartiles from the following distribution
Class 5 - 10 10 - 15 15 - 20 20 - 25 25 - 30 30 - 35 35 - 40 40 – 45
Frequency 5 6 15 10 5 4 2 2
Solution :
The given distribution table can be written as

| Class | 5 - 10 | 10 - 15 | 15 - 20 | 20 - 25 | 25 - 30 | 30 - 35 | 35 - 40 | 40 - 45 |
|---|---|---|---|---|---|---|---|---|
| Frequency | 5 | 6 | 15 | 10 | 5 | 4 | 2 | 2 |
| Cumulative frequency | 5 | 11 | 26 | 36 | 41 | 45 | 47 | 49 |

N = ∑f = 49
i) N/4 = 49/4 = 12.25, which lies in 15 - 20
∴ L = 15, f = 15, C = 11
$$Q_1 = L + \left(\frac{\frac{N}{4} - C}{f}\right) h = 15 + \left(\frac{12.25 - 11}{15}\right)(5) = 15.42$$
∴ Q1 = 15.42
ii) N/2 = 49/2 = 24.5, which lies in 15 - 20
∴ L = 15, f = 15, C = 11
$$Q_2 = L + \left(\frac{\frac{N}{2} - C}{f}\right) h = 15 + \left(\frac{24.5 - 11}{15}\right)(5) = 19.5$$
∴ Q2 = 19.5
iii) 3N/4 = (3)(49)/4 = 36.75, which lies in 25 - 30
∴ L = 25, f = 5, C = 36
$$Q_3 = L + \left(\frac{\frac{3N}{4} - C}{f}\right) h = 25 + \left(\frac{36.75 - 36}{5}\right)(5) = 25.75$$
∴ Q3 = 25.75

Exercise No. 1

Ex. 1 Calculate the median, lower and upper quartiles from the following data
Class 10 - 20 20 - 30 30 - 40 40 - 50 50 – 60
Frequency 8 7 10 3 7
Ex. 2 Calculate the median, lower and upper quartiles from the following table
Class 3-8 8 - 13 13 - 18 18 - 23 23 - 27 27 - 32 32 - 37 37 - 42
Frequency 12 88 58 17 23 29 18 5
Ex. 3 Calculate the median, lower and upper quartiles from the following table
Weight 70-80 80-90 90-100 100-110 110-120 120-130 130-140 140-150

No. of
12 18 35 42 50 45 20 08
Persons

3.9 Examples on Standard Deviation :


Example : 1

Calculate the mean and standard deviation for the following


Size of item 6 7 8 9 10 11 12
Frequency 3 6 9 13 8 5 4
Solution :
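First we prepare the table, let a = 9

| Size (x) | Frequency (f) | d = x - 9 | fd | d² | fd² | fx |
|---|---|---|---|---|---|---|
| 6 | 3 | -3 | -9 | 9 | 27 | 18 |
| 7 | 6 | -2 | -12 | 4 | 24 | 42 |
| 8 | 9 | -1 | -9 | 1 | 9 | 72 |
| 9 | 13 | 0 | 0 | 0 | 0 | 117 |
| 10 | 8 | 1 | 8 | 1 | 8 | 80 |
| 11 | 5 | 2 | 10 | 4 | 20 | 55 |
| 12 | 4 | 3 | 12 | 9 | 36 | 48 |
| ∑ | 48 | - | 0 | - | 124 | 432 |

i) Mean = $\bar{x} = a + \dfrac{\sum fd}{\sum f} = 9 + \dfrac{0}{48} = 9$
ii) Mean = $\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{432}{48} = 9$
iii) S. D. = $\sigma = \sqrt{\dfrac{\sum fd^2}{\sum f} - \left(\dfrac{\sum fd}{\sum f}\right)^2} = \sqrt{\dfrac{124}{48} - \left(\dfrac{0}{48}\right)^2} = 1.607$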
Example : 2

Calculate the standard deviation for the following data


Marks 0 - 10 10 - 20 20 - 30 30 - 40 40 - 50 50 - 60
Students 3 16 26 31 16 8
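Solution : First prepare the table with a = 35, h = 10, d = x - a, u = (x - a)/h

| Class | Frequency (f) | Midpoint (x) | u = (x - 35)/10 | fu | u² | fu² |
|---|---|---|---|---|---|---|
| 0 - 10 | 3 | 5 | -3 | -9 | 9 | 27 |
| 10 - 20 | 16 | 15 | -2 | -32 | 4 | 64 |
| 20 - 30 | 26 | 25 | -1 | -26 | 1 | 26 |
| 30 - 40 | 31 | 35 | 0 | 0 | 0 | 0 |
| 40 - 50 | 16 | 45 | 1 | 16 | 1 | 16 |
| 50 - 60 | 8 | 55 | 2 | 16 | 4 | 32 |
| ∑ | 100 | - | - | -35 | - | 165 |

We have
$$\text{S. D.} = \sigma = h\sqrt{\frac{\sum fu^2}{\sum f} - \left(\frac{\sum fu}{\sum f}\right)^2} = 10\sqrt{\frac{165}{100} - \left(\frac{-35}{100}\right)^2} = 10\sqrt{1.5275} = 12.36$$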
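The step deviation formula for the standard deviation can be sketched in Python as below. The function name `sd_step_deviation` is an illustrative assumption; the data are the midpoints and frequencies of this example. The second call shows that a different assumed mean gives the same result.

```python
from math import sqrt

def sd_step_deviation(midpoints, freqs, a, h):
    """sigma = h * sqrt(sum(f*u^2)/N - (sum(f*u)/N)^2), with u = (x - a) / h."""
    N = sum(freqs)
    us = [(x - a) / h for x in midpoints]
    mean_u = sum(f * u for f, u in zip(freqs, us)) / N
    mean_u2 = sum(f * u * u for f, u in zip(freqs, us)) / N
    return h * sqrt(mean_u2 - mean_u ** 2)

midpoints = [5, 15, 25, 35, 45, 55]
students = [3, 16, 26, 31, 16, 8]
sigma = sd_step_deviation(midpoints, students, a=35, h=10)
sigma_other_origin = sd_step_deviation(midpoints, students, a=25, h=10)
print(round(sigma, 3), round(sigma_other_origin, 3))
```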


Example : 3

Find the standard deviation for the following data


Age under 10 20 30 40 50 60 70 80
No. of boys 15 30 53 75 100 110 115 125
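Solution :
The given frequencies are cumulative ("age under"), so first prepare the table by finding the number of boys in each interval 0 - 10, 10 - 20, 20 - 30, ---

| Age | Boys (f) | Midpoint (x) | u = (x - 35)/10 | fu | u² | fu² |
|---|---|---|---|---|---|---|
| 0 - 10 | 15 | 5 | -3 | -45 | 9 | 135 |
| 10 - 20 | 15 | 15 | -2 | -30 | 4 | 60 |
| 20 - 30 | 23 | 25 | -1 | -23 | 1 | 23 |
| 30 - 40 | 22 | 35 | 0 | 0 | 0 | 0 |
| 40 - 50 | 25 | 45 | 1 | 25 | 1 | 25 |
| 50 - 60 | 10 | 55 | 2 | 20 | 4 | 40 |
| 60 - 70 | 5 | 65 | 3 | 15 | 9 | 45 |
| 70 - 80 | 10 | 75 | 4 | 40 | 16 | 160 |
| ∑ | 125 | - | - | 2 | - | 488 |

$$\sigma = h\sqrt{\frac{\sum fu^2}{\sum f} - \left(\frac{\sum fu}{\sum f}\right)^2} = 10\sqrt{\frac{488}{125} - \left(\frac{2}{125}\right)^2} = 19.76$$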

Example : 4

Calculate the standard deviation of scores of a player in 10 matches played by him.


47, 12, 16, 42, 4, 51, 37, 48, 13, 0
Solution : In this example, neither frequencies nor intervals are given. This is ungrouped or raw data.
Here
$$\bar{x} = \frac{47 + 12 + 16 + 42 + 4 + 51 + 37 + 48 + 13 + 0}{10} = \frac{270}{10} = 27$$
Prepare a table

| x | 47 | 12 | 16 | 42 | 4 | 51 | 37 | 48 | 13 | 0 | ∑ = 270 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| d = x - 27 | 20 | -15 | -11 | 15 | -23 | 24 | 10 | 21 | -14 | -27 | 0 |
| d² | 400 | 225 | 121 | 225 | 529 | 576 | 100 | 441 | 196 | 729 | 3542 |

$$\sigma = \text{S. D.} = \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2} = \sqrt{\frac{3542}{10} - \left(\frac{0}{10}\right)^2} = 18.82$$
Example : 5

Calculate the standard deviation of the following distribution


x 5 6 7 8 9 10 11 12 13 14 15
f 18 15 34 47 68 90 80 62 35 27 11
Solution :
Let d = x - 10, ∑f = 487, ∑fd = 58, ∑fd² = 2602
$$\sigma = \sqrt{\frac{\sum fd^2}{\sum f} - \left(\frac{\sum fd}{\sum f}\right)^2} = \sqrt{\frac{2602}{487} - \left(\frac{58}{487}\right)^2} = 2.308$$

Example : 6

Calculate the standard deviation of scores made by a batsman in 10 matches as below.


12, 115, 6, 73, 7, 19, 119, 36, 84, 29.
Solution :
$\bar{x} = 50$. Taking d = x - 51, ∑d = -10, ∑d² = 17508, n = 10
$$\sigma = \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2} = \sqrt{\frac{17508}{10} - \left(\frac{-10}{10}\right)^2} = 41.8$$
Example : 7

Calculate the standard deviation of scores made by a batsmen in 10 matches as below


47, 12, 16, 42, 4, 51, 37, 48, 13, 0.
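Solution :
In this example, neither frequencies nor intervals are given. This is ungrouped or raw data.
Here
$$\bar{x} = \frac{47 + 12 + 16 + 42 + 4 + 51 + 37 + 48 + 13 + 0}{10} = \frac{270}{10} = 27$$
Prepare a table

| x | 47 | 12 | 16 | 42 | 4 | 51 | 37 | 48 | 13 | 0 | ∑ = 270 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| d = x - 27 | 20 | -15 | -11 | 15 | -23 | 24 | 10 | 21 | -14 | -27 | 0 |
| d² | 400 | 225 | 121 | 225 | 529 | 576 | 100 | 441 | 196 | 729 | 3542 |

$$\sigma = \text{S. D.} = \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2} = \sqrt{\frac{3542}{10} - \left(\frac{0}{10}\right)^2} = 18.82$$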


Exercise No. 2

Ex. 1 Calculate the standard deviation and mean for following data.
Marks 0 - 10 10 - 20 20 - 30 30 - 40 40 - 50 50 - 60 60 - 70
Students 5 12 30 45 50 37 21
[Hint : N = ∑f = 200, a = 35, u = (x - 35)/10, ∑fu = 118, ∑fu² = 510, $\bar{x}$ = 40.9, σ = S. D. = 14.839]
Ex. 2 The annual salaries of a group of employees are given below.
Salaries (1000) 45 50 55 60 65 70 75 80
Persons 3 5 8 7 9 7 4 7
[Hint: N = ∑f = 50, a = 60, h = 5, u = (x - 60)/5, ∑fu = 36, ∑fu² = 240,
$\sigma = h\sqrt{\dfrac{\sum fu^2}{\sum f} - \left(\dfrac{\sum fu}{\sum f}\right)^2} = 10.35$]
Ex. 3 Calculate the standard deviation for the following data
Class 100-109 110-119 120-129 130-139 140-149 150-159 160-169 170-179
Frequency 15 44 133 150 125 82 35 16
[Hint: Take a = 144.5, h = 10, ∑f = 600, ∑fu = -408, ∑fu² = 1684,
∴ S. D. = $\sigma = h\sqrt{\dfrac{\sum fu^2}{\sum f} - \left(\dfrac{\sum fu}{\sum f}\right)^2} = 15.31$]
Ex. 4 Calculate the standard deviation for the following data.
Weights 0 - 10 11 - 20 21 - 30 31 - 40 41 - 50 51 - 60
Students 3 16 26 31 16 8
[Hint: Take a = 35, h = 10, u = (x - 35)/10, ∑fu = -35, ∑fu² = 165, ∑f = 100, ∴ σ = 12.36]
Ex. 5 Calculate the standard deviation for the following data
Heights 20 - 25 25 - 30 30 - 35 35 - 40 40 - 45 45 - 50
No. of Girls 170 110 80 45 40 35
[Hint: N = ∑f = 480, a = 32.5, h = 5, ∑fu = -220, ∑fu² = 1310, ∴ σ = 7.936]


Example : 9

Annual salaries of a group of employees are given below.


Salaries (1000) 45 50 55 60 65 70 75 80

Employees 2 2 1 30 12 1 1 1
[Hint: a = 60, h = 5, u = (x - 60)/5, ∑f = N = 50, ∑fu = 10, ∑fu² = 68,
$\sigma = h\sqrt{\dfrac{\sum fu^2}{\sum f} - \left(\dfrac{\sum fu}{\sum f}\right)^2} = 5\sqrt{\dfrac{68}{50} - \left(\dfrac{10}{50}\right)^2} = 5\sqrt{1.32} = 5.74$]
3.10 Examples on Coefficient of Variation :
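If σ and $\bar{x}$ respectively denote the standard deviation and mean of data A, then
$$\text{coefficient of variation of A} = \operatorname{cov}(A) = \frac{\sigma}{\bar{x}} \times 100$$
In this section cov(A) stands for the coefficient of variation, not the covariance.
Note :
1. If (mean of data A) > (mean of data B)
then A has the higher average (e.g. batsman A is the better run scorer).
2. If cov (A) > cov (B)
then i) Group B is more consistent
ii) Group A has more variability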
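The comparison rule above can be sketched in Python for raw (ungrouped) data. The function names are illustrative assumptions, and the standard deviation uses the divisor n as in the formulae of this unit. The scores are those of the two batsmen A and B in Example 4 of this section.

```python
from math import sqrt

def mean_and_sd(values):
    """Mean and standard deviation (divisor n) of raw data."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, sqrt(variance)

def coefficient_of_variation(values):
    """(sigma / mean) * 100."""
    mean, sd = mean_and_sd(values)
    return 100 * sd / mean

scores_a = [12, 115, 6, 73, 19, 119, 29, 84, 36, 7]
scores_b = [47, 12, 16, 42, 51, 37, 0, 13, 48, 4]
cv_a = coefficient_of_variation(scores_a)
cv_b = coefficient_of_variation(scores_b)
more_consistent = "A" if cv_a < cv_b else "B"
print(round(cv_a, 2), round(cv_b, 2), more_consistent)
```

The smaller coefficient of variation marks the more consistent series, independently of which series has the larger mean.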

Illustrative Examples

Example : 1

Two brands of tyres are tested with the following results of their life in kilometres
Life in kms (000) 20 - 25 25 - 30 30 - 35 35 - 40 40 – 45
Brand A 1 22 64 10 3
Brand B 0 24 76 0 0
Determine:
(i) Which brand tyres have greater average life 
(ii) Which brand tyres will be preferred for use
Solution :
First we should find the mean and S. D. for both brands
Brand A :

| Life | Frequency (f) | Midpoint (x) | u = (x - 32.5)/5 | fu | u² | fu² |
|---|---|---|---|---|---|---|
| 20 - 25 | 1 | 22.5 | -2 | -2 | 4 | 4 |
| 25 - 30 | 22 | 27.5 | -1 | -22 | 1 | 22 |
| 30 - 35 | 64 | 32.5 | 0 | 0 | 0 | 0 |
| 35 - 40 | 10 | 37.5 | 1 | 10 | 1 | 10 |
| 40 - 45 | 3 | 42.5 | 2 | 6 | 4 | 12 |
| ∑ | 100 | - | - | -8 | - | 48 |

$$\bar{x}_A = a + \left(\frac{\sum fu}{\sum f}\right) h = 32.5 + \left(\frac{-8}{100}\right) 5 = 32.1$$
$$\sigma_A = h\sqrt{\frac{\sum fu^2}{\sum f} - \left(\frac{\sum fu}{\sum f}\right)^2} = 5\sqrt{\frac{48}{100} - \left(\frac{-8}{100}\right)^2} = 3.441$$
$$\therefore \operatorname{cov}(A) = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{3.441}{32.1} \times 100 = 10.72$$

For Brand B :

| Life | Frequency (f) | Midpoint (x) | u = (x - 32.5)/5 | fu | u² | fu² |
|---|---|---|---|---|---|---|
| 20 - 25 | 0 | 22.5 | -2 | 0 | 4 | 0 |
| 25 - 30 | 24 | 27.5 | -1 | -24 | 1 | 24 |
| 30 - 35 | 76 | 32.5 | 0 | 0 | 0 | 0 |
| 35 - 40 | 0 | 37.5 | 1 | 0 | 1 | 0 |
| 40 - 45 | 0 | 42.5 | 2 | 0 | 4 | 0 |
| ∑ | 100 | - | - | -24 | - | 24 |

$$\bar{x}_B = a + \left(\frac{\sum fu}{\sum f}\right) h = 32.5 + \left(\frac{-24}{100}\right) 5 = 31.3$$
$$\sigma_B = h\sqrt{\frac{\sum fu^2}{\sum f} - \left(\frac{\sum fu}{\sum f}\right)^2} = 5\sqrt{\frac{24}{100} - \left(\frac{-24}{100}\right)^2} = 2.135$$
$$\therefore \operatorname{cov}(B) = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{2.135}{31.3} \times 100 = 6.82$$

Conclusion :
i) $\bar{x}_A > \bar{x}_B$ ⟹ Brand A tyres have the greater average life
ii) cov (A) > cov (B) ⟹ Brand B tyres are more consistent and will be preferred to A
Example : 2

Goals scored by two teams in a hockey match session are as below:


Calculate the coefficient of variation and hence state, which team is more consistent.
Goals 0 1 2 3 4 5

Played by A 15 12 07 06 05 03

Played by B 18 12 06 03 02 01

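Solution :
Here the number of goals x is the variable and the number of matches in which that many goals were scored is the frequency f.

| Goals (x) | f (A) | fx (A) | fx² (A) | f (B) | fx (B) | fx² (B) |
|---|---|---|---|---|---|---|
| 0 | 15 | 0 | 0 | 18 | 0 | 0 |
| 1 | 12 | 12 | 12 | 12 | 12 | 12 |
| 2 | 7 | 14 | 28 | 6 | 12 | 24 |
| 3 | 6 | 18 | 54 | 3 | 9 | 27 |
| 4 | 5 | 20 | 80 | 2 | 8 | 32 |
| 5 | 3 | 15 | 75 | 1 | 5 | 25 |
| ∑ | 48 | 79 | 249 | 42 | 46 | 120 |

Team A :
$$\bar{x}_A = \frac{\sum fx}{\sum f} = \frac{79}{48} = 1.6458$$
$$\sigma_A = \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2} = \sqrt{\frac{249}{48} - \left(\frac{79}{48}\right)^2} = \sqrt{2.4787} = 1.5744$$
$$\operatorname{cov}(A) = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{1.5744}{1.6458} \times 100 \approx 95.7$$
Team B :
$$\bar{x}_B = \frac{\sum fx}{\sum f} = \frac{46}{42} = 1.0952$$
$$\sigma_B = \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2} = \sqrt{\frac{120}{42} - \left(\frac{46}{42}\right)^2} = \sqrt{1.6576} = 1.2875$$
$$\operatorname{cov}(B) = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{1.2875}{1.0952} \times 100 \approx 117.6$$
∴ cov (A) < cov (B) ⟹ team A is more consistent.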


Example : 3

Goals scored by two teams in a football match session were as follows.


Goals scored 0 1 2 3 4 5

Team A Played 15 10 07 05 03 02

Team B Played 20 10 05 04 02 01
Calculate the coefficient of variation and state which team is more consistent 
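Solution :
Hint : Treat the goals scored as x and the number of matches as the frequency f.
Team A : ∑f = 42, ∑fx = 61, ∑fx² = 181, $\bar{x}_A$ = 1.452, σA = 1.483
Team B : ∑f = 42, ∑fx = 45, ∑fx² = 123, $\bar{x}_B$ = 1.071, σB = 1.334
∴ cov (A) ≈ 102.1, cov (B) ≈ 124.5
∴ Team B has more variability.
∴ Team A is more consistent.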

Example : 4

Following are scores of two batsmen A and B in a series of innings


A 12 115 6 73 19 119 29 84 36 7
B 47 12 16 42 51 37 0 13 48 4
State (i) Which player is more run taker
(ii) Which player is more consistent
Solution :
Hints:
$$\bar{x}_A = \frac{12 + 115 + 6 + 73 + 19 + 119 + 29 + 84 + 36 + 7}{10} = \frac{500}{10} = 50$$
$$\bar{x}_B = \frac{47 + 12 + 16 + 42 + 51 + 37 + 0 + 13 + 48 + 4}{10} = \frac{270}{10} = 27$$
$$\sigma_A = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = 41.8, \qquad \sigma_B = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = 18.82$$
cov (A) = 83.6, cov (B) = 69.7
∴ cov (A) > cov (B) ⟹ B is more consistent.
$\bar{x}_B < \bar{x}_A$ ⟹ A is more run taker.

Example : 5

Find the standard deviation and coefficient of variation for the following data.
Marks 0 - 10 10 - 20 20 - 30 30 - 40 40 - 50 50 - 60 60 - 70 70 - 80
Students 12 18 35 42 50 45 20 08
Solution : [N = ∑f = 230, let a = 40, u = (x - 40)/10, $\bar{x}$ = 40.43,
σ = 17.26,
∴ cov (A) = $\dfrac{\sigma}{\bar{x}} \times 100 = \dfrac{17.26}{40.43} \times 100 = 42.69$]
Example : 6

Goals scored by team A and B in a football season were as below


Goals 0 1 2 3 4
Team A 27 9 8 5 4
Team B 17 9 6 5 3
Which team was more consistent 
Solution : Team A : $\bar{x}_A$ = 1.0566, σA = 1.309, cov (A) = 123.90
Team B : $\bar{x}_B$ = 1.2, σB = 1.307, cov (B) = 108.97
∴ cov (B) < cov (A), so Team B is more consistent.


Example : 7

Scores obtained by two batsmen A and B in 10 matches are given below. State, which batsman
has good average of runs and which player is more consistent.
A 30 66 60 34 20 38 44 62 80 46
B 34 70 55 48 45 30 46 38 60 34
Solution :
Player A : $\bar{x}_A$ = 48, σA = 17.64, cov (A) = 36.75
Player B : $\bar{x}_B$ = 46, σB = 12.107, cov (B) = 26.32
(i) Batsman A is more run taker, since $\bar{x}_A > \bar{x}_B$
(ii) Player B is more consistent, since cov (B) < cov (A).

Example : 8

Following table gives the marks obtained in a paper of mathematics out of 50 by the students of two divisions A and B.

| Class | 0 - 5 | 5 - 10 | 10 - 15 | 15 - 20 | 20 - 25 | 25 - 30 | 30 - 35 | 35 - 40 | 40 - 45 | 45 - 50 |
|---|---|---|---|---|---|---|---|---|---|---|
| A | 2 | 6 | 8 | 8 | 15 | 18 | 12 | 11 | 9 | 4 |
| B | 3 | 5 | 7 | 9 | 12 | 16 | 11 | 5 | 6 | 2 |

State which division has more variability.
Solution :
Division A: $\bar{x}_A$ = 26.854, σA = 11.173, cov (A) = 41.60
Division B: $\bar{x}_B$ = 24.934, σB = 10.927, cov (B) = 43.824
∴ Division B has greater variability and division A is more consistent.
Example : 9

Find the mean, standard deviation and coefficient of variation for the following data.
Marks obtained up to 10 20 30 40 50 60 70 80

No. of Students 12 30 65 107 157 202 222 230


Solution :
The given frequencies are cumulative, so first prepare the table using the class intervals 0 - 10, 10 - 20, 20 - 30, 30 - 40, --- as below
Marks 0 - 10 10 - 20 20 - 30 30 - 40 40 - 50 50 - 60 60 - 70 70 - 80
Students 12 18 35 42 50 45 20 08

∴ mean = $\bar{x}$ = 40.43, σ = 17.26, cov (A) = 42.69
3.11 Examples on Combined Mean :
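Let $\bar{x}_1$ and σ1 be the mean and standard deviation of a sample of size n1, and $\bar{x}_2$, σ2 be the mean and standard deviation of a sample of size n2.
Then
i) $\bar{x}_{12}$ = combined mean of two samples = $\dfrac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}$
ii) For three samples, with $\bar{x}_3$ the mean of a third sample of size n3, $\bar{x}_{123} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2 + n_3\bar{x}_3}{n_1 + n_2 + n_3}$
iii) σ12 = combined standard deviation = $\sqrt{\dfrac{n_1\sigma_1^2 + n_2\sigma_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}}$
where $d_1 = |\bar{x}_1 - \bar{x}_{12}|$, $d_2 = |\bar{x}_2 - \bar{x}_{12}|$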
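The combined mean and combined standard deviation formulae can be sketched in Python as follows. The function name `combined_mean_sd` is an illustrative assumption; the figures used are those of Example 2 of this section (two villages with 500 and 600 persons, means 186 and 175, variances 81 and 100).

```python
from math import sqrt

def combined_mean_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Combined mean and S.D. of two samples from their sizes, means and S.D.s."""
    n = n1 + n2
    mean12 = (n1 * mean1 + n2 * mean2) / n
    d1 = mean1 - mean12   # only the squares d1^2, d2^2 enter the formula
    d2 = mean2 - mean12
    variance12 = (n1 * sd1 ** 2 + n2 * sd2 ** 2 + n1 * d1 ** 2 + n2 * d2 ** 2) / n
    return mean12, sqrt(variance12)

mean12, sd12 = combined_mean_sd(500, 186, sqrt(81), 600, 175, sqrt(100))
print(round(mean12, 2), round(sd12, 2))
```

Note that the combined S.D. is larger than either individual S.D. whenever the two means differ, because the terms $n_1 d_1^2 + n_2 d_2^2$ add the spread between the groups.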

Example : 1

An analysis of monthly wages paid to workers in two firms A and B, belonging to the same domain of production, is as below.
A B
No. of Workers 500 600
Average salary 3000 3500
S. D. of distribution of wages 88 120
Determine :
i) Which firm pays larger on salaries
ii) Which firm has more consistency
iii) Which firm has greater variability
iv) What is average of salary if A and B taken together 
v) What is S. D. of individual worker if A and B taken together 
Solution :
i) Total monthly payment of A = 500 × 3000 = 15,00,000
Total monthly payment of B = 600 × 3500 = 21,00,000
Firm B pays the larger amount on salaries.
ii), iii) $\bar{x}_A$ = 3000, $\bar{x}_B$ = 3500, σA = 88, σB = 120
$$\operatorname{cov}(A) = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{88}{3000} \times 100 = 2.93$$
$$\operatorname{cov}(B) = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{120}{3500} \times 100 = 3.43$$
cov (A) < cov (B)
Firm B has greater variability and
Firm A is more consistent
iv) Average salary = combined mean of (A and B)
$$\bar{x}_{12} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} = \frac{500(3000) + 600(3500)}{500 + 600} = \frac{1500000 + 2100000}{1100} = \frac{3600000}{1100} = 3272.73 \approx 3273$$
v) $d_1 = |\bar{x}_1 - \bar{x}_{12}| = |3000 - 3273| = 273$
$d_2 = |\bar{x}_2 - \bar{x}_{12}| = |3500 - 3273| = 227$
$$\text{Combined S. D.} = \sigma_{12} = \sqrt{\frac{n_1\sigma_1^2 + n_2\sigma_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}}$$
$$= \sqrt{\frac{(500)(88)^2 + (600)(120)^2 + (500)(273)^2 + (600)(227)^2}{500 + 600}}$$
$$= \sqrt{\frac{(500 \times 7744) + (600 \times 14400) + (500)(74529) + (600)(51529)}{1100}}$$
$$= \sqrt{\frac{80693900}{1100}} = \sqrt{73358.09} = 270.85$$
Example : 2

The mean and variance of two villages are given below


Village x Village y
No. of persons 500 600
Average income 186 175
Variance of income 81 100
i) In which village the variation in income is greater
ii) What is the combined S. D. it x, y are taken together
Solution :
i) Given $\bar{x}$ = 186, N1 = 500, $\bar{y}$ = 175, N2 = 600
σx² = 81, σx = 9, σy² = 100, σy = 10
We know σ² = variance and √variance = σ = S. D.
$$\therefore \operatorname{cov}(x) = \frac{\sigma_x}{\bar{x}} \times 100 = \frac{9}{186} \times 100 = 4.839$$
$$\operatorname{cov}(y) = \frac{\sigma_y}{\bar{y}} \times 100 = \frac{10}{175} \times 100 = 5.714$$
∴ cov (x) < cov (y)
∴ Variation in income of village y is greater.
ii) $$\bar{x}_{12} = \frac{n_1\bar{x} + n_2\bar{y}}{n_1 + n_2} = \frac{(500)(186) + (600)(175)}{500 + 600} = 180$$
$d_1 = |\bar{x} - 180| = |186 - 180| = 6$
$d_2 = |\bar{y} - 180| = |175 - 180| = 5$
$$\text{Combined S. D.} = \sigma_{12} = \sqrt{\frac{n_1\sigma_1^2 + n_2\sigma_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}} = \sqrt{\frac{40500 + 60000 + 18000 + 15000}{1100}} = 11.02$$
Example : 3

An analysis of wages paid to workers in 02 firms A and B is as below


A B
No. of workers 550 650
Average monthly salary Rs 1450 Rs 1400
S. D. of distribution 100 140
Calculate the monthly average wages and combined S. D. by taking two firms A and B
together.
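Solution :
N1 = 550, N2 = 650, $\bar{x}_A$ = 1450, $\bar{x}_B$ = 1400, σ1 = 100, σ2 = 140
$$\text{Combined mean} = \bar{x}_{12} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} = \frac{(550)(1450) + (650)(1400)}{1200} = 1422.92$$
d1 = 27.08, d2 = 22.92
$$\therefore \sigma_{12} = \sqrt{\frac{n_1 d_1^2 + n_2 d_2^2 + n_1\sigma_1^2 + n_2\sigma_2^2}{n_1 + n_2}} = 125.78$$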
Example : 4

The number of employees wages per employee and variance of the wages per employee
are given below for two factories A and B.
A B
No. of employees 100 150
Average wages of employee 3200 2800
Variance of wages 625 729


i) Which factory have greater variation in wages


ii) What is average salary/wages if A and B are taken together.
Solution :
N1 = 100, N2 = 150, $\bar{x}_1$ = 3200, $\bar{x}_2$ = 2800, σ1 = √625 = 25, σ2 = √729 = 27
i) cov (A) = $\dfrac{25}{3200} \times 100 = 0.781$, cov (B) = $\dfrac{27}{2800} \times 100 = 0.964$
∴ Factory B has greater variation in monthly wages
ii) Combined mean = $\bar{x}_{12} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} = \dfrac{320000 + 420000}{250} = \dfrac{740000}{250} = 2960$
3.12 Formulae on Moments and Examples :
The arithmetic mean of the various powers of the deviations $(x_i - \bar{x})$ is called the moment of the distribution.
I) r-th moment of variable x about the mean $\bar{x}$:
$$\mu_r = \frac{1}{N}\sum (x_i - \bar{x})^r, \text{ for ungrouped data}$$
$$\mu_r = \frac{1}{N}\sum f_i (x_i - \bar{x})^r, \text{ for a frequency distribution}$$
II) r-th moment about any value "a":
$$\mu'_r = \frac{1}{N}\sum f_i d_i^r, \quad d_i = x_i - a$$
$$= h^r \frac{1}{N}\sum f_i u_i^r, \text{ where } u_i = \frac{x_i - a}{h}$$
Note :
1. $\mu_0 = \dfrac{\sum f_i (x_i - \bar{x})^0}{N} = \dfrac{\sum f_i}{N} = 1$, ∴ μ0 = 1
2. $\mu_1 = \dfrac{\sum f_i (x_i - \bar{x})}{N} = \dfrac{\sum f_i x_i}{N} - \bar{x}\,\dfrac{\sum f_i}{N} = \bar{x} - \bar{x} = 0$, ∴ μ1 = 0
III) Relations between $\mu_r$ and $\mu'_r$ (central moments in terms of moments about a):
a) $\mu_0 = 1$
b) $\mu_1 = 0$
c) $\mu_2 = \mu'_2 - (\mu'_1)^2$
d) $\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3$
e) $\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4$
IV) Relations between $\mu'_r$ and $\mu_r$ (moments about a in terms of central moments):
a) $\mu'_0 = 1$
b) $\mu'_1 = \bar{x} - a$
c) $\mu'_2 = \mu_2 + (\mu'_1)^2$
d) $\mu'_3 = \mu_3 + 3\mu_2\mu'_1 + (\mu'_1)^3$
e) $\mu'_4 = \mu_4 + 4\mu_3\mu'_1 + 6\mu_2(\mu'_1)^2 + (\mu'_1)^4$
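The conversion from moments about an arbitrary value a to moments about the mean is mechanical, so it is easy to check by machine. The sketch below is illustrative only: the function name `central_from_raw` is an assumption, and β1 and β2 are computed as $\mu_3^2/\mu_2^3$ and $\mu_4/\mu_2^2$, as in the examples that follow. The input values are those of Example 1 below (a = 4).

```python
def central_from_raw(m1, m2, m3, m4):
    """Central moments (mu2, mu3, mu4) from the first four moments about a."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4

a = 4
raw = (-1.5, 17, -30, 108)          # moments about a
mu2, mu3, mu4 = central_from_raw(*raw)
mean = a + raw[0]                   # x_bar = a + first moment about a
beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2
print(mu2, mu3, mu4, mean, round(beta1, 4), round(beta2, 4))
```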

Illustrative Examples

Example : 1

The first 04 moments about "4" of a variable are - 1.5, 17, - 30, 108. Find the first 04 moments about the mean and hence β1, β2, $\bar{x}$, σ, variance.

Solution :
Given a = 4, μ′1 = - 1.5, μ′2 = 17, μ′3 = - 30, μ′4 = 108
μ1 = 0
μ2 = μ′2 - (μ′1)² = 17 - (- 1.5)² = 14.75
μ3 = μ′3 - 3μ′2μ′1 + 2(μ′1)³ = 39.75
μ4 = μ′4 - 4μ′3μ′1 + 6μ′2(μ′1)² - 3(μ′1)⁴ = 142.3125
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(39.75)^2}{(14.75)^3} = 0.4924$$
$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{142.31}{(14.75)^2} = 0.6541$$
$\bar{x}$ = a + μ′1 = 4 + (- 1.5) = 2.5
σ = √μ2 = √14.75 = 3.84
Variance = σ² = μ2 = 14.75


Example : 2

The first four moments of a distribution about the working mean 44.5 are -0.4, 2.99, -0.08 and 27.63. Calculate the moments about the mean, $\beta_1$, $\beta_2$, the mean, the S.D. and the variance.
Solution :
$\mu_1 = 0$, $\mu_2 = 2.83$, $\mu_3 = 3.38$, $\mu_4 = 30.295$, $\beta_1 = 0.504$, $\beta_2 = 3.782$,
Mean $\bar{x} = a + \mu'_1 = 44.5 + (-0.4) = 44.1$, variance $= \sigma^2 = \mu_2 = 2.83$, $\sigma = 1.682$

Example : 3

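Find the coefficients of skewness and kurtosis if the first four moments about 5 are 2, 20, 40 and 50. Hence find $\bar{x}$, $\sigma$ and the variance.
Solution :
$\mu_1 = 0$, $\mu_2 = 20 - 2^2 = 16$, $\mu_3 = 40 - 3(20)(2) + 2(2)^3 = -64$, $\mu_4 = 50 - 4(40)(2) + 6(20)(2)^2 - 3(2)^4 = 162$
$\bar{x} = 5 + 2 = 7$, $\sigma = \sqrt{\mu_2} = 4$, variance $= \sigma^2 = 16$
$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{4096}{4096} = 1$, $\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{162}{256} = 0.6328$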

Example : 4

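Find $\mu_1$, $\mu_2$, $\mu_3$, $\mu_4$, $\beta_1$, $\beta_2$, $\sigma$, $\sigma^2$, $\bar{x}$ if
$a = 2$, $\mu'_1 = 1$, $\mu'_2 = 2.5$, $\mu'_3 = 5.5$, $\mu'_4 = 16$

Solution :
$\bar{x} = a + \mu'_1 = 3$, $\mu_1 = 0$, $\mu_2 = 1.5$, $\mu_3 = 0$, $\mu_4 = 6$
$\sigma = \sqrt{\mu_2} = \sqrt{1.5} = 1.225$, $\sigma^2 = 1.5$
$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{0}{(1.5)^3} = 0$
$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{6}{(1.5)^2} = \frac{6}{2.25} = 2.67$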

Example : 5

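The first three moments of a distribution about 2 are 1, 16 and -40. Find the first three moments about the mean and hence find $\bar{x}$, $\sigma$, $\beta_1$ and $\sigma^2$.

Solution :
$\mu_1 = 0$, $\mu_2 = 16 - 1 = 15$, $\mu_3 = -40 - 3(16)(1) + 2(1)^3 = -86$
$\bar{x} = 2 + 1 = 3$, $\sigma = \sqrt{15} = 3.873$, $\beta_1 = \frac{(-86)^2}{(15)^3} = 2.191$, $\sigma^2 = 15$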
Example : 6

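The first four moments of a distribution about the working mean 30.5 are 0.0375, 0.4546, 0.0609 and 0.5074. Calculate the moments about the mean and hence find $\beta_1$, $\beta_2$, $\bar{x}$ and $\sigma$.

Solution :
$\mu_1 = 0$
$\mu_2 = 0.4546 - (0.0375)^2 = 0.4532$
$\mu_3 = 0.0609 - 3(0.4546)(0.0375) + 2(0.0375)^3 = 0.0099$
$\mu_4 = 0.5074 - 4(0.0609)(0.0375) + 6(0.4546)(0.0375)^2 - 3(0.0375)^4 = 0.5021$
$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = 1.045 \times 10^{-3}$, $\beta_2 = \frac{\mu_4}{\mu_2^2} = 2.445$
$\bar{x} = 30.5 + 0.0375 = 30.5375$, $\sigma = \sqrt{0.4532} = 0.6732$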


Example : 7

The first four moments of a distribution about the working mean 44.5 are -0.4, 2.99, -0.08 and 27.63. Calculate the moments about the mean, $\beta_1$, $\beta_2$ and $\bar{x}$.

Solution :
$a = 44.5$, $\mu_1 = 0$, $\mu_2 = 2.83$, $\mu_3 = 3.38$, $\mu_4 = 30.295$, $\beta_1 = 0.504$, $\beta_2 = 3.782$,
$\bar{x} = 44.5 + (-0.4) = 44.1$
Example : 8

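Find the coefficients of skewness and kurtosis from moments taken about the point 48, if $d = x - 48$, $\sum f = 100$, $\sum fd = 50$, $\sum fd^2 = 1970$, $\sum fd^3 = 2948$, $\sum fd^4 = 86752$.
Solution :
We know
(i) $\mu'_1 = \frac{\sum fd}{\sum f} = 0.5$
(ii) $\mu'_2 = \frac{\sum fd^2}{\sum f} = 19.7$
(iii) $\mu'_3 = \frac{\sum fd^3}{\sum f} = 29.48$
(iv) $\mu'_4 = \frac{\sum fd^4}{\sum f} = 867.52$
$\mu_1 = 0$, $\mu_2 = 19.45$, $\mu_3 = 0.18$, $\mu_4 = 837.92$
$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(0.18)^2}{(19.45)^3} = \frac{0.0324}{7357.98} = 4.40 \times 10^{-6}$
$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{837.92}{(19.45)^2} = 2.21$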
Example : 9

Calculate the first four moments about the mean and also find the values of $\beta_1$ and $\beta_2$ from the following distribution.
Marks      0 - 10   10 - 20   20 - 30   30 - 40   40 - 50   50 - 60   60 - 70
Students   8        12        20        30        15        10        5
Solution :
First we prepare the table with assumed mean $a = 35$, $h = 10$ and $u = \frac{x - 35}{10}$.
Marks     Midpoint (x)   f     u    fu    u^2   fu^2   u^3   fu^3   u^4   fu^4
0 - 10    5              8    -3   -24    9     72    -27   -216    81    648
10 - 20   15             12   -2   -24    4     48     -8    -96    16    192
20 - 30   25             20   -1   -20    1     20     -1    -20     1     20
30 - 40   35             30    0     0    0      0      0      0     0      0
40 - 50   45             15    1    15    1     15      1     15     1     15
50 - 60   55             10    2    20    4     40      8     80    16    160
60 - 70   65             5     3    15    9     45     27    135    81    405
Total     -              100   -   -18    -    240      -   -102     -   1440
$\mu'_1 = h\,\frac{\sum fu}{\sum f} = 10 \times \frac{-18}{100} = -1.8$
$\mu'_2 = h^2\,\frac{\sum fu^2}{\sum f} = 100 \times \frac{240}{100} = 240$
$\mu'_3 = h^3\,\frac{\sum fu^3}{\sum f} = 1000 \times \frac{-102}{100} = -1020$
$\mu'_4 = h^4\,\frac{\sum fu^4}{\sum f} = 10000 \times \frac{1440}{100} = 144000$
$\mu_1 = 0$, $\mu_2 = 236.76$, $\mu_3 = 264.336$, $\mu_4 = 141290.11$,
$\beta_1 = 0.005$, $\beta_2 = 2.52$.
Example : 10

From the following data, calculate the moments about (i) the assumed mean 25 (ii) the actual mean.
Class       0 - 10   10 - 20   20 - 30   30 - 40
Frequency   1        3         4         2
Solution :
Prepare a table with $h = 10$, $a = 25$, $u = \frac{x - 25}{10}$. It gives $\sum f = 10$,
$\sum fu = -3$, $\sum fu^2 = 9$, $\sum fu^3 = -9$, $\sum fu^4 = 21$
$\mu'_1 = h\,\frac{\sum fu}{\sum f} = -3$, $\mu'_2 = h^2\,\frac{\sum fu^2}{\sum f} = 90$,
$\mu'_3 = h^3\,\frac{\sum fu^3}{\sum f} = -900$, $\mu'_4 = h^4\,\frac{\sum fu^4}{\sum f} = 21000$
$\mu_1 = 0$, $\mu_2 = 81$, $\mu_3 = -144$, $\mu_4 = 14817$.
Example : 11

From the following frequency distribution table calculate $\mu_1$, $\mu_2$, $\mu_3$, $\mu_4$, $\sigma$, $\beta_1$, $\beta_2$.
x   2   3   4   5   6
f   1   3   7   3   1
Solution :
We have $\bar{x} = \frac{\sum fx}{\sum f} = \frac{2 + 9 + 28 + 15 + 6}{15} = \frac{60}{15} = 4$ = actual mean
Prepare a table with $u = x - 4$ and $h = 1$. Because u is measured from the actual mean, the moments obtained are the moments about the mean.
x       f    u = x - 4   fu   u^2   fu^2   u^3   fu^3   u^4   fu^4
2       1    -2          -2   4     4      -8    -8     16    16
3       3    -1          -3   1     3      -1    -3     1     3
4       7    0           0    0     0      0     0      0     0
5       3    1           3    1     3      1     3      1     3
6       1    2           2    4     4      8     8      16    16
Total   15   -           0    -     14     -     0      -     38
$\mu_1 = h\,\frac{\sum fu}{\sum f} = 0$
$\mu_2 = h^2\,\frac{\sum fu^2}{\sum f} = \frac{14}{15} = 0.933$
$\mu_3 = h^3\,\frac{\sum fu^3}{\sum f} = 0$
$\mu_4 = h^4\,\frac{\sum fu^4}{\sum f} = \frac{38}{15} = 2.533$
$\sigma = \sqrt{\mu_2} = \sqrt{0.933} = 0.966$,
$\beta_1 = 0$, $\beta_2 = \frac{2.533}{(0.933)^2} = 2.91$
3.13 Examples on Correlation :
1. The distribution of a single variate x is called a univariate distribution.
2. A distribution involving two variates is called a bivariate distribution.
3. If a change in one variate x affects the other variate y, then x and y are said to be correlated.
4. If an increase in x increases y, the correlation is positive.
5. If a decrease in x decreases y, the correlation is positive.
6. If an increase (decrease) in x decreases (increases) y, the correlation is negative.
7. If x and y are the only two variates and $\frac{y}{x}$ = constant, the correlation is called linear or perfect; otherwise it is called nonlinear correlation.
8. The formula that measures the intensity or degree of linear relationship between two variates was developed by Karl Pearson and is called the correlation coefficient.
9. Correlation coefficient of x, y: $r = r(x, y) = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y}$
where cov (x, y) = covariance of x, y $= \frac{\sum xy}{n} - \left(\frac{\sum x}{n}\right)\left(\frac{\sum y}{n}\right)$
Equivalently, $r = r(x, y) = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$
10. cov (x, y) = cov (y, x)
11. cov (x, x) = variance of x $= \sigma_x^2$
12. r always lies between -1 and +1, i.e. $-1 \le r \le 1$
13. If r = 0 there is no linear relationship between x and y.
14. If $r = \pm 1$ the linear relationship between x and y is perfect.
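The sum formula for r in item 9 translates directly into code. The following is a minimal sketch with an illustrative function name; the data are those of Example 2 of this section (x = 9, 8, ..., 1).

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient from the sums n, Σx, Σy, Σx², Σy², Σxy."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sxy - sx * sy
    denominator = math.sqrt(n * sxx - sx ** 2) * math.sqrt(n * syy - sy ** 2)
    return numerator / denominator


xs = [9, 8, 7, 6, 5, 4, 3, 2, 1]
ys = [15, 16, 14, 13, 11, 12, 10, 8, 9]
r = pearson_r(xs, ys)
print(round(r, 2))
```

Because r is unchanged when a constant is subtracted from every x or every y, the same function can be applied to the reduced values X = x - A, Y = y - B used in the tables of the worked examples.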

Illustrative Examples

Example : 1

Calculate the correlation coefficient between x and y from the following data.
x   78    89    99    60    59    79    68    61
y   125   137   156   112   107   136   123   108
Solution : First we prepare the table with $X = x - 69$ and $Y = y - 112$.
x       X = x - 69   X^2    y     Y = y - 112   Y^2    XY
78      9            81     125   13            169    117
89      20           400    137   25            625    500
99      30           900    156   44            1936   1320
60      -9           81     112   0             0      0
59      -10          100    107   -5            25     50
79      10           100    136   24            576    240
68      -1           1      123   11            121    -11
61      -8           64     108   -4            16     32
Total   41           1727   -     108           3468   2248
We have n = 8
$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}}$
$= \frac{8(2248) - (41)(108)}{\sqrt{8(1727) - (41)^2}\,\sqrt{8(3468) - (108)^2}}$
$= \frac{13556}{13968.92} = 0.97$
Note:
(Here 69 and 112 are chosen arbitrarily; subtracting them reduces the size of the given values and so simplifies the calculation.)
Example : 2

Calculate the correlation coefficient of x, y from the following data.
x   9    8    7    6    5    4    3    2   1
y   15   16   14   13   11   12   10   8   9
Solution :
We have n = 9
Prepare the table
x       y     x^2   y^2    xy
9       15    81    225    135
8       16    64    256    128
7       14    49    196    98
6       13    36    169    78
5       11    25    121    55
4       12    16    144    48
3       10    9     100    30
2       8     4     64     16
1       9     1     81     9
Total   45    108   285    1356   597
(The totals are $\sum x = 45$, $\sum y = 108$, $\sum x^2 = 285$, $\sum y^2 = 1356$, $\sum xy = 597$.)
$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$
$= \frac{9(597) - (45)(108)}{\sqrt{9(285) - (45)^2}\,\sqrt{9(1356) - (108)^2}}$
$= \frac{513}{540} = 0.95$
Example : 3

The following data give the year-wise production of two items x and y. Calculate the correlation coefficient of x and y.
Year   2002   2003   2004   2005   2006   2007   2008   2009
x      100    102    104    107    105    112    103    99
y      15     12     13     11     12     12     19     26
Solution :
We have $\bar{x} = \frac{100 + 102 + 104 + 107 + 105 + 112 + 103 + 99}{8} = \frac{832}{8} = 104$
$\bar{y} = \frac{15 + 12 + 13 + 11 + 12 + 12 + 19 + 26}{8} = \frac{120}{8} = 15$
First we prepare the table with $X = x - 104$ and $Y = y - 15$.
x       X = x - 104   X^2   y     Y = y - 15   Y^2   XY
100     -4            16    15    0            0     0
102     -2            4     12    -3           9     6
104     0             0     13    -2           4     0
107     3             9     11    -4           16    -12
105     1             1     12    -3           9     -3
112     8             64    12    -3           9     -24
103     -1            1     19    4            16    -4
99      -5            25    26    11           121   -55
Total   0             120   120   0            184   -92
$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}}$
$= \frac{8(-92) - 0}{\sqrt{8(120) - 0}\,\sqrt{8(184) - 0}}$
$= \frac{-92}{\sqrt{120}\,\sqrt{184}} = -0.619$


Example : 4

Calculate the correlation coefficient from the following data.
x   48   35   17   23   47
y   45   20   40   25   45
Solution :
Hint - $\bar{x} = 34$, $\bar{y} = 35$, $X = x - 34$, $Y = y - 35$,
$\sum X = 0$, $\sum Y = 0$, $\sum X^2 = 776$, $\sum Y^2 = 550$, $\sum XY = 280$
$r = \frac{280}{\sqrt{776}\,\sqrt{550}} = 0.429$
Example : 5

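Two examiners A and B awarded marks to seven students as below. Find the correlation coefficient between the two sets of marks.
Marks by A   40   44   28   30   45   38   31
Marks by B   32   39   26   30   28   34   27
Solution :
n = 7. Let $X = x - 30$, $Y = y - 30$. Then $\sum X = 46$, $\sum Y = 6$,
$\sum XY = 153$, $\sum X^2 = 590$, $\sum Y^2 = 4 + 81 + 16 + 0 + 4 + 16 + 9 = 130$
$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}} = \frac{7(153) - (46)(6)}{\sqrt{7(590) - (46)^2}\,\sqrt{7(130) - (6)^2}} = \frac{795}{1326.74} = 0.599$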
Example : 6

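Find the correlation coefficient of x and y if
n = 50, $\sum X = \sum (x - 40) = 30$, $\sum Y = \sum (y - 20) = 70$, $\sum X^2 = 170$, $\sum Y^2 = 165$, $\sum XY = 140$
Solution :
$r = \frac{50(140) - (30)(70)}{\sqrt{50(170) - (30)^2}\,\sqrt{50(165) - (70)^2}} = \frac{7000 - 2100}{\sqrt{8500 - 900}\,\sqrt{8250 - 4900}}$
$= \frac{4900}{\sqrt{7600}\,\sqrt{3350}} = \frac{4900}{5045.79} = 0.971$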
Example : 7

Find the correlation coefficient of x and y if
n = 25, $\sum x = 100$, $\sum y = 125$, $\sum x^2 = 250$, $\sum y^2 = 500$, $\sum xy = 212$
Solution :
$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$
$= \frac{25(212) - (100)(125)}{\sqrt{6250 - (100)^2}\,\sqrt{12500 - (125)^2}} = 0.9646$
Example : 8

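Compute the coefficient of correlation from the following data.
x   10   14   18   22   26   30
y   18   12   24   6    30   36
Solution :
$\bar{x} = \frac{120}{6} = 20$, $\bar{y} = \frac{126}{6} = 21$
$X = x - 20$, $Y = y - 21$, $\sum X^2 = 280$, $\sum Y^2 = 630$, $\sum XY = 252$
$r = \frac{252}{\sqrt{280}\,\sqrt{630}} = \frac{252}{420} = 0.60$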

Exercise No. 3

Example : 1

Calculate the correlation coefficient for the following data


x 2 4 5 6 8 11
y 18 12 10 8 7 6
Solution : r = -0.89
Example : 2

Calculate the correlation coefficient from the following data.


Maths (x) 23 28 42 17 26 35 29 37 16 46
Stat (y) 25 22 38 21 27 39 24 32 18 44
Solution : Take $X = x - 35$, $Y = y - 39$; then r = 0.922
3.14 Examples on Lines of Regressions :
The line of best fit for the given distribution is called the line of regression. There are two such lines.
(i) Line of regression of y on x :
$y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x})$
$y - \bar{y} = b_{yx}\,(x - \bar{x})$
$y = b_{yx}\,x + c$
where $\bar{x}$ = mean value of x, $\bar{y}$ = mean value of y,
$b_{yx} = r\,\frac{\sigma_y}{\sigma_x}$ = regression coefficient of y on x,
and r is the correlation coefficient of x and y.
(ii) Line of regression of x on y :
$x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y})$
$x - \bar{x} = b_{xy}\,(y - \bar{y})$
$x = b_{xy}\,y + k$
where $b_{xy} = r\,\frac{\sigma_x}{\sigma_y}$ = regression coefficient of x on y.
Note :
1. $b_{xy}\,b_{yx} = r\,\frac{\sigma_x}{\sigma_y} \cdot r\,\frac{\sigma_y}{\sigma_x} = r^2$, so $r^2 = b_{xy}\,b_{yx}$
2. The correlation coefficient r is the geometric mean of the two regression coefficients; r takes the common sign of $b_{xy}$ and $b_{yx}$.
3. The point $(\bar{x}, \bar{y})$ always lies on both lines of regression.
4. $b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y} \cdot \frac{\sigma_y}{\sigma_x} = \frac{\mathrm{cov}(x, y)}{\sigma_x^2}$
5. $b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y} \cdot \frac{\sigma_x}{\sigma_y} = \frac{\mathrm{cov}(x, y)}{\sigma_y^2}$
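Notes 4 and 5 give a direct way to compute both regression coefficients from data, since the common factor 1/n cancels between covariance and variance. The sketch below is an illustration under that observation; the function name and return layout are assumptions, and the data are those of Example 7 of this section.

```python
def regression_coefficients(xs, ys):
    """Return (x_bar, y_bar, b_yx, b_xy) using b_yx = ΣXY/ΣX², b_xy = ΣXY/ΣY²,
    where X and Y are deviations from the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return x_bar, y_bar, sxy / sxx, sxy / syy


xs = [6, 2, 10, 4, 8]
ys = [9, 11, 5, 8, 7]
x_bar, y_bar, b_yx, b_xy = regression_coefficients(xs, ys)

# Line of y on x:  y = y_bar + b_yx * (x - x_bar)
# Line of x on y:  x = x_bar + b_xy * (y - y_bar)
y_at_20 = y_bar + b_yx * (20 - x_bar)
x_at_10 = x_bar + b_xy * (10 - y_bar)
r_squared = b_yx * b_xy
print(b_yx, b_xy, y_at_20, x_at_10, r_squared)
```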

Illustrative Examples

Example : 1

Two regression equations of the variables x and y are given by
x = 19.13 - 0.87y and
y = 11.64 - 0.50x
Find $\bar{x}$, $\bar{y}$ and the correlation coefficient.
Solution :
We know that $(\bar{x}, \bar{y})$ satisfies both lines of regression:
$\bar{x} = 19.13 - 0.87\,\bar{y}$
$\bar{y} = 11.64 - 0.50\,\bar{x}$
Substituting the second equation into the first gives $0.565\,\bar{x} = 9.0032$, so
$\bar{x} = 15.935$, $\bar{y} = 3.67$
We have $b_{xy} = -0.87$, $b_{yx} = -0.50$
We know $r^2 = b_{xy}\,b_{yx}$
$r^2 = (-0.87)(-0.50) = 0.435$
Both regression coefficients are negative, so r is negative: $r = -0.66$

Example : 2

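The equations of the lines of regression are
2x = 8 - 3y, 2y = 5 - x
Hence find $\bar{x}$, $\bar{y}$ and r.
Solution :
$\bar{x} = 1$, $\bar{y} = 2$, $b_{xy} = -\frac{3}{2}$, $b_{yx} = -\frac{1}{2}$, $r^2 = \frac{3}{4}$, and since both coefficients are negative, $r = -0.866$.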

Example : 3

The equations of the lines of regression are
x - 0.9075y + 41.1375 = 0
0.48x - y + 67.72 = 0
Find $\bar{x}$, $\bar{y}$ and r.
Solution :
$\bar{x} = 36$, $\bar{y} = 85$, $r = \sqrt{(0.9075)(0.48)} = 0.66$
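Examples 1 to 3 all follow the same recipe: solve the two regression lines simultaneously for the means, then take r from the product of the regression coefficients. A minimal sketch follows, assuming the lines are already written as x = c1 + b_xy·y and y = c2 + b_yx·x; the function and parameter names are illustrative.

```python
import math


def means_and_r(c1, b_xy, c2, b_yx):
    """Means and correlation coefficient from the two regression lines
    x = c1 + b_xy * y  (x on y)  and  y = c2 + b_yx * x  (y on x)."""
    product = b_xy * b_yx
    if not 0 <= product < 1:
        # r^2 must lie in [0, 1]; otherwise the lines were assigned wrongly
        raise ValueError("b_xy * b_yx must lie in [0, 1)")
    # Substitute the second line into the first and solve for x_bar
    x_bar = (c1 + b_xy * c2) / (1 - product)
    y_bar = c2 + b_yx * x_bar
    # r carries the common sign of the two regression coefficients
    r = math.copysign(math.sqrt(product), b_yx)
    return x_bar, y_bar, r


# Example 3: x = 0.9075y - 41.1375 and y = 0.48x + 67.72
x_bar, y_bar, r = means_and_r(-41.1375, 0.9075, 67.72, 0.48)
print(round(x_bar, 2), round(y_bar, 2), round(r, 2))
```

The range check mirrors the manual test used when it is not stated which line is which: if the product of the assumed coefficients exceeds 1, the two roles must be swapped.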
Example : 4

Find the correlation coefficient between x and y if the two lines of regression are
2x - 9y + 6 = 0, x - 2y + 1 = 0
Solution :
Let the line of regression of x on y be 2x - 9y + 6 = 0
$\therefore x = \frac{9}{2}y - 3$
$\therefore b_{xy} = \frac{9}{2}$
Let the line of regression of y on x be x - 2y + 1 = 0
$\therefore y = \frac{1}{2}x + \frac{1}{2}$
$\therefore b_{yx} = \frac{1}{2}$
We know $r^2 = b_{xy}\,b_{yx} = \frac{9}{2} \cdot \frac{1}{2} = \frac{9}{4}$
$\therefore r = \frac{3}{2} > 1$, which is impossible since $-1 \le r \le 1$.
Hence this assignment of the regression lines was wrong.
Let the line of regression of x on y be x - 2y + 1 = 0
$\therefore x = 2y - 1$
$\therefore b_{xy} = 2$
Let the line of regression of y on x be 2x - 9y + 6 = 0
$\therefore y = \frac{2}{9}x + \frac{2}{3}$
$\therefore b_{yx} = \frac{2}{9}$
$\therefore r^2 = b_{xy}\,b_{yx} = (2)\left(\frac{2}{9}\right) = \frac{4}{9}$
Both regression coefficients are positive, so $r = +\frac{2}{3}$

Example : 5

If $\bar{x} = 36$, $\bar{y} = 85$, $\sigma_x = 11$, $\sigma_y = 8$, $r = 0.66$,
then find the lines of regression of x on y and y on x.
Solution :
1. The line of x on y is $x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y})$
$x - 36 = (0.66)\left(\frac{11}{8}\right)(y - 85)$
$\therefore x = -41.1375 + 0.9075\,y$
2. The line of y on x is $y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x})$
$y - 85 = (0.66)\left(\frac{8}{11}\right)(x - 36)$
$\therefore y = 67.72 + 0.48\,x$

Example : 6

From the following data, obtain
(i) the two regression lines (ii) the correlation coefficient (iii) the value of y at x = 6.2
x   1   2   3    4    5    6    7    8    9
y   9   8   10   12   11   13   14   16   15
Solution :
$\bar{x} = \frac{1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9}{9} = \frac{45}{9} = 5$,
$\bar{y} = \frac{9 + 8 + 10 + 12 + 11 + 13 + 14 + 16 + 15}{9} = \frac{108}{9} = 12$
Construct a table with $X = x - 5$ and $Y = y - 12$.
x       X = x - 5   X^2   y    Y = y - 12   Y^2   XY
1       -4          16    9    -3           9     12
2       -3          9     8    -4           16    12
3       -2          4     10   -2           4     4
4       -1          1     12   0            0     0
5       0           0     11   -1           1     0
6       1           1     13   1            1     1
7       2           4     14   2            4     4
8       3           9     16   4            16    12
9       4           16    15   3            9     12
Total   0           60    -    0            60    57
i) We know $r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}} = \frac{9(57)}{\sqrt{9(60)}\,\sqrt{9(60)}} = \frac{57}{60} = 0.95$
ii) $\sigma_x^2 = \frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2 = \frac{60}{9} - 0 = \frac{60}{9}$
iii) $\sigma_y^2 = \frac{\sum Y^2}{n} - \left(\frac{\sum Y}{n}\right)^2 = \frac{60}{9}$
iv) The regression line of y on x is
$y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x})$
$y - 12 = (0.95)\,\frac{\sqrt{60/9}}{\sqrt{60/9}}\,(x - 5)$
$\therefore y = 0.95x + 7.25$
v) The regression line of x on y is
$x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y})$
$x - 5 = (0.95)\,\frac{\sqrt{60/9}}{\sqrt{60/9}}\,(y - 12)$
$\therefore x = 0.95y - 6.4$
vi) $y(6.2) = (0.95)(6.2) + 7.25 = 13.14$
Example : 7

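From the following data, obtain the regression lines.
x   6   2    10   4   8
y   9   11   5    8   7
Hence find x at y = 10 and y at x = 20.
Solution :
$\bar{x} = \frac{30}{5} = 6$, $\bar{y} = \frac{40}{5} = 8$
Let $X = x - 6$, $Y = y - 8$. Then $\sum X = 0$, $\sum Y = 0$, $\sum X^2 = 40$, $\sum Y^2 = 20$, $\sum XY = -26$
$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}} = \frac{5(-26)}{\sqrt{5(40)}\,\sqrt{5(20)}}$
$= \frac{-26}{\sqrt{800}} = \frac{-26}{28.28} = -0.92$
$b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = \frac{\sum XY}{\sum Y^2} = -1.3$, $b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = \frac{\sum XY}{\sum X^2} = -0.65$
i) The regression line of y on x is $y - 8 = (-0.65)(x - 6)$
$y = 11.9 - 0.65x$
$\therefore y(20) = 11.9 - 13 = -1.1$
ii) The regression line of x on y is $x - 6 = (-1.3)(y - 8)$
$x = 16.4 - 1.3y$
$\therefore x(10) = 16.4 - 13 = 3.4$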

Example : 8

Calculate the regression lines of x on y and y on x if the following table gives the aptitude test scores (x) and productivity (y) of 10 workers selected at random.
Score          60   62   65   70   72   48   53   73   65   82
Productivity   68   60   62   80   85   40   52   62   60   81
Solution :
We have $\bar{x} = \frac{650}{10} = 65$, $\bar{y} = \frac{650}{10} = 65$
Let $X = x - 65$, $Y = y - 65$. Then $\sum X = 0$, $\sum Y = 0$, $\sum X^2 = 894$, $\sum Y^2 = 1752$, $\sum XY = 1044$
$b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = \frac{\sum XY}{\sum Y^2} = \frac{1044}{1752} = 0.596$
$b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = \frac{\sum XY}{\sum X^2} = \frac{1044}{894} = 1.168$
$\therefore$ The regression line of x on y is $x = 26.26 + 0.596\,y$
$\therefore$ The regression line of y on x is $y = -10.92 + 1.168\,x$
Example : 9

Obtain the regression line of y on x, and the value of y at x = 70, from the following table.
x   40   50   38   60   65   50   35
y   38   60   55   70   60   48   30
Solution :
We have $\bar{x} = \frac{\sum x}{n} = \frac{338}{7} = 48.29$,
$\bar{y} = \frac{\sum y}{n} = \frac{361}{7} = 51.57$
Let $X = x - 48$, $Y = y - 50$. Then $\sum X^2 = 774$, $\sum Y^2 = 1173$, $\sum XY = 732$, $\sum X = 2$, $\sum Y = 11$,
$b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2} = \frac{7(732) - (2)(11)}{7(774) - (2)^2} = 0.942$
The regression line of y on x is $y - \bar{y} = b_{yx}\,(x - \bar{x})$
$y - 51.57 = (0.942)(x - 48.29)$
$\therefore y = 0.942x + 6.08$
$\therefore y(x = 70) = (0.942)(70) + 6.08 = 65.94 + 6.08 = 72.02$

Exercise No. 4

Ex. 1 Find $\bar{x}$, $\bar{y}$ and r from the following lines of regression.
i) y = 0.516x + 33.73, x = 0.512y + 32.52
[Ans: $\bar{x} = 67.67$, $\bar{y} = 68.65$, r = 0.514]
ii) x = -0.4y + 6.4, y = -0.6x + 4.6
[Ans: $\bar{x} = 6$, $\bar{y} = 1$, r = -0.49]
iii) x = 19.13 - 0.87y, y = 11.64 - 0.50x
[Ans: $\bar{x} = 15.935$, $\bar{y} = 3.67$, r = -0.66]
iv) 3x + 2y = 26, 6x + y = 31
[Ans: $\bar{x} = 4$, $\bar{y} = 7$, r = -0.5]
v) 20x - 9y - 107 = 0, 4x - 5y + 33 = 0
[Ans: $\bar{x} = 13$, $\bar{y} = 17$, r = 0.6]
vi) 7x - 16y + 9 = 0, 5y - 4x - 3 = 0
[Ans: $\bar{x} = -\frac{3}{29}$, $\bar{y} = \frac{15}{29}$, $r = \frac{\sqrt{35}}{8} = 0.74$]

Ex. 2 If the two regression coefficients are 0.8 and 0.4, what is the correlation coefficient?
[Hint: $r^2 = b_{xy}\,b_{yx} = (0.8)(0.4) = 0.32$, $\therefore r = +0.566$]
3.15 Examples on least square method for curve fitting :
1. The normal equations for fitting the line y = ax + b to data having n observations are
$\sum y = a\sum x + nb$
$\sum xy = a\sum x^2 + b\sum x$
2. The normal equations for fitting the parabolic curve
$y = ax^2 + bx + c$ to data having n observations are
$\sum y = a\sum x^2 + b\sum x + nc$
$\sum xy = a\sum x^3 + b\sum x^2 + c\sum x$
$\sum x^2 y = a\sum x^4 + b\sum x^3 + c\sum x^2$
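For the straight line, the two normal equations can be solved in closed form by elimination. The sketch below is one way to do this, written for the convention y = ax + b used in this section; the function name is illustrative, and the data are those of Example 1 that follows.

```python
def fit_line(xs, ys):
    """Least squares line y = a*x + b from the normal equations
        Σy  = a Σx  + n b
        Σxy = a Σx² + b Σx
    Returns (a, b)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx  # zero only if all x values coincide
    a = (n * sxy - sx * sy) / det
    b = (sy * sxx - sx * sxy) / det
    return a, b


xs = [-1, 0, 1, 2, 3, 4, 6]
ys = [1, 3, 5, 7, 9, 11, 15]
a, b = fit_line(xs, ys)
print(a, b)
```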
Example : 1

Fit a line y = ax + b to the data
x   -1   0   1   2   3   4    6
y   1    3   5   7   9   11   15
Solution :
First construct a table as required by the normal equations.
x       y    x^2   xy
-1      1    1     -1
0       3    0     0
1       5    1     5
2       7    4     14
3       9    9     27
4       11   16    44
6       15   36    90
Total   15   51    67    179
(The totals are $\sum x = 15$, $\sum y = 51$, $\sum x^2 = 67$, $\sum xy = 179$.)
We have n = 7
The normal equations are
$\sum y = a\sum x + nb$
$\sum xy = a\sum x^2 + b\sum x$
$\therefore$ 51 = 15a + 7b
179 = 67a + 15b
Multiplying the first equation by 15 and the second by 7:
225a + 105b = 765
469a + 105b = 1253
$\therefore$ 244a = 488
$\therefore$ a = 2
Then 7b = 51 - 15a = 51 - 30 = 21
$\therefore$ b = 3
$\therefore$ The required line is y = 2x + 3

Example : 2

Fit a parabolic curve $y = a + bx + cx^2$ to the given data.
x   -2   -1   0   1   2    3
y   -1   2    3   2   -1   -6
Solution :
The normal equations to fit a parabola to the given data are
$\sum y = na + b\sum x + c\sum x^2$
$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$
$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4$
Construct a table
x       y    x^2   x^3   x^4   xy    x^2 y
-2      -1   4     -8    16    2     -4
-1      2    1     -1    1     -2    2
0       3    0     0     0     0     0
1       2    1     1     1     2     2
2       -1   4     8     16    -2    -4
3       -6   9     27    81    -18   -54
Total   3    -1    19    27    115   -18   -58
(The totals are $\sum x = 3$, $\sum y = -1$, $\sum x^2 = 19$, $\sum x^3 = 27$, $\sum x^4 = 115$, $\sum xy = -18$, $\sum x^2 y = -58$.)
We get 6a + 3b + 19c = -1
3a + 19b + 27c = -18
19a + 27b + 115c = -58
Solving we get a = 3, b = 0, c = -1
$\therefore$ The required parabola is $y = 3 - x^2$

Exercise No. 5

Example : 1

Fit a line to the following data


x 1 2 3 4
y -1 4 9 14
Ans : y = 5x – 6
Example : 2

Fit a line p = mw + c to the following data


w 50 70 100 120
p 12 15 21 25
Ans. : p = 2.2759 + (0.1879) w
Example : 3

Fit a line to the given data



x 0 1 3 6 8
y 1 3 2 5 4
Ans. : y = 1.6 + (0.38)x
Example : 4

Fit a parabolic curve y = a + bx + cx2 to the following data


x -3 -2 -1 0 1 2 3
y 1.1 1.3 1.6 2.0 2.7 3.4 4.1
Ans. : Parabolic curve is y = 2.07 + (0.51)x + (0.06)x2
Example : 5

Fit a parabolic curve to the following data


x -2 -1 0 1 2 3 4
y 7 1 -1 1 7 17 31
Ans. : $y = 2x^2 - 1$

Example : 6

Fit parabolic curve to the following data


(i)
x 0 1 2 3 4 5
y 1 3 7 13 21 31
Ans. : $y = x^2 + x + 1$
(ii)
x 1.0 1.5 2.0 2.5 3.0 3.5 4.0
y 1.1 1.3 1.6 2.0 2.7 3.4 4.1
Ans. : $y = 1.04 - 0.198x + 0.244x^2$

3.16 Curve Fitting : (Least Square Approximation):


Curve fitting is the process of constructing a curve, or mathematical function, that has the best fit to a series of data points, possibly subject to constraints.
Let $(x_i, y_i)$, i = 1, 2, 3, …, n be a given set of n pairs of values, x being the independent variable and y the dependent variable, i.e. y = f(x).
Fitting curves to a set of numerical data is of considerable theoretical as well as practical importance in the study of correlation and regression.

Fitting of a straight line:
Let us consider the fitting of a straight line
y = a + bx to a set of n points $(x_i, y_i)$, i = 1, 2, 3, …, n.
To determine a and b the normal equations are
$\sum y_i = na + b\sum x_i$
$\sum x_i y_i = a\sum x_i + b\sum x_i^2$
Fitting of second degree parabola :
Let $y = a + bx + cx^2$ …(I)
be the second degree parabola of best fit to a set of n points
$(x_i, y_i)$, i = 1, 2, …, n.
To determine a, b and c the normal equations are
$\sum y_i = na + b\sum x_i + c\sum x_i^2$
$\sum x_i y_i = a\sum x_i + b\sum x_i^2 + c\sum x_i^3$
$\sum x_i^2 y_i = a\sum x_i^2 + b\sum x_i^3 + c\sum x_i^4$
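The three normal equations of the parabola form a 3 × 3 linear system in a, b, c. As a rough sketch of how they can be set up and solved numerically, the code below builds the power sums and applies Gaussian elimination with partial pivoting. The function names are illustrative; the data are x = -2, …, 3 with y = -1, 2, 3, 2, -1, -6, for which the fitted parabola is y = 3 - x².

```python
def solve_linear(matrix, rhs):
    """Solve a small square linear system by Gaussian elimination
    with partial pivoting. matrix is a list of rows."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - known) / aug[i][i]
    return solution


def fit_parabola(xs, ys):
    """Least squares parabola y = a + b*x + c*x^2; returns [a, b, c]."""
    s = [sum(x ** k for x in xs) for k in range(5)]  # s[0] = n, s[k] = Σx^k
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]  # Σy, Σxy, Σx²y
    normal = [[s[0], s[1], s[2]],
              [s[1], s[2], s[3]],
              [s[2], s[3], s[4]]]
    return solve_linear(normal, t)


a, b, c = fit_parabola([-2, -1, 0, 1, 2, 3], [-1, 2, 3, 2, -1, -6])
print(round(a, 6), round(b, 6), round(c, 6))
```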
Example : 1

Fit a straight line to the following data.


x 1 2 3 4 6 8
y 2.4 3 3.6 4 5 6
Solution :
Let the straight line be y = a + bx
Then the normal equations are $\sum y_i = 6a + b\sum x_i$
$\sum x_i y_i = a\sum x_i + b\sum x_i^2$ ….(i)
The values of $\sum x_i$, $\sum y_i$ etc. are calculated below :
x     y     x^2   xy
1     2.4   1     2.4
2     3     4     6
3     3.6   9     10.8
4     4     16    16
6     5     36    30
8     6     64    48
$\sum x_i = 24$, $\sum y_i = 24$, $\sum x_i^2 = 130$, $\sum x_i y_i = 113.2$
The equations (i) become 24 = 6a + 24b and 113.2 = 24a + 130b,
i.e. 4 = a + 4b ….(ii)
113.2 = 24a + 130b ….(iii)
Multiplying (ii) by 24 and subtracting from (iii),
we get b = 0.506 and then a = 1.976.
Hence the required line of best fit is
y = 1.976 + 0.506x
Example : 2

Fit a straight line to the following data:


x: 6 7 7 8 8 8 9 9 10
y: 5 5 4 5 4 3 4 3 3
Solution :
Let the straight line be y = a + bx
Then the normal equations are $\sum y_i = 9a + b\sum x_i$
$\sum x_i y_i = a\sum x_i + b\sum x_i^2$ ….(i)
The values of $\sum x_i$, $\sum y_i$ etc. are calculated below.
x     y   x^2   xy
6     5   36    30
7     5   49    35
7     4   49    28
8     5   64    40
8     4   64    32
8     3   64    24
9     4   81    36
9     3   81    27
10    3   100   30
$\sum x_i = 72$, $\sum y_i = 36$, $\sum x_i^2 = 588$, $\sum x_i y_i = 282$
The equations (i) become 36 = 9a + 72b and 282 = 72a + 588b,
i.e. 4 = a + 8b ….(ii)
47 = 12a + 98b ….(iii)
Multiplying (ii) by 12 and subtracting from (iii),
we get b = -0.5 and a = 8
Hence the required line of best fit is
y = 8 - 0.5x.
Example : 3

Fit a straight line to the following data :


x: 12 15 21 25
y: 50 70 100 120
Solution :
Let the straight line be y = a + bx
Then the normal equations are $\sum y_i = 4a + b\sum x_i$
$\sum x_i y_i = a\sum x_i + b\sum x_i^2$ ….(i)
The values of $\sum x_i$, $\sum y_i$ etc. are calculated below.
x     y     x^2   xy
12    50    144   600
15    70    225   1050
21    100   441   2100
25    120   625   3000
$\sum x_i = 73$, $\sum y_i = 340$, $\sum x_i^2 = 1435$, $\sum x_i y_i = 6750$
The equations (i) become
340 = 4a + 73b ….(ii)
6750 = 73a + 1435b ….(iii)
Multiplying (ii) by $\frac{73}{4}$ and subtracting from (iii),
we get b = 5.3041 and then a = -11.8005.
Hence the required line of best fit is
y = -11.8005 + 5.3041x.
Note :
For the sake of convenience it is sometime advisable to change the origin and scale with
the substitutions X = (x – A)/h and Y = (y – B)/h, where A and B are the assumed means (or
middle values) of x and y respectively and h is the width of the interval.
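The effect of this substitution can be checked numerically: fitting in the shifted and scaled variables and transforming back gives the same line as fitting the original data directly. The sketch below assumes the same scale h for both variables, as in the note; the function names and the particular choice A = 18, B = 85, h = 3 are illustrative.

```python
def fit_line(xs, ys):
    """Least squares line y = a + b*x from the two normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b


def fit_line_shifted(xs, ys, A, B, h):
    """Fit Y = a1 + b1*X with X = (x - A)/h, Y = (y - B)/h, then convert
    back to the original variables."""
    X = [(x - A) / h for x in xs]
    Y = [(y - B) / h for y in ys]
    a1, b1 = fit_line(X, Y)
    # (y - B)/h = a1 + b1*(x - A)/h   =>   y = (B + h*a1 - b1*A) + b1*x
    return B + h * a1 - b1 * A, b1


xs = [12, 15, 21, 25]
ys = [50, 70, 100, 120]
direct = fit_line(xs, ys)
shifted = fit_line_shifted(xs, ys, A=18, B=85, h=3)
print(direct, shifted)
```

The substitution changes only the size of the numbers in the working table, not the fitted curve, which is why it is used in the hand calculations that follow.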

Example : 4

Fit a second degree parabola to the following data


x: 1.0 1.5 2.0 2.5 3.0 3.5 4.0
y: 1.1 1.3 1.6 2.0 2.7 3.4 4.1
Solution :
Let the second degree parabola of fit be
$y = a + bx + cx^2$
Here we take $X = \frac{x - \bar{x}}{h}$, where $\bar{x} = 2.5$, $h = 0.5$
$\therefore X = \frac{x - 2.5}{0.5}$
So the parabola of fit $y = a + bx + cx^2$ becomes
$y = a + bX + cX^2$ …(i)
The normal equations are $\sum y_i = na + b\sum X_i + c\sum X_i^2$
$\sum X_i y_i = a\sum X_i + b\sum X_i^2 + c\sum X_i^3$
$\sum X_i^2 y_i = a\sum X_i^2 + b\sum X_i^3 + c\sum X_i^4$ …(ii)
The values of $\sum X_i$, $\sum y_i$ etc. are calculated below :
x     y     X = (x - 2.5)/0.5   X^2   X^3   X^4   Xy     X^2 y
1.0   1.1   -3                  9     -27   81    -3.3   9.9
1.5   1.3   -2                  4     -8    16    -2.6   5.2
2.0   1.6   -1                  1     -1    1     -1.6   1.6
2.5   2.0   0                   0     0     0     0      0
3.0   2.7   1                   1     1     1     2.7    2.7
3.5   3.4   2                   4     8     16    6.8    13.6
4.0   4.1   3                   9     27    81    12.3   36.9
$\sum y_i = 16.2$, $\sum X_i = 0$, $\sum X_i^2 = 28$, $\sum X_i^3 = 0$, $\sum X_i^4 = 196$, $\sum X_i y_i = 14.3$, $\sum X_i^2 y_i = 69.9$
The equations (ii) become
16.2 = 7a + 28c …(iii)
14.3 = 28b ….(iv)
69.9 = 28a + 196c ….(v)
From (iv), b = 0.511
Multiplying (iii) by 4 and subtracting from (v)
we get c = 0.061 and then a = 2.07
Equation (i) becomes
$y = a + b\left(\frac{x - 2.5}{0.5}\right) + c\left(\frac{x - 2.5}{0.5}\right)^2$
$y = a + b(2x - 5) + c(2x - 5)^2$
$y = 2.07 + 0.511(2x - 5) + 0.061(2x - 5)^2$
$y = 2.07 + 1.022x - 2.555 + 0.244x^2 - 1.22x + 1.525$
$y = 1.04 - 0.198x + 0.244x^2$
which is the required equation of the parabola.
Example : 5

Fit a second degree parabola to the following data


x: 0 1 2 3 4
y: 1 1.8 1.3 2.5 6.3
Solution :
Let the second degree parabola of fit be
$y = a + bx + cx^2$
We take $X = \frac{x - \bar{x}}{h}$, where $\bar{x} = 2$, $h = 1$
$\therefore X = x - 2$
So the parabola of fit $y = a + bx + cx^2$ becomes
$y = a + bX + cX^2$ …(i)
The normal equations are
$\sum y_i = na + b\sum X_i + c\sum X_i^2$
$\sum X_i y_i = a\sum X_i + b\sum X_i^2 + c\sum X_i^3$
$\sum X_i^2 y_i = a\sum X_i^2 + b\sum X_i^3 + c\sum X_i^4$ …(ii)
The values of $\sum X_i$, $\sum y_i$ etc. are calculated below.
x   y     X = x - 2   X^2   X^3   X^4   Xy     X^2 y
0   1     -2          4     -8    16    -2     4
1   1.8   -1          1     -1    1     -1.8   1.8
2   1.3   0           0     0     0     0      0
3   2.5   1           1     1     1     2.5    2.5
4   6.3   2           4     8     16    12.6   25.2
$\sum y_i = 12.9$, $\sum X_i = 0$, $\sum X_i^2 = 10$, $\sum X_i^3 = 0$, $\sum X_i^4 = 34$, $\sum X_i y_i = 11.3$, $\sum X_i^2 y_i = 33.5$
The equations (ii) become 12.9 = 5a + 10c …(iii)
11.3 = 10b …(iv)
33.5 = 10a + 34c …(v)
From (iv), b = 1.13
Multiplying (iii) by 2 and subtracting from (v)
we get c = 0.55 and then a = 1.48
Equation (i) becomes
$y = a + b(x - 2) + c(x - 2)^2$
$y = 1.48 + 1.13(x - 2) + 0.55(x - 2)^2$
$y = 1.48 + 1.13x - 2.26 + 0.55x^2 - 2.2x + 2.2$
$y = 1.42 - 1.07x + 0.55x^2$
which is the required equation of the parabola.
Example : 6

Following is the data given for values of x and y.


Fit a second degree polynomial of the type ax2 + bx + c, where a, b, c are constants.
x: -3 -2 -1 0 1 2 3
y: 12 4 1 2 7 15 30
Solution :
Let the second degree parabola of fit be
$y = ax^2 + bx + c$ …(i)
The normal equations are
$\sum y_i = cn + b\sum x_i + a\sum x_i^2$
$\sum x_i y_i = c\sum x_i + b\sum x_i^2 + a\sum x_i^3$
$\sum x_i^2 y_i = c\sum x_i^2 + b\sum x_i^3 + a\sum x_i^4$ …(ii)
The values of $\sum x_i$, $\sum y_i$ etc. are calculated below :
x    y    x^2   x^3   x^4   xy    x^2 y
-3   12   9     -27   81    -36   108
-2   4    4     -8    16    -8    16
-1   1    1     -1    1     -1    1
0    2    0     0     0     0     0
1    7    1     1     1     7     7
2    15   4     8     16    30    60
3    30   9     27    81    90    270
$\sum x_i = 0$, $\sum y_i = 71$, $\sum x_i^2 = 28$, $\sum x_i^3 = 0$, $\sum x_i^4 = 196$, $\sum x_i y_i = 82$, $\sum x_i^2 y_i = 462$
The equations (ii) become 71 = 7c + 28a …(iii)
82 = 28b …(iv)
462 = 28c + 196a …(v)
From (iv), b = 2.93
Multiplying (iii) by 4 and subtracting from (v) gives 178 = 84a,
so a = 2.12 and then c = 1.67
$\therefore$ a = 2.12, b = 2.93, c = 1.67
Equation (i) becomes
$y = 2.12x^2 + 2.93x + 1.67$
which is the required equation of the parabola.
Example : 7

Fit a straight line to the following data.


x: 0 1 2 3
y: 2 5 8 11
Solution :
Let the straight line be y = a + bx
Then the normal equations are $\sum y_i = 4a + b\sum x_i$
$\sum x_i y_i = a\sum x_i + b\sum x_i^2$ …(i)
The values of $\sum x_i$, $\sum y_i$ etc. are calculated below.
x   y    x^2   xy
0   2    0     0
1   5    1     5
2   8    4     16
3   11   9     33
$\sum x_i = 6$, $\sum y_i = 26$, $\sum x_i^2 = 14$, $\sum x_i y_i = 54$
The equations (i) become 26 = 4a + 6b
and 54 = 6a + 14b
i.e. 13 = 2a + 3b …(ii)
27 = 3a + 7b …(iii)
Multiplying (ii) by 3 and (iii) by 2, then subtracting the first result from the second,
we get b = 3 and then a = 2
Hence the required line of best fit is
y = 2 + 3x
Example : 8

Fit a straight line of the type y = a + bx to the following data. Nov – 2017

x: 0 5 10 15 20 25
y: 12 15 17 22 24 30
Solution :
Let the straight line be y = a + bx
Then the normal equations are $\sum y_i = 6a + b\sum x_i$
$\sum x_i y_i = a\sum x_i + b\sum x_i^2$ …(i)
The values of $\sum x_i$, $\sum y_i$ etc. are calculated below :
x    y    x^2   xy
0    12   0     0
5    15   25    75
10   17   100   170
15   22   225   330
20   24   400   480
25   30   625   750
$\sum x_i = 75$, $\sum y_i = 120$, $\sum x_i^2 = 1375$, $\sum x_i y_i = 1805$
The equations (i) become 120 = 6a + 75b
and 1805 = 75a + 1375b
i.e. 40 = 2a + 25b …(ii)
361 = 15a + 275b …(iii)
Multiplying (ii) by 11 and subtracting (iii) from it,
we get a = 11.2857 and then b = 0.6971
Hence the required line of best fit is
y = 11.2857 + 0.6971x
Example : 9

Fit a second degree parabola to the following data.


x: 0 1 2 3 4
y: 1 0 3 10 21
Solution :
We want to fit an equation of the type $y = a + bx + cx^2$.
Here we take $X = \frac{x - \bar{x}}{h}$, where $\bar{x} = 2$, $h = 1$
$\therefore X = x - 2$, so that the parabola of fit $y = a + bx + cx^2$
becomes $y = a + bX + cX^2$ …(i)
The normal equations are
$\sum y_i = na + b\sum X_i + c\sum X_i^2$
$\sum X_i y_i = a\sum X_i + b\sum X_i^2 + c\sum X_i^3$
$\sum X_i^2 y_i = a\sum X_i^2 + b\sum X_i^3 + c\sum X_i^4$ …(ii)
The values of $\sum X_i$, $\sum y_i$ etc. are calculated below.
x   y    X = x - 2   X^2   X^3   X^4   Xy   X^2 y
0   1    -2          4     -8    16    -2   4
1   0    -1          1     -1    1     0    0
2   3    0           0     0     0     0    0
3   10   1           1     1     1     10   10
4   21   2           4     8     16    42   84
$\sum y_i = 35$, $\sum X_i = 0$, $\sum X_i^2 = 10$, $\sum X_i^3 = 0$, $\sum X_i^4 = 34$, $\sum X_i y_i = 50$, $\sum X_i^2 y_i = 98$
The equations (ii) become 35 = 5a + 10c …(iii)
50 = 10b …(iv)
98 = 10a + 34c …(v)
From (iv), b = 5
Multiplying (iii) by 2 and subtracting from (v)
we get c = 2 and then a = 3
$\therefore$ Equation (i) becomes
$y = a + b(x - 2) + c(x - 2)^2$
$y = 3 + 5(x - 2) + 2(x - 2)^2$
$y = 3 + 5x - 10 + 2x^2 - 8x + 8$
$y = 1 - 3x + 2x^2$
which is the required equation of the parabola.
Example : 10

Fit a second degree parabola to the function $2^x$ at the points 0, 1, 2, 3, 4.

Solution :
Here $y = 2^x$ at x = 0, 1, 2, 3, 4.
x:   0   1   2   3   4
y:   1   2   4   8   16
Let the parabola of fit be $y = a + bx + cx^2$
We take $X = \frac{x - \bar{x}}{h}$, where $\bar{x} = 2$, $h = 1$
$\therefore X = x - 2$
So the parabola of fit $y = a + bx + cx^2$ becomes
$y = a + bX + cX^2$ …(i)
The normal equations are
$\sum y_i = na + b\sum X_i + c\sum X_i^2$
$\sum X_i y_i = a\sum X_i + b\sum X_i^2 + c\sum X_i^3$
$\sum X_i^2 y_i = a\sum X_i^2 + b\sum X_i^3 + c\sum X_i^4$ …(ii)
The values of $\sum X_i$, $\sum y_i$ etc. are calculated below.
x   y    X = x - 2   X^2   X^3   X^4   Xy   X^2 y
0   1    -2          4     -8    16    -2   4
1   2    -1          1     -1    1     -2   2
2   4    0           0     0     0     0    0
3   8    1           1     1     1     8    8
4   16   2           4     8     16    32   64
$\sum y_i = 31$, $\sum X_i = 0$, $\sum X_i^2 = 10$, $\sum X_i^3 = 0$, $\sum X_i^4 = 34$, $\sum X_i y_i = 36$, $\sum X_i^2 y_i = 78$
The equations (ii) become 31 = 5a + 10c …(iii)
36 = 10b …(iv)
78 = 10a + 34c …(v)
From (iv), b = 3.6
Multiplying (iii) by 2 and subtracting from (v)
we get c = 1.143 and then a = 3.914
$\therefore$ Equation (i) becomes
$y = a + b(x - 2) + c(x - 2)^2$
$y = 3.914 + 3.6(x - 2) + 1.143(x - 2)^2$
$y = 3.914 + 3.6x - 7.2 + 1.143x^2 - 4.572x + 4.572$
$y = 1.286 - 0.972x + 1.143x^2$
which is the required equation of the parabola.

Exercise No. 6
1. Fit a straight line to the following data
x: 0 1 2 3 4
y: 1 1.8 3.3 4.5 6.3
Ans. : y = 0.72 + 1.33x
2. Find the best values of a, b assuming that the following values of x, y are connected by
the relation y = a + bx
x: 0 1 2 3 4
y: 1 2.9 4.8 6.7 8.6
Ans. : a = 1, b = 1.9
3. S. T. the line of fit to the following data is given by y = 3.9 + 1.5x.
x: 1 2 3 4 5
y: 5 7 9 10 11
4. Obtain the least squares line fit to the following data.


x: 0.2 0.4 0.6 0.8 1


y: 0.447 0.632 0.775 0.894 1
Ans. : y = 0.3392 + 0.684x
5. Fit a second degree parabola for the following data.
x: -2 -1 0 1 2
y: 15 1 1 3 19
Ans. : y = - 1.057 + x + 4.4285x2
6. Fit a second degree parabola to the following data.
x: 1989 1990 1991 1992 1993 1994 1995 1996 1997
y: 352 356 357 358 360 361 361 360 359
Ans. : y = - 1062526.37 + 1066.37x – 0.267x2
7. Fit a straight line to the following data by least square method. May – 2018

x: 0 6 8 10 14 16 18 20
y: 3 12 15 18 24 27 30 33
Ans. : y = 1.5x + 3
8. By the method of least square, find the straight line that best fits the following data.
May – 2017

x: 1 2 3 4 5
y: 14 27 40 55 68
Ans. : y = 13.6x
9. Fit a straight line to the following data by least square method.
x: 1 2 3 4 5 6
y: 6 4 3 5 4 2
Ans. : y = 5.7999 – 0.514x
10. Fit a second degree parabola for the following data.
x: 1 2 3 4
y: 1.7 1.8 2.3 3.2
Ans. : y = 2 – 0.5x + 0.2x2
11. Fit a least square straight line to the following data.

x: 2 7 9 1 5 12
y: 13 21 23 14 15 21
Ans. : y = 12.45 + 0.8977x
12. Fit a least square straight line to the following data.
x: 2 4 6 8 10 12
y: 1.8 1.5 1.4 1.1 1.1 0.9
Ans. : y = 1.9 – 0.086x
13. Fit a least square straight line to the following data.
x: 20 60 100 140 180 220 260 300 340 380
y: 0.18 0.37 0.35 0.78 0.56 0.75 1.18 1.36 1.17 1.65
Ans. : y = 0.069 + 0.0038x
14. Fit a straight line to the following data by least square method.
x: 1 3 4 6 8 9 11 14
y: 1 2 4 4 5 7 8 9
Ans. : y = 0.5454 + 0.6363x
15. In a study of the relation between the amount of rainfall and the quantity of air pollution
removed, the following data were collected.
Daily rainfall in 0.01 cm (x):     4.3   4.5   5.9   5.6   6.1   5.2   3.8   2.1
Pollution removed in mg/m³ (y):    12.6  12.1  11.6  11.8  11.4  11.8  13.2  14.1
Obtain, by the method of least squares, a relation of the form y = a + bx which best fits
these observations.
Ans. : y = 15.49 – 0.675x
16. A curve of the form x = ay² + by + c satisfies the following data.
x: -6 -8 -4 6 22 44 72
y: 0 1 2 3 4 5 6
Find the best values of a, b, c.
Ans. : a = 3, b = - 5, c = - 6
17. Fit a parabola of the form y = ax² + bx + c to the following data using the least squares
criterion.
x: 1 2 3 4 5 6 7
y: -5 -2 5 16 31 50 73
Ans. : y = 2x² – 3x – 4
18. Values of x and y are tabulated as under.
x: 1 1.5 2.0 2.5
y: 25 56.2 100 156
Find the law of the form x = ay^n that satisfies the given data.
[Hint : x = ay^n. Taking logarithms we get
log x = log a + n log y
∴ X = C + nY, where X = log x, C = log a, Y = log y,
which is the equation of a straight line.]
Ans. : n = 0.5, C = – 0.6988, a = 0.2, x = 0.2 y^0.5
19. Find the best values of a, b, c assuming that the following values of x, y are connected
by the relation y = ax² + bx + c
x: 1 2 3 4 5
y: 3.38 8.25 16.6 28.5 44
Ans. : a = 1.772, b = – 0.383, c = 2.103
20. Fit a parabola y = a + bx + cx² to the following data.
x: 1 2 3 4 5 6 7 8 9
y: 2 6 7 8 10 11 11 10 9
Ans. : y = – 0.93 + 3.52x – 0.267x²

Illustrative Examples

Example : 1

Fit a straight line to the following data.


x: 0 1 2 3 4
y: 1 1.8 3.3 4.5 6.3
Solution :
Let the straight line to be fitted to the data be
y = a + bx. Then the normal equations are
Σy = na + bΣx ; Σxy = aΣx + bΣx²
      x      y      xy     x²
      0      1      0      0
      1      1.8    1.8    1
      2      3.3    6.6    4
      3      4.5    13.5   9
      4      6.3    25.2   16
Σ     10     16.9   47.1   30
Here n = 5, so
16.9 = 5a + 10b …(i)
47.1 = 10a + 30b …(ii)
Solving (i) and (ii) we get a = 0.72, b = 1.33.
Hence the equation of the line of best fit is
y = 0.72 + 1.33x.
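The two normal equations can also be solved mechanically. The following is a minimal Python sketch, not part of the original text; the function name `fit_line` is an illustrative choice. It forms the four sums used above and eliminates a and b from the normal equations.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x obtained from the two normal equations."""
    n = len(xs)
    sx = sum(xs)                                  # sum of x
    sy = sum(ys)                                  # sum of y
    sxy = sum(x * y for x, y in zip(xs, ys))      # sum of x*y
    sxx = sum(x * x for x in xs)                  # sum of x^2
    # Solve  sy = n*a + b*sx  and  sxy = a*sx + b*sxx  by elimination.
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b


a, b = fit_line([0, 1, 2, 3, 4], [1, 1.8, 3.3, 4.5, 6.3])
print(round(a, 2), round(b, 2))
```

For the data of this example the sketch reproduces a = 0.72 and b = 1.33 after rounding to two decimals.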
Example : 2

Fit a straight line to the following data treating y as independent variable.


x: 1 2 3 4 5
y: 5 7 9 10 11
Solution :
Let the line of best fit be
x = a + by.
Then the normal equations are
Σx = na + bΣy ; Σxy = aΣy + bΣy²
      x      y      xy     y²
      1      5      5      25
      2      7      14     49
      3      9      27     81
      4      10     40     100
      5      11     55     121
Σ     15     42     141    376
Here n = 5, so
15 = 5a + 42b …(i)
141 = 42a + 376b …(ii)
From (i), 5a = 15 – 42b, i.e. a = 3 – (42/5)b. Substituting in (ii),
141 = 42[3 – (42/5)b] + 376b
141 = 126 – (1764/5)b + 376b
15 = (116/5)b
b = 75/116 = 0.6466 ≈ 0.65
a = 3 – (8.4)(0.6466) = – 2.431 ≈ – 2.43
∴ x = – 2.43 + 0.65y.

Example : 3

Fit a second degree parabola to the following data.


x: 1 2 3 4 5 6 7 8 9
y: 2 6 7 8 10 11 11 10 9
Solution :
Here Σx = 45, x̄ = 5, Σy = 74, ȳ = 8.22.
Let X = x – 5 and Y = y – 7, and let the curve of best fit be
Y = a + bX + cX². The normal equations are
ΣY = na + bΣX + cΣX²
ΣXY = aΣX + bΣX² + cΣX³
ΣX²Y = aΣX² + bΣX³ + cΣX⁴
      x     y     X     Y     XY    X²    X²Y   X³    X⁴
      1     2     -4    -5    20    16    -80   -64   256
      2     6     -3    -1    3     9     -9    -27   81
      3     7     -2    0     0     4     0     -8    16
      4     8     -1    1     -1    1     1     -1    1
      5     10    0     3     0     0     0     0     0
      6     11    1     4     4     1     4     1     1
      7     11    2     4     8     4     16    8     16
      8     10    3     3     9     9     27    27    81
      9     9     4     2     8     16    32    64    256
Σ     45    74    0     11    51    60    -9    0     708
Here n = 9, so
11 = 9a + 60c
51 = 60b
– 9 = 60a + 708c
Solving, a = 3, b = 0.85, c = – 0.27 (to two decimal places).
Hence the curve of best fit is
Y = 3 + 0.85X – 0.27X²
i.e. y – 7 = 3 + 0.85(x – 5) – 0.27(x – 5)²
y = 10 + 0.85x – 4.25 – 0.27x² + 2.7x – 6.75
y = – 1 + 3.55x – 0.27x²
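As a cross-check on this kind of hand computation, the three normal equations of a parabola can be assembled and solved numerically. The following Python sketch is an illustration only; the helper names `solve_linear` and `fit_parabola` are hypothetical and not from the text. It fits y = a + bx + cx² to the data of this example directly in x, without the change of origin.

```python
def solve_linear(matrix, rhs):
    """Solve a small square system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Bring the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_parabola(xs, ys):
    """Least-squares parabola y = a + b*x + c*x**2 from the three normal equations."""
    def sx(p):                       # sum of x^p
        return sum(x ** p for x in xs)

    def sxy(p):                      # sum of x^p * y
        return sum((x ** p) * y for x, y in zip(xs, ys))

    matrix = [[len(xs), sx(1), sx(2)],
              [sx(1),   sx(2), sx(3)],
              [sx(2),   sx(3), sx(4)]]
    return solve_linear(matrix, [sxy(0), sxy(1), sxy(2)])


xs = list(range(1, 10))
ys = [2, 6, 7, 8, 10, 11, 11, 10, 9]
a, b, c = fit_parabola(xs, ys)
print(round(a, 3), round(b, 3), round(c, 3))
```

Carrying full precision, the coefficients come out close to a = – 0.929, b = 3.523, c = – 0.267; small differences from a hand solution arise from rounding a, b, c to two decimals before substituting back.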
Example : 4

Use method of least squares to fit y = mx + c to


x 0 1 2 3 4 5
y -8 -5 -2 1 4 7
Solution :
Let the equation of the straight line of best fit be y = mx + c.
The normal equations for m and c are
Σxy = mΣx² + cΣx …(i)
Σy = mΣx + 6c …(ii)
where 6 is the number of observations. We prepare the table.
      x      y      x²     xy
      0      -8     0      0
      1      -5     1      -5
      2      -2     4      -4
      3      1      9      3
      4      4      16     16
      5      7      25     35
Σx = 15, Σy = – 3, Σx² = 55, Σxy = 45.
Equations (i) and (ii) become
55m + 15c = 45
15m + 6c = – 3
Solving the above equations simultaneously,
we get m = 3 and c = – 8.
Hence the line of best fit is y = 3x – 8.
Example : 5

If x = ay² + by + c, find a, b, c by the least squares method for the data tabulated in the solution.

Solution : Let x = ay² + by + c …(i)
The normal equations for a, b, c are
Σy²x = aΣy⁴ + bΣy³ + cΣy² …(ii)
Σyx = aΣy³ + bΣy² + cΣy …(iii)
Σx = aΣy² + bΣy + 5c …(iv)
      y      x      yx     y²     y²x    y³     y⁴
      1      3      3      1      3      1      1
      2      6      12     4      24     8      16
      3      13     39     9      117    27     81
      4      24     96     16     384    64     256
      5      39     195    25     975    125    625
Σ     15     85     345    55     1503   225    979
∴ (ii), (iii), (iv) become
979a + 225b + 55c = 1503
225a + 55b + 15c = 345
55a + 15b + 5c = 85
Solving by Cramer's rule, a = 2, b = – 3, c = 4.
∴ x = 2y² – 3y + 4.
3.17 Non-polynomial Approximation :
In certain problems of science and engineering we often have to fit an empirical law, i.e. a
non-polynomial function, to data. The most commonly used empirical laws are
(i) y = a e^(bx) (ii) y = a x^b (iii) y = a b^x
These formulas can be reduced to linear equations by taking logarithms and introducing
new variables.
1. y = a e^(bx) yields
Log y = Log a + bx
i.e. Y = c₀ + c₁x
where Y = Log y, c₀ = Log a, c₁ = b.
2. Similarly y = a x^b yields
Log y = Log a + b Log x
i.e. Y = c₀ + c₁X
where Y = Log y, c₀ = Log a, X = Log x, c₁ = b.
3. y = a b^x yields
Log y = Log a + x Log b
i.e. Y = c₀ + c₁x
where Y = Log y, c₀ = Log a, c₁ = Log b.
By least squares approximation we determine c₀, c₁, and then a and b are found.

Illustrative Examples

Example : 1

Determine a and b so that y = a e^(bx) fits the data.

x: 1 2 3 4
y: 7 11 17 27
Solution : y = a e^(bx) yields, on taking logarithms,
Log y = Log a + bx
i.e. Y = c₀ + c₁x
where Y = Log y, c₀ = Log a, c₁ = b.
Hence we have the table
x: 1 2 3 4
Y: 1.95 2.40 2.83 3.30
The second line in this table is obtained by taking the natural logarithm of the entries in the
original table.
Now n = 4, Σxᵢ = 10, Σxᵢ² = 30, ΣYᵢ = 10.48, ΣxᵢYᵢ = 28.44,
yielding the normal equations
4c₀ + 10c₁ = 10.48
10c₀ + 30c₁ = 28.44
Solving, we get c₀ = 1.5, c₁ = 0.45.
Hence a = e^(c₀) = e^1.5 = 4.48, b = c₁ = 0.45
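The log-linearisation can be sketched in a few lines of Python. This is an illustrative sketch only, with the hypothetical function name `fit_exponential`; it assumes natural logarithms and strictly positive y values, fits a straight line to (x, ln y), and then recovers a and b.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a*exp(b*x) by a least-squares line through the points (x, ln y)."""
    big_y = [math.log(y) for y in ys]            # Y = ln y, requires y > 0
    n = len(xs)
    sx, sy = sum(xs), sum(big_y)
    sxy = sum(x * v for x, v in zip(xs, big_y))
    sxx = sum(x * x for x in xs)
    c1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # slope, equal to b
    c0 = (sy - c1 * sx) / n                          # intercept, equal to ln a
    return math.exp(c0), c1


a, b = fit_exponential([1, 2, 3, 4], [7, 11, 17, 27])
print(round(a, 2), round(b, 2))
```

Because the sketch keeps the logarithms at full precision instead of two decimals, it gives a close to 4.47 and b close to 0.45, a slightly different a from a hand calculation with rounded table entries.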
Example : 2

Fit an equation of the type y = a b^x to the following data by the method of least squares.

x: 2 3 4 5 6
y: 144 172.8 207.4 248.8 298.5
Solution : Taking logarithms of y = a b^x, the equation becomes
Log y = Log a + (Log b) x
i.e. Y = c₀ + c₁x
where Y = Log y, c₀ = Log a, c₁ = Log b.
We have the following table
      x      Y = Log y    x²     xY
      2      4.97         4      9.94
      3      5.15         9      15.45
      4      5.33         16     21.32
      5      5.52         25     27.60
      6      5.70         36     34.20
Σ     20     26.67        90     108.51
The normal equations are
ΣY = nc₀ + c₁Σx
ΣxY = c₀Σx + c₁Σx²
i.e.
26.67 = 5c₀ + 20c₁
108.51 = 20c₀ + 90c₁
Solving the above equations by Cramer's rule, we get
c₀/230.1 = c₁/9.15 = 1/50
∴ c₀ = 4.602, c₁ = 0.183
Now, Log a = c₀ ∴ a = e^(c₀) = e^4.602 = 99.68
Log b = c₁ ∴ b = e^(c₁) = e^0.183 = 1.2008
∴ y = (99.68)(1.2008)^x.

Exercise No. 6
Example : 1

Determine the constants in y = a e^(bx) fitting the data.

x: 1 2 3 4
y: 60 30 20 15
Ans.: a = 84.8, b = – 0.456
Example : 2

Find the constants in y = a e^(bx) fitting the data.

x: 77 100 185 239 285
y: 2.1 3.4 7.0 11.1 19.6
Ans.: a = 1.2, b = 0.0096
Example : 3

Determine the constants in y = a x^b fitting the data.

x: 2.2 2.7 3.5 4.1
y: 65 60 53 50
Ans.: a = 91.8, b = – 0.434
Example : 4

By the method of least squares fit y = ax + b to the following data.

x: -1 0 2 3 4 5
y: -9 -7 -3 -1 1 3
Ans.: a = 2, b = – 7


Unit 3
Statistics (MCQ's)


1. The arithmetic mean (x̄) of the following distribution
x: 0 1 2 4
f: 1 4 3 2
where f is the frequency of the corresponding value of the variable, is given by
(a) 1 (b) 1.2 (c) 1.8 (d) 2
Ans.: (c)
Explanation : For a frequency distribution,
x̄ = Σfx/Σf = [(0)(1) + (1)(4) + (2)(3) + (4)(2)]/(1 + 4 + 3 + 2)
= (0 + 4 + 6 + 8)/10 = 18/10 = 1.8

2. The arithmetic mean (x̄) of the following distribution
x: 1 2 3 4 5
f: 2 4 5 3 1
where f is the frequency of the corresponding value of the variable, is given by
(a) 2 (b) 3 (c) 2.8 (d) 2.5
Ans.: (c)
Explanation :
x̄ = Σfx/Σf = [(1)(2) + (2)(4) + (3)(5) + (4)(3) + (5)(1)]/(2 + 4 + 5 + 3 + 1)
= (2 + 8 + 15 + 12 + 5)/15 = 42/15 = 2.8

3. The standard deviation (σ) of a distribution 2, 3, 5 is given by
(a) 2 (b) 3.88 (c) 1.25 (d) 4.12
Ans.: (c)
Explanation :
σ = √[Σx²/n – (x̄)²] = √[38/3 – (3.33)²] = 1.25

4. For a given distribution, if Σfx² = 188, N = Σf = 10 and x̄ = 3.5, then the value of the
standard deviation (σ) is given by
(a) 2.56 (b) 4.12 (c) 4.88 (d) 5.13
Ans.: (a)
Explanation :
σ = √[Σfx²/N – (x̄)²] = √[18.8 – (3.5)²] = √6.55 = 2.56

5. For a given distribution, if Σfx² = 122, N = Σf = 5 and x̄ = 4, then the value of the
standard deviation (σ) is given by
(a) 2 (b) 3.88 (c) 5.13 (d) 2.90
Ans.: (d)
Explanation :
σ = √[Σfx²/N – (x̄)²] = √[24.4 – 16] = √8.4 = 2.90
6. The standard deviation of the following frequency distribution is
Wages in rupees earned per day:   0 – 10   10 – 20   20 – 30
No. of labourers:                 5        9         15
(a) 9.32 (b) 7.55 (c) 8.30 (d) 12.70
Ans.: (b)
Explanation : Let A = 15. Here h = 10 and u = (x – A)/h.
Wages (Rs. per day)   Middle value x   f     u     fu    fu²
0 – 10                5                5     –1    –5    5
10 – 20               15               9     0     0     0
20 – 30               25               15    1     15    15
Σ                                      29          10    20
S.D. = σ = h √[Σfu²/Σf – (Σfu/Σf)²]
= 10 √[20/29 – (10/29)²] = 7.55
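The step-deviation formula σ = h √[Σfu²/Σf – (Σfu/Σf)²] lends itself to a short numerical check. The following Python sketch is illustrative only (the function name `grouped_sd` and its parameter names are assumptions, not from the text) and is applied to the wage data of question 6.

```python
import math


def grouped_sd(midpoints, freqs, assumed_mean, h):
    """Standard deviation of grouped data by the step-deviation method."""
    us = [(x - assumed_mean) / h for x in midpoints]      # u = (x - A)/h
    n = sum(freqs)                                        # total frequency
    sum_fu = sum(f * u for f, u in zip(freqs, us))
    sum_fu2 = sum(f * u * u for f, u in zip(freqs, us))
    return h * math.sqrt(sum_fu2 / n - (sum_fu / n) ** 2)


sd = grouped_sd([5, 15, 25], [5, 9, 15], assumed_mean=15, h=10)
print(round(sd, 2))
```

The result does not depend on the choice of A or h, since the formula only rescales and shifts the class mid-values; choosing them well merely keeps the hand arithmetic small.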
7. The following table gives the marks obtained in a paper of statistics out of 25
Class interval:    0 – 5   5 – 10   10 – 15   15 – 20   20 – 25
No. of students:   2       6        8         8         15
The standard deviation (S.D.) is ________
(a) 7.29 (b) 7.35 (c) 6.30 (d) 5.75
Ans.: (c)
Explanation : Let A = 12.5 and u = (x – A)/h with h = 10.
S.D. = σ = h √[Σfu²/Σf – (Σfu/Σf)²]
Class interval   Middle value x   f     u       fu    fu²
0 – 5            2.5              2     –1      –2    2
5 – 10           7.5              6     –0.5    –3    1.5
10 – 15          12.5             8     0       0     0
15 – 20          17.5             8     0.5     4     2
20 – 25          22.5             15    1       15    15
Σ                                 39            14    20.5
∴ S.D. = 10 √[20.5/39 – (14/39)²] = 6.30
8. From the following data
Team   S.D. (σ)   A.M. (x̄)
A      2          5
B      2.5        4
the more consistent team is
(a) Team A (b) Team B
(c) Both teams equally consistent (d) Can't say
Ans.: (a)
Explanation :
We have coefficient of variation
C.V. = (σ/x̄) × 100
For team A, (C.V.)A = (2/5) × 100 = 40
For team B, (C.V.)B = (2.5/4) × 100 = 62.5
As (C.V.)A < (C.V.)B, team A is more consistent.

9. From the following data, which team is more consistent?
Team   S.D. (σ)   A.M. (x̄)
A      2.5        10
B      4          16
(a) A (b) B (c) Equally consistent (d) Can't say
Ans.: (c)
Explanation :
For team A, (C.V.)A = (σ/x̄) × 100 = (2.5/10) × 100 = 25
For team B, (C.V.)B = (4/16) × 100 = 25
∴ (C.V.)A = (C.V.)B, so both teams are equally consistent.


10. From the following data, the most consistent player is
Player   S.D. (σ)   A.M. (x̄)
Sachin   10         45
Rahul    13         38
Sourav   11         42
(a) Sachin (b) Rahul (c) Sourav (d) Equally consistent
Ans.: (a)
Explanation :
For Sachin, (C.V.)A = (σ/x̄) × 100 = (10/45) × 100 = 22.22
For Rahul, (C.V.)B = (13/38) × 100 = 34.21
For Sourav, (C.V.)C = (11/42) × 100 = 26.19
Since (C.V.)A < (C.V.)C < (C.V.)B,
Sachin is the most consistent player.
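The consistency comparison used in questions 8 to 10 amounts to picking the smallest coefficient of variation. A minimal Python sketch follows, with the dictionary layout and the function name `coefficient_of_variation` chosen purely for illustration.

```python
def coefficient_of_variation(sd, mean):
    """C.V. = (standard deviation / arithmetic mean) * 100."""
    return sd / mean * 100


# (S.D., A.M.) for each player, taken from question 10.
players = {"Sachin": (10, 45), "Rahul": (13, 38), "Sourav": (11, 42)}
cv = {name: coefficient_of_variation(sd, mean) for name, (sd, mean) in players.items()}

# The most consistent player is the one with the smallest C.V.
most_consistent = min(cv, key=cv.get)
print(most_consistent, round(cv[most_consistent], 2))
```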
11. If the arithmetic mean of four numbers is 15 and one item, 19, is replaced by 23, then the new
arithmetic mean is
(a) 16 (b) 17 (c) 18 (d) 20
Ans.: (a)
Explanation :
Given x̄ = Σxᵢ/n
∴ 15 = (x₁ + x₂ + x₃ + x₄)/4
∴ x₁ + x₂ + x₃ + x₄ = 60
Let x₁ = 19.
∴ x₂ + x₃ + x₄ = 60 – 19 = 41 is the sum of the remaining three numbers.
Now replace x₁ by 23.
∴ x₁ + x₂ + x₃ + x₄ = 23 + 41 = 64
New arithmetic mean = 64/4 = 16

12. The first moment of the distribution about the value 4 is 6. The arithmetic mean of the
distribution is
(a) 11 (b) 15 (c) 10 (d) 12
Ans.: (c)
Explanation :
Given : a = 4, μ′₁ = 6
∴ x̄ = μ′₁ + a = 6 + 4 = 10
13. The first and second moments of the distribution about the value 2 are 3 and 17. The
second moment about the mean is
(a) – 6 (b) 7 (c) – 8 (d) 8
Ans.: (d)
Explanation :
Given μ′₁ = 3, μ′₂ = 17 and a = 2.
The second moment about the mean is
μ₂ = μ′₂ – (μ′₁)² = 17 – (3)² = 17 – 9 = 8
14. The first four moments about the working mean A = 5 are – 1, 10, 11, 16. The value
of the second moment μ₂ about the mean is given by
(a) 10 (b) 9 (c) 8 (d) 7
Ans.: (b)
Explanation :
μ₂ = μ′₂ – (μ′₁)² = 10 – (–1)² = 9
15. If the first four moments of a distribution about the working mean A = 2 are 1, 5, – 20, 10,
then the second moment μ₂ about the mean of the distribution is given by
(a) 1 (b) 2 (c) 3 (d) 4
Ans.: (d)
Explanation :
μ₂ = μ′₂ – (μ′₁)² = 5 – (1)² = 4
16. If the first four moments of a distribution about the working mean A = 2 are 1, 5, – 20, 10,
then the third moment μ₃ about the mean of the distribution is given by
(a) – 10 (b) 30 (c) – 33 (d) – 20
Ans.: (c)
Explanation :
μ₃ = μ′₃ – 3μ′₂μ′₁ + 2(μ′₁)³
= – 20 – 3(5)(1) + 2(1)³
= – 20 – 15 + 2 = – 33
17. If the first four moments of a distribution about the working mean A = 2 are 1, 5, – 20,
19, then the variance of the distribution is given by
(a) 1 (b) 2 (c) 3 (d) 4
Ans.: (d)
Explanation :
Variance = (S.D.)²
Now μ₂ = μ′₂ – (μ′₁)² = 5 – (1)² = 4
S.D. = √μ₂ = √4 = 2
∴ Variance = (2)² = 4

18. The values of the second and third moments about the mean of a distribution are 2.83 and 3.38
respectively. The coefficient of skewness β₁ is equal to
(a) 0.10 (b) 0.50 (c) 0.20 (d) 0.30
Ans.: (b)
Explanation :
β₁ = μ₃²/μ₂³ = (3.38)²/(2.83)³ = 0.5040
19. The first four moments about the working mean 44.5 of a distribution are – 0.4, 2.99,
– 0.08 and 27.63. The value of the moment about the mean μ₂ is ________
(a) 2.83 (b) 3.99 (c) 2.2 (d) 5.9
Ans.: (a)
Explanation :
Given A = 44.5
μ′₁ = – 0.4, μ′₂ = 2.99, μ′₃ = – 0.08, μ′₄ = 27.63
μ₂ = μ′₂ – (μ′₁)² = 2.99 – 0.16 = 2.83
20. The first three moments about the working mean 44.5 of a distribution are – 0.4, 2.99,
– 0.08. The value of the moment about the mean μ₃ is
(a) 2.83 (b) 4.30 (c) 3.38 (d) 30.3
Ans.: (c)
Explanation : Given A = 44.5
μ′₁ = – 0.4, μ′₂ = 2.99, μ′₃ = – 0.08
μ₃ = μ′₃ – 3μ′₂μ′₁ + 2(μ′₁)³ = 3.38
21. The first four moments about the working mean 44.5 of a distribution are – 0.4, 2.99,
– 0.08 and 27.63. The value of the moment about the mean μ₄ is ________
(a) 15.19 (b) 30.30 (c) – 29.20 (d) 37.99
Ans.: (b)
Explanation : Given A = 44.5
μ′₁ = – 0.4, μ′₂ = 2.99, μ′₃ = – 0.08, μ′₄ = 27.63
μ₄ = μ′₄ – 4μ′₃μ′₁ + 6μ′₂(μ′₁)² – 3(μ′₁)⁴ = 30.30
22. The first four moments about the working mean 44.5 of a distribution are – 0.4, 2.99,
– 0.08 and 27.63. The distribution is ________
(a) Platykurtic (b) Mesokurtic (c) Leptokurtic (d) Equal distribution
Ans.: (c)
Explanation : Given A = 44.5
μ′₁ = – 0.4, μ′₂ = 2.99, μ′₃ = – 0.08, μ′₄ = 27.63
μ₂ = μ′₂ – (μ′₁)² = 2.83
μ₄ = μ′₄ – 4μ′₃μ′₁ + 6μ′₂(μ′₁)² – 3(μ′₁)⁴ = 30.30
β₂ = μ₄/μ₂² = 30.30/(2.83)² = 3.78
Since β₂ > 3, the distribution is leptokurtic.
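Questions 13 to 22 all rest on the same conversion from moments about a working mean A to moments about the mean, followed by β₁ = μ₃²/μ₂³ and β₂ = μ₄/μ₂². A small Python sketch of that conversion is given below; the function names are illustrative assumptions, and the test values are the moments – 0.4, 2.99, – 0.08, 27.63 about A = 44.5.

```python
def central_moments(m1, m2, m3, m4):
    """Convert moments about an arbitrary value A into moments about the mean."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4


def beta_coefficients(mu2, mu3, mu4):
    """Coefficient of skewness beta1 and coefficient of kurtosis beta2."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    return beta1, beta2


mu2, mu3, mu4 = central_moments(-0.4, 2.99, -0.08, 27.63)
beta1, beta2 = beta_coefficients(mu2, mu3, mu4)
print(round(mu2, 2), round(mu3, 2), round(mu4, 2), round(beta1, 2), round(beta2, 2))
```

A value of β₂ above 3 indicates a leptokurtic distribution, equal to 3 mesokurtic, and below 3 platykurtic.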
23. The first three moments about the value 2 of a distribution are 1, 16 and – 40. The
value of the A.M. is ________
(a) 2 (b) 4 (c) 3 (d) – 2
Ans.: (c)
Explanation : Given A = 2
μ′₁ = 1, μ′₂ = 16, μ′₃ = – 40
A.M. = A + μ′₁ = 3
24. The first three moments about the value 2 of a distribution are 1, 16 and – 40. The
value of the standard deviation (σ) is ________
(a) 3.87 (b) 4.03 (c) 13.31 (d) – 4.09
Ans.: (a)
Explanation : Given : A = 2
μ′₁ = 1, μ′₂ = 16, μ′₃ = – 40
μ₂ = μ′₂ – (μ′₁)² = 15
∴ S.D. = σ = √μ₂ = √15 = 3.87
25. The first three moments about the value 2 of a distribution are 1, 16 and – 40. The
value of the coefficient of skewness β₁ of the distribution is ________
(a) 0.05 (b) 2.19 (c) – 0.05 (d) 1.28
Ans.: (b)
Explanation : Given : A = 2
μ′₁ = 1, μ′₂ = 16, μ′₃ = – 40
and μ₂ = μ′₂ – (μ′₁)² = 15
μ₃ = μ′₃ – 3μ′₂μ′₁ + 2(μ′₁)³ = – 40 – 48 + 2 = – 86
β₁ = μ₃²/μ₂³ = (– 86)²/(15)³ = 2.19

26. For the data of a distribution :
n = 100, Σfd = 50, Σfd² = 1970, Σfd³ = 2948 where d = X – 48, the value of the
third moment about the mean μ₃ = ________
(a) 0.18 (b) 1.20 (c) – 0.37 (d) – 1.49
Ans.: (a)
Explanation : Given n = 100 and A = 48
μ′₁ = Σfd/n = 50/100 = 0.5
μ′₂ = Σfd²/n = 1970/100 = 19.7
μ′₃ = Σfd³/n = 2948/100 = 29.48
μ₃ = μ′₃ – 3μ′₂μ′₁ + 2(μ′₁)³ = 29.48 – 29.55 + 0.25 = 0.18
27. For the data of a distribution : n = 100, Σfd = 50, Σfd² = 1970, Σfd³ = 2948,
Σfd⁴ = 86752 where d = X – 48, the value of the moment about the mean μ₄ = ________
(a) 187.32 (b) 837.92 (c) 83.31 (d) 29.29
Ans.: (b)
Explanation :
Given : n = 100 and A = 48
μ′₁ = Σfd/n = 50/100 = 0.5 ; μ′₂ = 1970/100 = 19.7
μ′₃ = 2948/100 = 29.48 ; μ′₄ = 86752/100 = 867.52
μ₄ = μ′₄ – 4μ′₃μ′₁ + 6μ′₂(μ′₁)² – 3(μ′₁)⁴ = 837.92
28. For the data of a distribution : n = 100, Σfd = 50, Σfd² = 1970, d = X – 48, the
value of the moment about the mean μ₂ = ________
(a) 19.45 (b) 15.32 (c) 12.39 (d) 35.32
Ans.: (a)
Explanation :
Given : n = 100 and A = 48
μ′₁ = Σfd/n = 50/100 = 0.5 ;
μ′₂ = Σfd²/n = 1970/100 = 19.7
μ₂ = μ′₂ – (μ′₁)² = 19.45

29. For the data of a distribution : n = 100, Σfd = 50, Σfd² = 1970, Σfd³ = 2948,
Σfd⁴ = 86752 where d = X – 48, the value of the coefficient of skewness β₁ is ________
(a) 0.23 (b) 0.0000044 (c) – 0.20 (d) 0.35
Ans.: (b)
Explanation :
Given : n = 100 and A = 48
μ′₁ = Σfd/n = 50/100 = 0.5 ; μ′₂ = 1970/100 = 19.7
μ′₃ = 2948/100 = 29.48 ; μ′₄ = 86752/100 = 867.52
μ₂ = μ′₂ – (μ′₁)² = 19.45
μ₃ = μ′₃ – 3μ′₂μ′₁ + 2(μ′₁)³ = 0.18
β₁ = μ₃²/μ₂³ = 0.0000044, i.e. β₁ = 4.40 × 10⁻⁶
30. For the data of a distribution : n = 100, Σfd = 50, Σfd² = 1970, Σfd³ = 2948,
Σfd⁴ = 86752 where d = X – 48, the value of the coefficient of kurtosis β₂ is ________
(a) 3.31 (b) 2.21 (c) 2.71 (d) 3.94
Ans.: (b)
Explanation :
Given : n = 100 and A = 48
μ′₁ = Σfd/n = 50/100 = 0.5 ; μ′₂ = 1970/100 = 19.7
μ′₃ = 2948/100 = 29.48 ; μ′₄ = 86752/100 = 867.52
μ₂ = μ′₂ – (μ′₁)² = 19.45
μ₄ = μ′₄ – 4μ′₃μ′₁ + 6μ′₂(μ′₁)² – 3(μ′₁)⁴ = 837.92
β₂ = μ₄/μ₂² = 837.92/(19.45)² = 2.21
31. The first four moments about the working mean 30.2 of a distribution are 0.255, 6.222,
30.211 and 400.25. The central moment μ₂ is ________
(a) 7.32 (b) 1.92 (c) 6.16 (d) 3.62
Ans.: (c)
Explanation :
Given : A = 30.2
μ′₁ = 0.255 ; μ′₂ = 6.222
μ′₃ = 30.211 ; μ′₄ = 400.25
μ₂ = μ′₂ – (μ′₁)² = 6.16
32. The first four moments about the working mean 30.2 of a distribution are 0.255, 6.222,
30.211 and 400.25. The central moment μ₃ is ________
(a) 25.48 (b) 11.35 (c) 32.29 (d) 17.32
Ans.: (a)
Explanation :
Given : A = 30.2
μ′₁ = 0.255 ; μ′₂ = 6.222
μ′₃ = 30.211 ; μ′₄ = 400.25
μ₃ = μ′₃ – 3μ′₂μ′₁ + 2(μ′₁)³ = 25.48
33. The first four moments about the working mean 30.2 of a distribution are 0.255, 6.222,
30.211 and 400.25. The central moment μ₄ is ___
(a) 371.85 (b) 341.57 (c) 270.71 (d) 291.53
Ans.: (a)
Explanation :
Given : A = 30.2
μ′₁ = 0.255 ; μ′₂ = 6.222
μ′₃ = 30.211 ; μ′₄ = 400.25
μ₄ = μ′₄ – 4μ′₃μ′₁ + 6μ′₂(μ′₁)² – 3(μ′₁)⁴ = 371.85
34. The first four moments about the working mean 30.2 of a distribution are 0.255, 6.222,
30.211 and 400.25. The value of the coefficient of skewness β₁ is ____
(a) 3.31 (b) 0.07 (c) 2.78 (d) 0.0
Ans.: (c)
Explanation :
Given : A = 30.2
μ′₁ = 0.255 ; μ′₂ = 6.222
μ′₃ = 30.211 ; μ′₄ = 400.25
μ₂ = μ′₂ – (μ′₁)² = 6.16 ;
μ₃ = μ′₃ – 3μ′₂μ′₁ + 2(μ′₁)³ = 25.48
β₁ = μ₃²/μ₂³ = 2.78
35. The first four moments about the working mean 30.2 of a distribution are 0.255, 6.222,
30.211 and 400.25. The value of the coefficient of kurtosis β₂ is ________
(a) 2.99 (b) 6.20 (c) 3.51 (d) 9.80
Ans.: (d)
Explanation : Given : A = 30.2
μ′₁ = 0.255 ; μ′₂ = 6.222
μ′₃ = 30.211 ; μ′₄ = 400.25
μ₂ = μ′₂ – (μ′₁)² = 6.16 ;
μ₄ = μ′₄ – 4μ′₃μ′₁ + 6μ′₂(μ′₁)² – 3(μ′₁)⁴ = 371.85
β₂ = μ₄/μ₂² = 9.80
36. The first four moments about the working mean of a distribution are 0, 2.5, 0.7 and
18.75. The moment about the mean μ₂ is ________
(a) 11.5 (b) 12.5 (c) 2.5 (d) 7.5
Ans.: (c)
Explanation :
μ′₁ = 0 ; μ′₂ = 2.5
μ′₃ = 0.7 ; μ′₄ = 18.75
μ₂ = μ′₂ – (μ′₁)² = 2.5
37. The first four moments about the working mean of a distribution are 0, 2.5, 0.7 and
18.75. The coefficient of kurtosis β₂ is ________
(a) 2.87 (b) 3 (c) 0.37 (d) 3.87
Ans.: (b)
Explanation :
μ′₁ = 0 ; μ′₂ = 2.5
μ′₃ = 0.7 ; μ′₄ = 18.75
μ₂ = μ′₂ – (μ′₁)² = 2.5 ;
μ₄ = μ′₄ – 4μ′₃μ′₁ + 6μ′₂(μ′₁)² – 3(μ′₁)⁴ = 18.75
β₂ = μ₄/μ₂² = 18.75/(2.5)² = 3

38. The first four moments of a distribution about the value 5 are 2, 20, 40 and 50. The
value of the central moment μ₃ is ________
(a) 57 (b) 25 (c) – 64 (d) 40
Ans.: (c)
Explanation :
μ′₁ = 2 ; μ′₂ = 20
μ′₃ = 40 ; μ′₄ = 50
μ₃ = μ′₃ – 3μ′₂μ′₁ + 2(μ′₁)³ = 40 – 120 + 16 = – 64
39. The first four moments of a distribution about the value 5 are 2, 20, 40 and 50. The
value of the central moment μ₄ is ________
(a) 50 (b) 157 (c) 22.39 (d) 162
Ans.: (d)
Explanation :
A = 5
μ′₁ = 2 ; μ′₂ = 20
μ′₃ = 40 ; μ′₄ = 50
μ₄ = μ′₄ – 4μ′₃μ′₁ + 6μ′₂(μ′₁)² – 3(μ′₁)⁴ = 50 – 320 + 480 – 48 = 162
40. The first four moments about the working mean of a distribution are 0, 2.5, 0.7 and
18.75. The moment about the mean μ₃ is ________
(a) 0.7 (b) 5.7 (c) 0.32 (d) 1.32
Ans.: (a)
Explanation :
μ′₁ = 0 ; μ′₂ = 2.5
μ′₃ = 0.7 ; μ′₄ = 18.75
μ₃ = μ′₃ – 3μ′₂μ′₁ + 2(μ′₁)³ = 0.7
41. The first four moments about the working mean of a distribution are 0, 2.5, 0.7
and 18.75. The moment about the mean μ₄ is ________
(a) 18.75 (b) 22.35 (c) 17.32 (d) 91.40
Ans.: (a)
Explanation :
μ′₁ = 0 ; μ′₂ = 2.5
μ′₃ = 0.7 ; μ′₄ = 18.75
μ₄ = μ′₄ – 4μ′₃μ′₁ + 6μ′₂(μ′₁)² – 3(μ′₁)⁴ = 18.75
42. The first four moments about the working mean of a distribution are 0, 2.5, 0.7 and
18.75. The coefficient of skewness β₁ is ________
(a) 0.0 (b) 0.51 (c) 0.03 (d) – 0.5
Ans.: (c)
Explanation :
μ′₁ = 0 ; μ′₂ = 2.5
μ′₃ = 0.7 ; μ′₄ = 18.75
μ₂ = μ′₂ – (μ′₁)² = 2.5 ;
μ₃ = μ′₃ – 3μ′₂μ′₁ + 2(μ′₁)³ = 0.7
β₁ = μ₃²/μ₂³ = (0.7)²/(2.5)³ = 0.03
43. The first two moments about the working mean 30.2 of a distribution are 0.255,
6.222. The value of the S.D. (σ) is ________
(a) 3 (b) 3.22 (c) 2.48 (d) 5.7
Ans.: (c)
Explanation :
A = 30.2
μ′₁ = 0.255, μ′₂ = 6.222
S.D. (σ) = √μ₂ and μ₂ = μ′₂ – (μ′₁)² = 6.16
∴ S.D. = 2.48
44. The first two moments about the working mean 44.5 of a distribution are – 0.4, 2.99.
The value of the arithmetic mean (A.M.) is ________
(a) 44.1 (b) 42.5 (c) 1.35 (d) 44.5
Ans.: (a)
Explanation :
A = 44.5
μ′₁ = – 0.4, μ′₂ = 2.99
A.M. = A + μ′₁ = 44.1
45. The first two moments about the working mean 44.5 of a distribution are – 0.4, 2.99.
The value of the standard deviation (σ) is ________
(a) 2.32 (b) 2.87 (c) – 0.32 (d) 1.68
Ans.: (d)
Explanation : A = 44.5
μ′₁ = – 0.4, μ′₂ = 2.99
S.D. (σ) = √μ₂ and μ₂ = μ′₂ – (μ′₁)² = 2.83
∴ S.D. = 1.68
46. The first two moments of a distribution about the value 5 are 2, 20. The A.M. is _____
(a) 16 (b) 7 (c) 4 (d) 13
Ans.: (b)
Explanation :
A = 5
μ′₁ = 2 ; μ′₂ = 20
A.M. = A + μ′₁ = 7
47. The first two moments about the working mean 5 of a distribution are 2, 20. The
standard deviation (σ) is ________
(a) 7 (b) 9 (c) 4 (d) 16
Ans.: (c)
Explanation :
A = 5
μ′₁ = 2, μ′₂ = 20
μ₂ = μ′₂ – (μ′₁)² = 20 – 4 = 16
S.D. (σ) = √μ₂ = 4
48. The value of the central moment μ₂ of the following distribution is
x: 1 2 3 4 5
f: 6 15 23 42 62
(a) 2.75 (b) 3.01 (c) – 1.72 (d) 1.34
Ans.: (d)
Explanation : μ₂ = μ′₂ – (μ′₁)², where μ′₁ = Σfd/Σf and μ′₂ = Σfd²/Σf with d = x – A. Take A = 3.
      x     f      d = x – A    fd      fd²
      1     6      –2           –12     24
      2     15     –1           –15     15
      3     23     0            0       0
      4     42     1            42      42
      5     62     2            124     248
Σ           148                 139     329
μ′₁ = 139/148 = 0.939, μ′₂ = 329/148 = 2.22
μ₂ = μ′₂ – (μ′₁)² = 2.22 – (0.939)² = 1.34
49. For the data : n = 10, ū = – 5.1, v̄ = – 10, Σuᵢvᵢ = 1242, Σuᵢ² = 1169, Σvᵢ² = 1694,
the value of the coefficient of correlation r is ____
(a) 0.74 (b) 0.92 (c) 0.65 (d) 0.89
Ans.: (b)
Explanation :
r = cov (u, v)/(σu σv), where
cov (u, v) = (1/n) Σuᵢvᵢ – ū v̄ = 124.2 – 51 = 73.2
σu² = (1/n) Σuᵢ² – ū² = 116.9 – 26.01 = 90.89, ∴ σu = 9.53
σv² = (1/n) Σvᵢ² – v̄² = 169.4 – 100 = 69.4, ∴ σv = 8.33
∴ r = 73.2/[(9.53)(8.33)] = 0.92

50. For a given distribution, if cov (x, y) = – 5.2 and σx = 2.82, the value of the
regression coefficient byx is ________
(a) – 0.55 (b) – 0.85 (c) – 0.75 (d) – 0.65
Ans.: (d)
Explanation :
We have byx = cov (x, y)/σx² = – 5.2/(2.82)² = – 0.65
51. For a given distribution, if cov (x, y) = 35.25 and σy = 5.82, the value of the regression
coefficient bxy is ____
(a) 1.80 (b) 2.8 (c) 1.04 (d) 2
Ans.: (c)
Explanation :
We have bxy = cov (x, y)/σy² = 35.25/(5.82)² = 1.04

52. Given n = 25, Σx = 75, Σy = 100, Σx² = 250, Σy² = 500, Σxy = 325, the value of the
coefficient of correlation r is ___
(a) – 0.7 (b) 0.8 (c) 0.5 (d) 0.9
Ans.: (c)
Explanation :
Given n = 25, Σx = 75, Σy = 100, Σx² = 250, Σy² = 500, Σxy = 325
r = cov (x, y)/(σx σy)
x̄ = Σx/n = 3 ; ȳ = Σy/n = 4
cov (x, y) = (1/n) Σxy – x̄ ȳ = 13 – 12 = 1
σx² = (1/n) Σx² – x̄² = 10 – 9 = 1, ∴ σx = 1
σy² = (1/n) Σy² – ȳ² = 20 – 16 = 4, ∴ σy = 2
∴ r = 1/[(1)(2)] = 0.5
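The chain of steps used in questions 49 and 52 (means, covariance, standard deviations, then r) can be written as one short function. The following Python sketch is an illustration; the name `correlation_from_sums` and its argument order are assumptions made here, not notation from the text.

```python
import math


def correlation_from_sums(n, sx, sy, sxx, syy, sxy):
    """Correlation coefficient r computed from n and the five raw sums."""
    mean_x, mean_y = sx / n, sy / n
    cov = sxy / n - mean_x * mean_y                 # cov(x, y)
    sd_x = math.sqrt(sxx / n - mean_x ** 2)         # sigma_x
    sd_y = math.sqrt(syy / n - mean_y ** 2)         # sigma_y
    return cov / (sd_x * sd_y)


# Sums from question 52: n, sum x, sum y, sum x^2, sum y^2, sum xy.
r = correlation_from_sums(25, 75, 100, 250, 500, 325)
print(round(r, 2))
```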
53. The two regression equations of the variables x and y are x = 19.3 – 0.87y, y = 11.64
– 0.50x. The values of x̄ and ȳ are ________
(a) x̄ = 11.64 ; ȳ = 19.3 (b) x̄ = 0.50 ; ȳ = 0.87
(c) x̄ = 16.23 ; ȳ = 3.52 (d) x̄ = 17.3 ; ȳ = 4.50
Ans.: (c)
Explanation : The two regression lines are
x = 19.3 – 0.87y
y = 11.64 – 0.50x
∴ x + 0.87y = 19.3 ; 0.50x + y = 11.64
Since x̄ and ȳ satisfy these equations,
x̄ + 0.87ȳ = 19.3
0.50x̄ + ȳ = 11.64
Solving, x̄ = 16.23, ȳ = 3.52
54. The two regression equations of the variables x and y are x = 19.3 – 0.87y ; y = 11.64
– 0.50x. The correlation coefficient between x and y is ________
(a) 0.8 (b) 0.7 (c) – 0.66 (d) + 0.66
Ans.: (c)
Explanation : The two regression lines are
x = 19.3 – 0.87y ; y = 11.64 – 0.50x
∴ bxy = – 0.87, byx = – 0.50
|r| = √(bxy byx) = √0.435 = 0.66
Since both bxy and byx are negative,
∴ r = – 0.66
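Questions 53 and 54 use two facts: both regression lines pass through (x̄, ȳ), and r is the geometric mean of the two regression coefficients with their common sign. A minimal Python sketch under those assumptions follows; the function name `means_and_r` and the way the lines are parameterised (x = a1 + bxy·y and y = a2 + byx·x) are illustrative choices.

```python
import math


def means_and_r(a1, bxy, a2, byx):
    """Given x = a1 + bxy*y (x on y) and y = a2 + byx*x (y on x),
    return (mean_x, mean_y, r)."""
    # Both lines pass through the means; substitute the second line into the first.
    mean_x = (a1 + bxy * a2) / (1 - bxy * byx)
    mean_y = a2 + byx * mean_x
    # r = +/- sqrt(bxy * byx), taking the common sign of the two coefficients.
    r = math.copysign(math.sqrt(bxy * byx), bxy)
    return mean_x, mean_y, r


mean_x, mean_y, r = means_and_r(19.3, -0.87, 11.64, -0.50)
print(round(mean_x, 2), round(mean_y, 2), round(r, 2))
```

The product bxy·byx must lie between 0 and 1 for a valid pair of regression lines; if it exceeds 1, the roles of the two lines have been assigned the wrong way round.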
55. Two regression lines are 5y – 8x + 17 = 0 …(1) and 2y – 5x + 14 = 0 …(2). The mean values of
x and y are ________
(a) x̄ = 3 ; ȳ = 4 (b) x̄ = 8 ; ȳ = 2 (c) x̄ = 5 ; ȳ = 5 (d) x̄ = 4 ; ȳ = 3
Ans.: (d)
Explanation :
Let (x̄, ȳ) be the point of intersection of lines (1) and (2).
Therefore (x̄, ȳ) satisfies equations (1) and (2).
∴ Put x = x̄ and y = ȳ in equations (1) and (2):
8x̄ – 5ȳ = 17 and 5x̄ – 2ȳ = 14
Solving simultaneously, we get
x̄ = 4 ; ȳ = 3
56. Two regression lines are 5y – 8x + 17 = 0 and 2y – 5x + 14 = 0, and σy² = 16. The
variance of x is ________
(a) 8 (b) 2 (c) 16 (d) 4
Ans.: (d)
Explanation : To find σx.
The regression line of y on x is 5y = 8x – 17, so byx = 8/5 = 1.6.
The regression line of x on y is 5x = 2y + 14, so bxy = 2/5 = 0.4.
∴ r = √(byx bxy) = √0.64 = 0.8, and σy = √16 = 4.
Since bxy = r σx/σy,
σx = bxy σy/r = (0.4)(4)/0.8 = 2
∴ σx² = 4
57. Two regression lines are 8x – 10y + 66 = 0 ; 40x – 18y = 214. The mean values
of x and y are ________
(a) x̄ = 8, ȳ = 18 (b) x̄ = 40, ȳ = 10 (c) x̄ = 13, ȳ = 17 (d) x̄ = 9, ȳ = 11
Ans.: (c)
Explanation : The regression lines are
8x – 10y + 66 = 0 and 40x – 18y = 214
Since the mean values x̄, ȳ of x and y satisfy these equations,
8x̄ – 10ȳ = – 66 and 40x̄ – 18ȳ = 214
∴ x̄ = 13 ; ȳ = 17
58. The equations of two regression lines obtained in a correlation analysis are
4x – 5y + 33 = 0 …(1) and 20x – 9y – 107 = 0 …(2). The correlation coefficient between x and y is ________
(a) 0.7 (b) 0.6 (c) – 0.8 (d) 9
Ans.: (b)
Explanation : The line of regression of y on x is, from equation (1),
y = (4/5)x + 33/5 = 0.8x + 33/5
∴ byx = 0.8
and the line of regression of x on y is, from equation (2),
x = (9/20)y + 107/20 = 0.45y + 107/20
∴ bxy = 0.45
∴ Correlation coefficient r (x, y) = √(byx bxy) = √(0.8 × 0.45)
r = 0.6
59. The equations of two regression lines obtained in a correlation analysis are 4x – 5y +
33 = 0 ; 20x – 9y – 107 = 0, and the variance of y is 16. The variance of x is ________
(a) 9 (b) 8 (c) 3 (d) 4
Ans.: (a)
Explanation :
Since σy² = 16, σy = 4. From these lines, bxy = 0.45 and r = √(0.8 × 0.45) = 0.6.
σx = bxy σy/r = (0.45)(4)/0.6 = 3
∴ The variance of x is σx² = 9
60. If two lines of regression are 9x + y – λ = 0 and 4x + y = μ and the means of x and y are
respectively x̄ = 2, ȳ = – 3, then the values of λ and μ are
(a) λ = 15 ; μ = 5 (b) λ = 2 ; μ = – 3 (c) λ = 19 ; μ = 17 (d) λ = 12 ; μ = 15
Ans.: (a)
Explanation : Let (x̄, ȳ) be the point of intersection of the given two regression lines.
Put x = x̄ and y = ȳ.
∴ 9x̄ + ȳ = λ ⇒ λ = 9(2) – 3 = 15 and
4x̄ + ȳ = μ ⇒ μ = 4(2) – 3 = 5
61. If two lines of regression are 9x + y – 15 = 0 and 4x + y = 5 and x̄ = 2, ȳ = – 3,
then the coefficient of correlation r is ________
(a) – 0.8 (b) – 0.67 (c) 0.67 (d) 0.8
Ans.: (b)
Explanation : Let the regression line of y on x be 4x + y = 5, i.e.
y = – 4x + 5
∴ byx = regression coefficient of y on x = – 4
and let the regression line of x on y be 9x + y – 15 = 0, i.e.
x = – y/9 + 15/9
∴ bxy = – 1/9 = – 0.1111 is the coefficient of regression of x on y.
|r| = √(bxy byx) = √(4 × 0.1111) = 0.6666
Since both bxy and byx are negative,
∴ r (x, y) = – 0.67
62. If the regression coefficient of y on x is – 0.50 and of x on y is – 0.87, then the coefficient of
correlation is ________
(a) 0.8 (b) – 0.66 (c) – 0.3 (d) 0.9
Ans.: (b)
Explanation :
byx = – 0.50 and bxy = – 0.87
|r| = √(bxy byx) = √[(– 0.50) × (– 0.87)] = 0.66
Since both byx and bxy are negative,
∴ r = – 0.66
63. If the two regression coefficients are 0.8 and 0.45, then the coefficient of correlation r is ____
(a) 0.3 (b) 0.8 (c) 0.9 (d) 0.6
Ans.: (d)
Explanation : byx = 0.8 and bxy = 0.45
r = √(bxy byx) = √(0.8 × 0.45) = 0.6

Test Your Knowledge

1. The standard deviation () of a distribution 2, 3, 5 is given by


(a) 2 (b) 3.88 (c) 1.25 (d) 4.12
2. For a given distribution if value of  fx2 = 122 , N =  f = 5, x– = 4 then value of
standard deviation () is given by
(a) 2 (b) 3.88 (c) 5.13 (d) 2.90
3. From the following data
Team S.D () A.M ‒ x
A 2 5
B 2.5 4
The more consistent team is
(a) Team A (b) Team B
(c) Team equally consistent (d) can’t say
4. If arithmetic mean of four numbers is 15, one item 19 is replaced by 23, then new
arithmetic mean is,
(a) 16 (b) 17 (c) 18 (d) 20
5. If the first four moments of a distribution about the working mean A = 2 are 1, 5, –20, 10,
then the third moment $\mu_3$ about the mean of the distribution is given by
(a) – 10 (b) 30 (c) – 33 (d) – 20
6. The first four moments about the working mean 44.5 of a distribution are –0.4, 2.99,
–0.08 and 27.63; then the value of the moment about the mean $\mu_2$ is ________
(a) 2.83 (b) 3.99 (c) 2.2 (d) 5.9
7. The first three moments about the value 2 of a distribution are 1, 16 and –40; then the
value of the standard deviation ($\sigma$) is _____
(a) 3.87 (b) 4.03 (c) 13.31 (d) – 4.09
8. For the data of a distribution: n = 100, $\sum fd = 50$, $\sum fd^2 = 1970$, $\sum fd^3 = 2948$,
$\sum fd^4 = 86752$, where d = X – 48, the value of the coefficient of skewness $\beta_1$ is ________
(a) 0.23 (b) 0.0000044 (c) – 0.20 (d) 0.35
9. The first four moments about the working mean 30.2 of a distribution are 0.255, 6.222,
30.211 and 400.25; the central moment $\mu_4$ is ________
(a) 371.85 (b) 341.57 (c) 270.71 (d) 291.53
10. The first four moments about the mean of a distribution are 0, 2.5, 0.7 and
18.75; then the coefficient of kurtosis $\beta_2$ is ________
(a) 2.87 (b) 3 (c) 0.37 (d) 3.87

11. The first two moments about the working mean 30.2 of a distribution are 0.255 and
6.222; then the value of the S.D. ($\sigma$) is ________
(a) 3 (b) 3.22 (c) 2.48 (d) 5.7
12. The value of the central moment $\mu_2$ of the following distribution is
x    1    2    3    4    5
f    6    15   23   42   62
(a) 2.75 (b) 3.01 (c) – 1.72 (d) 1.34
13. Given n = 25, $\sum x = 75$, $\sum y = 100$, $\sum x^2 = 250$, $\sum y^2 = 500$, $\sum xy = 325$, the value of the
coefficient of correlation r is ________
(a) – 0.7 (b) 0.8 (c) 0.5 (d) 0.9
14. Two regression lines are 5y – 8x + 17 = 0 and 2y – 5x + 14 = 0; also $\sigma_y^2 = 16$. Then the
variance of x is ________
(a) 8 (b) 2 (c) 16 (d) 4
15. If the two lines of regression are $9x + y - \lambda = 0$ and $4x + y = \mu$, and the means of x and y are
respectively $\bar{x} = 2$, $\bar{y} = -3$, then the values of $\lambda$ and $\mu$ are
(a) $\lambda = 15$; $\mu = 5$ (b) $\lambda = 2$; $\mu = -3$ (c) $\lambda = 19$; $\mu = 17$ (d) $\lambda = 12$; $\mu = 15$
16. The probability of drawing an Ace from a well-shuffled pack of cards is ________
(a) $\frac{1}{52}$ (b) $\frac{3}{13}$ (c) $\frac{3}{52}$ (d) $\frac{1}{13}$
17. Two dice are thrown. The probability of getting a double is
(a) $\frac{1}{6}$ (b) $\frac{1}{36}$ (c) $\frac{4}{36}$ (d) $\frac{1}{3}$
18. If the probability of success p = 0.7, then the probability of failure q = ________
(a) 0.7 (b) 1.7 (c) – 0.7 (d) 0.3
19. Two dice are thrown at a time. What is the probability of getting 10 points?
(a) $\frac{2}{3}$ (b) $\frac{1}{4}$ (c) $\frac{1}{6}$ (d) $\frac{1}{12}$
20. 20% of the bolts produced by a machine are defective. The mean and standard deviation
of the number of defective bolts in a total of 900 bolts are respectively
(a) 180 and 12 (b) 12 and 100 (c) 12 and 180 (d) 9 and 0.8
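21. Slope of regression line of x on y is
(a) $r\,\frac{\sigma_x}{\sigma_y}$ (b) $r(x, y)$ (c) $\frac{\sigma_x}{\sigma_y}$ (d) $\frac{\sigma_y}{\sigma_x}$
22. In the regression line of y on x, $b_{yx}$ is given by
(a) $\mathrm{cov}(x, y)$ (b) $r(x, y)$ (c) $\frac{\mathrm{cov}(x, y)}{\sigma_x^2}$ (d) $\frac{\mathrm{cov}(x, y)}{\sigma_y^2}$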
23. If the two regression coefficients are 0.16 and 4, then the correlation coefficient is
(a) 0.08 (b) –0.8 (c) 0.8 (d) 0.64
24. If the covariance between x and y is 10 and the variances of x and y are 16 and 9
respectively, then the coefficient of correlation r(x, y) is
(a) 0.833 (b) 0.633 (c) 0.527 (d) 0.745
25. Given the following data: r = 0.5, $\sum xy = 350$, $\sigma_x = 1$, $\sigma_y = 4$, $\bar{x} = 68$, $\bar{y} = 62.125$. The
value of n (number of observations) is
(a) 5 (b) 7 (c) 8 (d) 10

Answer Key

1. (c) 2. (d) 3. (a) 4. (a) 5. (c) 6. (a) 7. (a) 8. (b) 9. (a) 10. (b)

11. (c) 12. (d) 13. (c) 14. (d) 15. (a) 16. (d) 17. (a) 18. (d) 19. (d) 20. (a)

21. (a) 22. (c) 23. (c) 24. (a) 25. (a)
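
Several of the moment and correlation questions above rest on two standard computations: converting moments about a working mean into central moments, and computing r from summary totals. The following Python sketch shows both. The function names `central_moments` and `pearson_r_from_sums` are hypothetical helpers introduced only for illustration; the formulas are the standard relations between raw and central moments and the usual product-moment formula.

```python
import math


def central_moments(m1, m2, m3, m4):
    """Convert moments about a working mean (m1..m4) to central moments.

    Returns (mu2, mu3, mu4) using the standard relations.
    """
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4


def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Correlation coefficient r from the totals n, Σx, Σy, Σx², Σy², Σxy."""
    mean_x, mean_y = sum_x / n, sum_y / n
    cov = sum_xy / n - mean_x * mean_y      # cov(x, y)
    var_x = sum_x2 / n - mean_x ** 2        # variance of x
    var_y = sum_y2 / n - mean_y ** 2        # variance of y
    return cov / math.sqrt(var_x * var_y)


# Data of Question 5: moments about A = 2 are 1, 5, -20, 10.
print(central_moments(1, 5, -20, 10))                                # (4, -33, 117)
# Data of Question 9: moments about 30.2.
print(round(central_moments(0.255, 6.222, 30.211, 400.25)[2], 2))    # 371.85
# Data of Question 13.
print(pearson_r_from_sums(25, 75, 100, 250, 500, 325))               # 0.5
```

The standard deviation asked for in Questions 7 and 11 is the square root of the first value returned by `central_moments`.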


