Statistics
1.0 Introduction
Statistics is the branch of mathematics that deals with the collection, presentation, analysis and interpretation of numerical data.
In the singular, statistics refers to the subject itself; in the plural, statistics means data.
Class interval: Each group into which the raw data is condensed is called a class interval.
Class limits: Each class is bounded by two figures, called class limits. The figure on the left of a class is its lower limit and the figure on the right is its upper limit.
Exclusive form (or continuous interval form): A frequency distribution in which the upper limit of each class is excluded and lower limit is included, is called an exclusive form.
Example:
Suppose the marks obtained by some students in an examination are given. We may consider the classes 0−10, 10−20, etc. In class 0−10, we include 0 and exclude 10. In class 10−20, we include 10 and exclude 20.
Inclusive form (or discontinuous interval form):
A frequency distribution in which both the upper limit and the lower limit of each class are included is called an inclusive form. Thus, we have classes of the form 1−10, 11−20, 21−30, etc.
In 1−10, both 1 and 10 are included.
2.0 Important terms related to grouped data:
Class boundaries or true upper and true lower limits:
(i) In the exclusive form, the upper and lower limits of a class are respectively known as the true upper limit and true lower limit.
(ii) In the inclusive form, the number midway between the upper limit of a class and the lower limit of the subsequent class gives the true upper limit of the class and the true lower limit of the subsequent class. Thus, for the inclusive classes 1−10, 11−20, 21−30, we have: the true upper limit of class 1−10 is $\frac{10+11}{2} = 10.5$ and the true lower limit of class 11−20 is 10.5. Similarly, the true upper limit of class 11−20 is $\frac{20+21}{2} = 20.5$, and the true lower limit of class 21−30 is 20.5.
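The conversion from inclusive limits to true limits can be sketched in Python. This is a minimal illustration, not part of the original notes: the function name `true_limits` is hypothetical, and it assumes the gap between consecutive classes is the same throughout the table.

```python
def true_limits(classes):
    """Convert inclusive (lower, upper) classes into true class limits."""
    # Gap between the upper limit of one class and the lower limit of
    # the next (assumed constant across the table, e.g. 11 - 10 = 1).
    gap = classes[1][0] - classes[0][1]
    half = gap / 2
    # Push each lower limit down and each upper limit up by half the gap.
    return [(lower - half, upper + half) for lower, upper in classes]


inclusive = [(1, 10), (11, 20), (21, 30)]
boundaries = true_limits(inclusive)
print(boundaries)  # [(0.5, 10.5), (10.5, 20.5), (20.5, 30.5)]
```

Each true upper limit coincides with the true lower limit of the next class, which is what makes the converted table continuous.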
Class size
The difference between the true upper limit and the true lower limit of a class is called its class size.
Class mark of a class
Class mark $= \frac{\text{True upper limit} + \text{True lower limit}}{2}$
- The difference between any two successive class marks gives the class size.
- Average is the statistic which describes the center of a set of data, a set of numbers which are measurements or counts.
The most commonly used averages are the mean (arithmetic average), mode (most frequent number), median (middle number when numbers are listed smallest to largest).
Numerical Ability 1
The class marks of a frequency distribution are 7,13,19,25,31,37,43. Find the class-size and all the class-intervals.
Solution:
Class size = Difference between two successive class-marks =(13−7)=6.
Let the lower limit of the first-class interval be a. Then, its upper limit =(a+6).
∴ $\frac{a + (a+6)}{2} = 7 \Rightarrow 2a = 8 \Rightarrow a = 4$
So, the first class-interval is 4-10.
Let the lower limit of last class-interval be b.
Then, its upper-class limit =(b+6).
∴ $\frac{b + (b+6)}{2} = 43 \Rightarrow 2b = 80 \Rightarrow b = 40$.
So, the last class-interval is 40-46.
Hence, the required class-intervals are
4−10, 10−16, 16−22, 22−28, 28−34, 34−40 and 40−46.
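The same reasoning can be checked with a short Python sketch. The function name `intervals_from_class_marks` is an assumption made for illustration, and the code assumes all classes have the same size and are in exclusive form.

```python
def intervals_from_class_marks(marks):
    """Recover the class size and class intervals from equally spaced class marks."""
    # Class size = difference between two successive class marks.
    size = marks[1] - marks[0]
    half = size / 2
    # Each class extends half a class size on either side of its mark.
    return size, [(m - half, m + half) for m in marks]


size, intervals = intervals_from_class_marks([7, 13, 19, 25, 31, 37, 43])
print(size)       # 6
print(intervals)  # first interval (4.0, 10.0), last interval (40.0, 46.0)
```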
An average tends to lie centrally with the values of the variable arranged in ascending order of magnitude. So, we call an average a measure of central tendency of the data.
Three measures of central tendency are useful for analysing the data, namely
(a) Mean
(b) Median
(c) Mode
- We know that the mean of observations is the sum of the values of all the observations divided by the total number of observations, i.e., if $x_1, x_2, x_3, \ldots, x_n$ are $n$ observations, then mean $\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}$ or $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$, where $\sum_{i=1}^{n} x_i$ denotes the sum $x_1 + x_2 + x_3 + \cdots + x_n$.
3.0 Arithmetic mean
The arithmetic mean of grouped data may also be calculated by any one of the following methods:
- Direct method
- Assumed-mean method
Direct method
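If $x_1, x_2, x_3, \ldots, x_n$ are observations with respective frequencies $f_1, f_2, f_3, \ldots, f_n$, then the mean $\bar{x}$ is defined by
$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + f_3 x_3 + \cdots + f_n x_n}{f_1 + f_2 + f_3 + \cdots + f_n}$ or $\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}$,
where $\sum_{i=1}^{n} f_i = f_1 + f_2 + f_3 + \cdots + f_n = N$.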
The following steps should be followed in finding the arithmetic mean of grouped data by direct method.
Step-1: Find the class mark $x_i$ of each class using
$x_i = \frac{\text{Lower limit} + \text{Upper limit}}{2}$
Step-2: Calculate $f_i x_i$ for each $i$
Step-3: Use the formula: mean $\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}$
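The three steps of the direct method can be sketched in Python as follows. The function name `mean_direct` and the small frequency table are made up purely for illustration; they do not come from any example in these notes.

```python
def mean_direct(classes, freqs):
    """Mean of grouped data by the direct method."""
    # Step 1: class mark of each class = (lower limit + upper limit) / 2.
    marks = [(lower + upper) / 2 for lower, upper in classes]
    # Step 2: f_i * x_i for each class, summed.
    total_fx = sum(f * x for f, x in zip(freqs, marks))
    # Step 3: divide by the total frequency N.
    return total_fx / sum(freqs)


# Hypothetical table: classes 0-10, 10-20, 20-30 with frequencies 2, 3, 5.
classes = [(0, 10), (10, 20), (20, 30)]
freqs = [2, 3, 5]
direct_mean = mean_direct(classes, freqs)
print(direct_mean)  # 18.0, since (2*5 + 3*15 + 5*25) / 10 = 180 / 10
```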
- The sum of the first $n$ natural numbers is $\frac{n(n+1)}{2}$.
Numerical Ability 2
Find the mean of the following data:
Solution:
We may prepare the table as given below:
∴ Mean, $\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{856}{40} = 21.4$
Numerical Ability 3
The following distribution shows the daily pocket allowance of children of a locality. The mean pocket allowance is Rs. 18. Find the missing frequency f.
Solution:
We may prepare the table as given below:
∴ Mean, $\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{752 + 20f}{44 + f}$
Given, mean = 18 ∴ $18 = \frac{752 + 20f}{44 + f}$
⇒ $792 + 18f = 752 + 20f$ ⇒ $2f = 40$
⇒ $f = 20$
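The algebra above amounts to solving one linear equation, which can be sketched in Python. The helper name `missing_frequency` is hypothetical; the inputs 752, 44, 20 and 18 are the totals, class mark and mean that appear in the solution above. The sketch assumes the class mark of the missing class differs from the mean, otherwise the equation has no unique solution.

```python
def missing_frequency(sum_fx_known, sum_f_known, class_mark, mean):
    """Solve mean * (sum_f_known + f) = sum_fx_known + class_mark * f for f."""
    # Rearranged: f * (class_mark - mean) = mean * sum_f_known - sum_fx_known.
    return (mean * sum_f_known - sum_fx_known) / (class_mark - mean)


f = missing_frequency(sum_fx_known=752, sum_f_known=44, class_mark=20, mean=18)
print(f)  # 20.0

# Check: substituting f back should reproduce the given mean of 18.
print((752 + 20 * f) / (44 + f))  # 18.0
```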
- The sum of the squares of the first $n$ natural numbers is $\frac{n(n+1)(2n+1)}{6}$.
Numerical Ability 4
Find the missing frequencies f1 and f2 in the table given below, it is being given that the mean of the given frequency distribution is 50 .
Solution:
We may prepare the table as given below:
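∴ Mean, $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{3480 + 30f_1 + 70f_2}{68 + f_1 + f_2}$
Given, mean = 50
∴ $50 = \frac{3480 + 30f_1 + 70f_2}{68 + f_1 + f_2}$
⇒ $3400 + 50f_1 + 50f_2 = 3480 + 30f_1 + 70f_2$
⇒ $20f_1 - 20f_2 = 80$
⇒ $f_1 - f_2 = 4$ ... (i)
And $\sum f_i = 68 + f_1 + f_2$
∴ $120 = 68 + f_1 + f_2$ [∵ $\sum f_i = 120$]
⇒ $f_1 + f_2 = 52$ ... (ii)
Adding (i) and (ii), we get $2f_1 = 56$
⇒ $f_1 = 28$
⇒ $f_2 = 24$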
Hence, the missing frequencies f1 and f2 are 28 and 24 respectively.
Numerical Ability 5
Find the mean of the following distribution.
Solution:
∴ Mean, $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{11220}{100} = 112.2$
Numerical Ability 6
Find the mean marks from the following data:
Solution:
We may prepare the table as given below:
∴ Mean, $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{4105}{85} = 48.29$ (approx.)
If a less than type or more than type frequency distribution table is given, first convert it to the usual (class interval and frequency) form to find the mean.
Numerical Ability 7
Find the mean marks of students from the adjoining frequency distribution table.
Solution:
We may prepare the table as given below:
∴ Mean, $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{4090}{80} = 51.125 = 51.1$ (approx.)
Assumed mean method
In this case, to calculate the mean, we follow the following steps:
Step-1: Find the class mark $x_i$ of each class using $x_i = \frac{\text{Lower limit} + \text{Upper limit}}{2}$
Step-2: Choose a suitable value of $x_i$ in the middle as the assumed mean and denote it by 'a'.
Step-3: Find $d_i = x_i - a$ for each $i$
Step-4: Find $f_i \times d_i$ for each $i$
Step-5: Find $N = \sum f_i$
Step-6: Calculate the mean $\bar{x}$ by using the formula $\bar{x} = a + \frac{\sum f_i d_i}{N}$
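A minimal Python sketch of the six steps is given below. The function name `mean_assumed` and the small table are hypothetical and serve only to show that the result does not depend on the choice of 'a'.

```python
def mean_assumed(classes, freqs, a):
    """Mean of grouped data by the assumed-mean method."""
    # Step 1: class marks.
    marks = [(lower + upper) / 2 for lower, upper in classes]
    # Steps 3 and 4: deviations d_i = x_i - a, then the sum of f_i * d_i.
    sum_fd = sum(f * (x - a) for f, x in zip(freqs, marks))
    # Step 5: total frequency N.
    n = sum(freqs)
    # Step 6: mean = a + (sum of f_i * d_i) / N.
    return a + sum_fd / n


# Hypothetical table: classes 0-10, 10-20, 20-30 with frequencies 2, 3, 5.
classes = [(0, 10), (10, 20), (20, 30)]
freqs = [2, 3, 5]
assumed_mean_result = mean_assumed(classes, freqs, a=15)
print(assumed_mean_result)  # 18.0, from 15 + 30 / 10

# A different assumed mean gives the same answer.
print(mean_assumed(classes, freqs, a=25))  # 18.0, from 25 + (-70) / 10
```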
Numerical Ability 8
The following table gives the marks scored by 100 students in a class test:
Find the mean marks scored by a student in class test.
Solution:
We may prepare the table with assumed mean, a=35 as given below:
∴ Mean, $\bar{x} = a + \frac{\sum f_i d_i}{N}$
$= 35 + \frac{(-700)}{100} = 35 - 7 = 28$
Numerical Ability 9
Thirty women were examined in a hospital by a doctor and the number of heart beats per minute were recorded and summarised as follows. Find the mean heart beats per minute for these women, by using assumed-mean method.
Solution:
We may prepare the table with assumed mean, a=75.5 as given below:
∴ Mean, $\bar{x} = a + \frac{\sum f_i d_i}{N} = 75.5 + \frac{12}{30} = 75.5 + \frac{2}{5} = 75.9$
Numerical Ability 10
Find the arithmetic mean of the following frequency distribution.
Solution:
The given series is in inclusive form. We will prepare the table in exclusive form with assumed mean a =42 as given below:
∴ Mean, $\bar{x} = a + \frac{\sum f_i d_i}{N} = 42 + \frac{(-395)}{70} = \frac{2940 - 395}{70} = \frac{2545}{70} = 36.36$ (approx.)
- It is not necessary to convert the data into continuous interval form for finding the mean.
Step-deviation method or short-cut method
Sometimes, the values of x and f are so large that the calculation of mean by assumed mean method becomes quite inconvenient. In this case, we follow the following steps:
Step-1: Find the class mark $x_i$ of each class by using $x_i = \frac{\text{Lower limit} + \text{Upper limit}}{2}$
Step-2: Choose a suitable value of $x_i$ in the middle as the assumed mean and denote it by 'a'.
Step-3: Find $h$ = (upper limit − lower limit) for each class.
Step-4: Find $u_i = \frac{x_i - a}{h}$ for each class.
Step-5: Find $f_i u_i$ for each $i$.
Step-6: Calculate the mean by using the formula $\bar{x} = a + \left\{\frac{\sum f_i u_i}{N}\right\} \times h$, where $N = \sum f_i$.
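The step-deviation steps can be sketched in Python along the same lines. The function name `mean_step_deviation` and the table are hypothetical, and the sketch assumes all classes share the same size h.

```python
def mean_step_deviation(classes, freqs, a, h):
    """Mean of grouped data by the step-deviation method."""
    # Step 1: class marks.
    marks = [(lower + upper) / 2 for lower, upper in classes]
    # Steps 4 and 5: u_i = (x_i - a) / h, then the sum of f_i * u_i.
    sum_fu = sum(f * (x - a) / h for f, x in zip(freqs, marks))
    n = sum(freqs)
    # Step 6: mean = a + (sum of f_i * u_i / N) * h.
    return a + (sum_fu / n) * h


# Hypothetical table: classes 0-10, 10-20, 20-30 with frequencies 2, 3, 5.
classes = [(0, 10), (10, 20), (20, 30)]
freqs = [2, 3, 5]
step_mean = mean_step_deviation(classes, freqs, a=15, h=10)
print(round(step_mean, 2))  # 18.0, from 15 + (3 / 10) * 10
```

Dividing the deviations by h only rescales them, so the result agrees with the direct and assumed-mean methods on the same table.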
Numerical Ability 11
Find the mean of the following distribution by step-deviation method:
Solution:
We may prepare the table with assumed mean a=120 and h=20 as given below:
∴ Mean, $\bar{x} = a + \frac{\sum f_i u_i}{N} \times h = 120 + \frac{(-39)}{100} \times 20 = 120 - \frac{39}{5} = \frac{561}{5} = 112.2$
- If the class sizes are unequal, $h$ may be taken as a common factor (for example, the HCF) of the deviations $d_i = x_i - a$.
Numerical Ability 12
Calculate the mean for the following frequency distribution (By step deviation method).
Solution:
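Mean $\bar{x} = a + \frac{\sum f_i u_i}{\sum f_i} \times h = 200 + \left(\frac{-6}{150}\right) \times 80$
$= 200 - \left(\frac{2}{5} \times 8\right) = 200 - \frac{16}{5} = \frac{1000 - 16}{5} = \frac{984}{5} = 196.8$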
4.0 Median
Median: It is a measure of central tendency which gives the value of the middle most observation in the data. In a grouped data, it is not possible to find the middle observation by looking at the cumulative frequencies as the middle observation will be some value in a class interval. It is, therefore, necessary to find the value inside a class that divides the whole distribution into two halves.
Median Class: The class whose cumulative frequency is just greater than $\frac{N}{2}$ (that is, the first class whose cumulative frequency exceeds $\frac{N}{2}$) is called the median class.
To calculate the median of a grouped data, we follow the following steps.
Step - 1: Prepare the cumulative frequency table corresponding to the given frequency distribution and obtain N=Σfi
Step-2: Find $\frac{N}{2}$
Step-3: Look at the cumulative frequency just greater than $\frac{N}{2}$ and find the corresponding class (median class).
Step-4: Use the formula Median $= \ell + \left\{\frac{\frac{N}{2} - cf}{f}\right\} \times h$
Where, ℓ= Lower limit of median class.
f= Frequency of the median class.
cf= Cumulative frequency of the class preceding the median class.
h= Size of the median class.
N=∑fi.
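The four steps can be sketched in Python as below. The function name `grouped_median` is hypothetical, and the frequency table is an illustrative one chosen so that the median class has ℓ = 20, cf = 28, f = 36 and h = 10; it is not taken from any table in these notes. The sketch assumes the classes are already in continuous (exclusive) form.

```python
from itertools import accumulate


def grouped_median(classes, freqs):
    """Median of grouped data in continuous (exclusive) form."""
    # Step 1: cumulative frequencies and N.
    n = sum(freqs)
    cum = list(accumulate(freqs))
    # Steps 2 and 3: first class whose cumulative frequency exceeds N/2.
    k = next(i for i, c in enumerate(cum) if c > n / 2)
    lower, upper = classes[k]
    # cf = cumulative frequency of the class preceding the median class.
    cf = cum[k - 1] if k > 0 else 0
    # Step 4: median = l + ((N/2 - cf) / f) * h.
    return lower + ((n / 2 - cf) / freqs[k]) * (upper - lower)


# Illustrative table (assumed): N = 100.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [8, 20, 36, 24, 12]
median = grouped_median(classes, freqs)
print(round(median, 1))  # 26.1, from 20 + (22 / 36) * 10
```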
- Data must be in continuous interval form to find median and mode of grouped data.
Numerical Ability 13
Find the median of the following frequency distribution:
Solution:
At first, we prepare a cumulative frequency distribution table as given below:
Here, N = 100
∴ $\frac{N}{2} = 50$
The cumulative frequency just greater than 50 is 64 and the corresponding class is 20−30.
So, the median class is 20−30.
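∴ $\ell = 20$, $N = 100$, $cf = 28$, $f = 36$ and $h = 10$
Therefore, median $= \ell + \left\{\frac{\frac{N}{2} - cf}{f}\right\} \times h = 20 + \left(\frac{50 - 28}{36}\right) \times 10 = 20 + \frac{22}{36} \times 10 = 20 + \frac{55}{9} = \frac{180 + 55}{9} = \frac{235}{9} \approx 26.1$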
- The median is the middle of a distribution: half the scores are above the median and half are below the median.
Numerical Ability 14
A health insurance agent found the following data for distribution of ages of 100 policy holders. Calculate the median age, if policies are given only to persons having age 18 years onwards but less than 60 years.
Solution:
From the given table we can find the frequency and cumulative frequencies as given below :
Here, N=100
∴ $\frac{N}{2} = 50$
The cumulative frequency just greater than 50 is 78 and the corresponding class is 35−40.
So, the median class is 35−40.
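∴ $\ell = 35$, $N = 100$, $cf = 45$, $f = 33$ and $h = 5$
Therefore, median $= \ell + \left\{\frac{\frac{N}{2} - cf}{f}\right\} \times h$
$= 35 + \left(\frac{50 - 45}{33}\right) \times 5$
$= 35 + \frac{5}{33} \times 5$
$= \frac{1155 + 25}{33} = \frac{1180}{33} \approx 35.76$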
Hence, the median age is 35.76 years.
Numerical Ability 15
From the following frequency distribution, calculate the median.
Solution:
Here $\frac{N}{2} = \frac{200}{2} = 100$
The median lies in the class 25−30.
ℓ=25, c.f. =77,f=42 and h=5
Calculation of Median
Applying the formula, Median $= \ell + \frac{\frac{N}{2} - c.f.}{f} \times h$
We get, Median $= 25 + \frac{100 - 77}{42} \times 5$
$= 25 + \frac{115}{42} = 25 + 2.74 = 27.74$
So, about half the students have scored marks less than 27.74 and the other half have scored marks more than 27.74.
Numerical Ability 16
Calculate the missing frequency 'a' from the following distribution, it is being given that the median of the distribution is 24 .
Solution:
At first we prepare a cumulative frequency distribution table as given below :
Since the median is 24, therefore, the median class will be 20−30.
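Hence, $\ell = 20$, $N = 55 + a$, $cf = 30$, $f = a$ and $h = 10$
Therefore, median $= \ell + \left\{\frac{\frac{N}{2} - cf}{f}\right\} \times h$
⇒ $24 = 20 + \left(\frac{\frac{55 + a}{2} - 30}{a}\right) \times 10$
⇒ $24 = 20 + \frac{(a - 5)}{2a} \times 10$
⇒ $4 = \frac{(a - 5)}{a} \times 5$
⇒ $4a = 5a - 25$
⇒ $a = 25$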
Hence, the value of missing frequency a is 25 .
Numerical Ability 17
The median of the following data is 525 . Find the values of x and y, if the total frequency is 100.
Solution:
At first we prepare a cumulative frequency distribution table as given below :
We have N=100
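∴ $76 + x + y = 100$
⇒ $x + y = 24$ ... (i)
Since the median is 525, the median class is 500−600.
∴ $\ell = 500$, $N = 100$, $cf = 36 + x$, $f = 20$ and $h = 100$
Therefore, median $= \ell + \left\{\frac{\frac{N}{2} - cf}{f}\right\} \times h$
⇒ $525 = 500 + \left(\frac{50 - 36 - x}{20}\right) \times 100$
⇒ $25 = (14 - x) \times 5$
⇒ $5 = 14 - x$
⇒ $x = 9$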
Also, putting x=9 in (i),
we get 9+y=24
⇒y=15
Hence, the values of x and y are 9 and 15 respectively.
5.0 Mode
Mode : Mode is that value among the observations which occurs most often i.e. the value of the observation having the maximum frequency.
In a grouped frequency distribution, it is not possible to determine the mode by looking at the frequencies.
Modal Class : The class of a frequency distribution having maximum frequency is called modal class of a frequency distribution.
The mode is a value inside the modal class and is calculated by using the formula.
Mode $= \ell + \left\{\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right\} \times h$
Where ℓ= Lower limit of the modal class.
h = Size of class interval
f1= Frequency of modal class
f0= Frequency of the class preceding the modal class
f2= Frequency of the class succeeding the modal class.
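The mode formula can be sketched in Python as follows. The function name `grouped_mode` is hypothetical. The sample call uses only the modal class 60−80 and its two neighbours, with the frequencies 52, 61 and 38 quoted in Numerical Ability 18; treating a missing neighbouring class as frequency 0 is an assumption of this sketch, and when several classes tie for the maximum frequency the first one is used.

```python
def grouped_mode(classes, freqs):
    """Mode of grouped data in continuous (exclusive) form."""
    # Modal class = class with the maximum frequency (first one if tied).
    k = max(range(len(freqs)), key=freqs.__getitem__)
    lower, upper = classes[k]
    f1 = freqs[k]
    # Neighbouring frequencies; a missing neighbour is treated as 0 (assumption).
    f0 = freqs[k - 1] if k > 0 else 0
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0
    # Mode = l + ((f1 - f0) / (2*f1 - f0 - f2)) * h.
    return lower + ((f1 - f0) / (2 * f1 - f0 - f2)) * (upper - lower)


# Modal class and its neighbours from Numerical Ability 18.
classes = [(40, 60), (60, 80), (80, 100)]
freqs = [52, 61, 38]
mode = grouped_mode(classes, freqs)
print(mode)  # 65.625, from 60 + (9 / 32) * 20
```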
- A disadvantage of the mode is that many distributions have more than one mode. Such distributions are called "multimodal". The mode is therefore not recommended as the only measure of central tendency.
Numerical Ability 18
The following data gives the information on the observed lifetimes (in hours) of 225 electrical components:
Determine the modal lifetime of the components.
Solution:
Here the class 60-80 has maximum frequency, so it is the modal class.
∴ℓ=60, h=20,f1=61,f0=52 and f2=38
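Therefore, mode $= \ell + \left\{\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right\} \times h$
$= 60 + \left(\frac{61 - 52}{2 \times 61 - 52 - 38}\right) \times 20$
$= 60 + \frac{9}{32} \times 20$
$= 60 + 5.625$
$= 65.625$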
Hence, the modal lifetime of the components is 65.625 hours.
Numerical Ability 19
Given below is the frequency distribution of the heights of players in a school.
Find the modal class.
Solution:
The given series is in inclusive form. We prepare the table in exclusive form, as given below :
Here, the class 165.5-168.5 has maximum frequency, so it is the modal class.
Numerical Ability 20
The mode of the following series is 36. Find the missing frequency f in it.
Solution:
Since the mode is 36 , so the modal class will be 30−40
∴ℓ=30, h=10,f1=16,f0=f and f2=12
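Therefore, mode $= \ell + \left\{\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right\} \times h$
⇒ $36 = 30 + \left(\frac{16 - f}{2 \times 16 - f - 12}\right) \times 10$
⇒ $6 = \frac{(16 - f)}{(20 - f)} \times 10$
⇒ $120 - 6f = 160 - 10f$
⇒ $4f = 40$
⇒ $f = 10$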
Hence, the value of the missing frequency f is 10 .
Numerical Ability 21
The mean and mode of a frequency distribution are 28 and 19 respectively. Then find the median.
Solution:
Mean =28 and Mode =19
Median = ?
Using the empirical relation Mode = 3 Median − 2 Mean:
19 = 3 Median − 2(28)
3 Median = 19 + 56 = 75
Median $= \frac{75}{3}$
Median = 25
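The relation used in this solution, Mode = 3 Median − 2 Mean, can be rearranged to give the median directly. A minimal Python sketch follows; the function name `median_from_mean_mode` is hypothetical, and the relation itself is an empirical approximation rather than an exact identity.

```python
def median_from_mean_mode(mean, mode):
    """Median from the empirical relation Mode = 3 * Median - 2 * Mean."""
    # Rearranged: Median = (Mode + 2 * Mean) / 3.
    return (mode + 2 * mean) / 3


estimated_median = median_from_mean_mode(mean=28, mode=19)
print(estimated_median)  # 25.0
```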
6.0 Graphical representation of cumulative frequency distribution
Cumulative frequency polygon curve (Ogive)
Cumulative frequency is of two types and corresponding to these, the ogive is also of two types.
• Less than ogive
• More than ogive
Less than ogive: To construct a less than cumulative frequency polygon and ogive, we follow these steps:
Step-1: Mark the upper class limits along the x-axis and the corresponding cumulative frequencies along the y-axis, and plot the resulting points.
Step-2: Join these points successively by line segments. We get a polygon, called the cumulative frequency polygon.
Step-3: Join these points successively by a smooth curve. We get a curve, called the cumulative frequency curve or ogive.
More than ogive: To construct a more than cumulative frequency polygon and ogive, we follow these steps:
Step-1: Mark the lower class limits along the x-axis and the corresponding cumulative frequencies along the y-axis, and plot the resulting points.
Step-2: Join these points successively by line segments. We get a polygon, called the cumulative frequency polygon.
Step-3: Join these points successively by a smooth curve. We get a curve, called the cumulative frequency curve or ogive.
Application of an ogive
Ogive can be used to find the median of a frequency distribution. To find the median, we follow these steps.
Method-I
Step-1: Draw any one of the two types of cumulative frequency curves on graph paper.
Step-2: Compute $\frac{N}{2}$ (where $N = \sum f_i$) and mark the corresponding point on the y-axis.
Step-3: Draw a line parallel to x-axis from the point marked in step 2, cutting the cumulative frequency curve at a point P.
Step-4: Draw perpendicular PM from P on the x-axis. The x -coordinate of point M gives the median.
Method-II
Step-1: Draw less than type and more than type cumulative frequency curves on the graph paper.
Step-2: Mark the point of intersection (P) of the two curves drawn in Step-1.
Step-3: Draw perpendicular PM from P on the x-axis. The x - coordinate of point M gives the median.
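Reading the median off an ogive can be imitated numerically. The sketch below is an approximation under a stated assumption: it joins the plotted points by straight line segments (the cumulative frequency polygon) instead of a freehand curve, so its answer can differ slightly from a value read off a hand-drawn graph. The function name `x_at_height` is hypothetical; the points are the ones listed in Numerical Ability 24.

```python
def x_at_height(points, y):
    """x-coordinate where a cumulative frequency polygon reaches height y."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        low, high = sorted((y0, y1))
        # Find the segment whose heights bracket y, then interpolate linearly.
        if low <= y <= high and y0 != y1:
            return x0 + (y - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("y is outside the range of the curve")


less_than = [(10, 2), (15, 14), (20, 16), (25, 20), (30, 23), (35, 27), (40, 30)]
more_than = [(5, 30), (10, 28), (15, 16), (20, 14), (25, 10), (30, 7), (35, 3)]
n = 30

# Method I: height N/2 on either curve.
median_less = x_at_height(less_than, n / 2)
median_more = x_at_height(more_than, n / 2)
print(median_less, median_more)  # 17.5 17.5
```

Both curves reach the height N/2 = 15 at x = 17.5, which is why in Method II the two ogives intersect at the point whose x-coordinate is the median.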
Numerical Ability 22
The following distribution gives the daily income of 50 workers of a factory.
Convert the distribution above to a less than type cumulative frequency distribution and draw its ogive.
Solution:
From the given table, we prepare a less than type cumulative frequency distribution table, as given below:
Now, plot the points (120,12),(140,26),(160,34),(180,40) and (200,50).
Join these points by a freehand curve to get an ogive of 'less than' type.
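The less than type cumulative frequencies are running totals of the class frequencies, which can be sketched in Python. The class frequencies below are not quoted in the text; they are inferred here as the successive differences of the cumulative values in the plotted points, and the variable names are hypothetical.

```python
from itertools import accumulate

# Upper class limits from the plotted points above.
upper_limits = [120, 140, 160, 180, 200]
# Class frequencies inferred from successive differences of the
# cumulative values 12, 26, 34, 40, 50 (an assumption of this sketch).
freqs = [12, 14, 8, 6, 10]

# Less than type cumulative frequency = running total of frequencies.
less_than_points = list(zip(upper_limits, accumulate(freqs)))
print(less_than_points)
# [(120, 12), (140, 26), (160, 34), (180, 40), (200, 50)]
```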
Numerical Ability 23
The following table gives the weight of 120 articles :
Change the distribution to a 'more than type' distribution and draw its ogive.
Solution:
Plotting the points :
Numerical Ability 24
The annual profits earned by 30 shops of a shopping complex in a locality gives rise to the following distribution:
Draw both ogives for the data above. Hence, obtain the median profit.
Solution:
We have a more than type cumulative frequency distribution table. We may also prepare a less than type cumulative frequency distribution table from the given data, as given below:
Now, plot the points A (5,30), B (10,28), C (15,16),D(20,14),E(25,10),F(30,7) and G (35,3) for the more than type cumulative frequency and the points P(10,2),Q(15,14),R (20,16),S(25,20),T(30,23),U(35,27) and V(40,30) for the less than type cumulative frequency distribution table.
Join these points by freehand curves to get the ogives of 'more than' type and 'less than' type.
The two ogives intersect each other at point (17.5, 15).
Hence, the median profit is Rs. 17.5 lakhs.
Numerical Ability 25
The following data gives the information on marks of 70 students in a periodical test: Draw a cumulative frequency curve for the given data and find the median.
Solution:
We have a less than cumulative frequency table. We mark the upper class limits along the x -axis and the corresponding cumulative frequencies (no. of students) along the y -axis. Now, plot the points (10,3),(20,11),(30,28),(40,48) and (50,70). Join these points by a freehand curve to get an ogive of 'less than' type.
Here, N=70
∴ $\frac{N}{2} = 35$
Take a point A(0,35) on the y-axis and draw AP∥x-axis, meeting the curve at P.
Draw PM⊥x-axis, intersecting the x -axis, at M .
Then, OM = 33.
Hence, the median marks is 33.
7.0 Memory map