Thursday 2 July 2020

UGC NET STATISTICS FOR ECONOMICS MATERIAL


MEASURES OF CENTRAL TENDENCY 
1. Arithmetic Mean: It is the most common form of average and is used only in case of 
quantitative data. It can be calculated for both grouped as well as ungrouped data. 
Properties of Arithmetic Mean: 
1. Sum of deviations of all items from the mean is zero, that is, Σ(X − X̄) = 0
2. Sum of squared deviations from the mean is minimum. 
3. If each item is replaced by the arithmetic mean, the total of the series remains unchanged. 
4. If a constant is added to, subtracted from, multiplied with or divided into every item, the mean changes in the same manner. 
5. It is capable of further mathematical treatment like combined mean. 
Combined mean = (N1X̄1 + N2X̄2) / (N1 + N2) 
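As a quick illustration (a minimal Python sketch with made-up group values), the combined mean computed from the formula matches the mean of the pooled data:

```python
# Combined mean of two groups: (N1*mean1 + N2*mean2) / (N1 + N2)
# Illustrative (hypothetical) values only.
group1 = [10, 20, 30, 40]            # N1 = 4, mean = 25
group2 = [5, 15, 25]                 # N2 = 3, mean = 15

n1, n2 = len(group1), len(group2)
mean1, mean2 = sum(group1) / n1, sum(group2) / n2

combined_formula = (n1 * mean1 + n2 * mean2) / (n1 + n2)
combined_direct = sum(group1 + group2) / (n1 + n2)
print(combined_formula, combined_direct)   # both equal (~20.71)
```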
Properties of Geometric Mean 
1. The geometric mean is less than or equal to the arithmetic mean, that is, GM ≤ AM (equality holds only when all items are equal). 
2. The product of all items will remain the same if each item is replaced by the geometric mean.
3. The geometric mean of the ratio of two corresponding observations in a series is equal to the 
ratio of their geometric means. 
4. The geometric mean of the products of corresponding items in two series is equal to the product of their geometric means. 
Harmonic Mean: It is the reciprocal of the arithmetic mean of the reciprocals of all items in a series. Like the other means, it is applicable only to quantitative data. 
 It is used less in economics because it gives the largest weight to the smallest items. It is used in problems involving time, speed and rates, for example in finding average rates. 
Relationship between Arithmetic Mean, Geometric Mean and Harmonic Mean 
1. If all items are the same then, AM=GM=HM. If not, then, AM>GM>HM. 
2. GM = √(AM × HM) (this relation holds exactly for two observations)
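The AM–GM–HM relations can be checked numerically. A minimal sketch, assuming Python 3.8+ (for statistics.geometric_mean) and made-up data:

```python
import statistics as st

data = [4, 9, 16]                    # illustrative positive values
am, gm, hm = st.mean(data), st.geometric_mean(data), st.harmonic_mean(data)
print(am >= gm >= hm)                # True: AM >= GM >= HM

# GM = sqrt(AM * HM) holds exactly for two observations
a, b = 4, 16
am2, gm2, hm2 = st.mean([a, b]), st.geometric_mean([a, b]), st.harmonic_mean([a, b])
print(abs(gm2 - (am2 * hm2) ** 0.5) < 1e-9)   # True
```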
POSITIONAL AVERAGES 
1. Median (M): Median divides the series into two equal parts. The sum of deviations from the 
median will be minimum if we ignore the signs.
Quartiles: They divide the series into four equal parts.
Q1 (First Quartile) = size of the (N+1)/4 th item
Q3 (Third Quartile) = size of the 3(N+1)/4 th item
Q2 (Second Quartile) = Median
Deciles: They divide the series into ten equal parts.
Percentiles: They divide the series into hundred equal parts. 
 The percentiles and deciles are used in educational and psychological statistics.
Mode (Z): It is the most frequently occurring item in the series. For individual and discrete series, the mode is the most frequently occurring value, while for a continuous series we first identify the modal class (the class with the highest frequency) and then use the formula:
Z = L + [(f1 − f0) / (2f1 − f0 − f2)] × h
where L is the lower limit of the modal class, f1 its frequency, f0 and f2 the frequencies of the preceding and succeeding classes, and h the class width.
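A minimal sketch of the grouped-data mode formula, using an assumed (hypothetical) frequency distribution:

```python
# Mode of a continuous series: Z = L + (f1 - f0) / (2*f1 - f0 - f2) * h
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]   # hypothetical classes
freqs   = [5, 8, 15, 9, 3]                                    # modal class: 20-30

i = freqs.index(max(freqs))                      # index of the modal class
L = classes[i][0]                                # lower limit of modal class
h = classes[i][1] - classes[i][0]                # class width
f1 = freqs[i]                                    # frequency of modal class
f0 = freqs[i - 1] if i > 0 else 0                # preceding class frequency
f2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # succeeding class frequency

mode = L + (f1 - f0) / (2 * f1 - f0 - f2) * h
print(mode)                                      # 20 + (7/13)*10 ≈ 25.38
```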
MEASURES OF DISPERSION 
 Dispersion is the degree or amount by which items in a series differ from the central value. It indicates how reliable and representative the average calculated from the series is. 
Range: It is the difference between the largest and the smallest value of a series. 
Range = L - S 
Coefficient of Range = (L − S) / (L + S) 
It must be noted that though Range is very easy to calculate, it is usually not used as a 
measure of variability because of its inherent feature of instability. Therefore, range is 
considered to be a very limited measure of variability. 
Quartile Deviation: It is based on the difference between the third quartile (Q3) and the first quartile (Q1), known as the interquartile range.
Quartile Deviation (Semi-Interquartile Range) = (Q3 − Q1)/2
Coefficient of Quartile Deviation = (Q3 − Q1) / (Q3 + Q1)
Quartile deviation measures the variability of the middle 50 percent of the values, that is, those lying between Q1 and Q3.
Mean or Average Deviation: It is the arithmetic mean of the absolute deviations of the items from a measure of central tendency (mean, median or mode). 
 It is usually computed from the median because the sum of absolute deviations, ignoring signs, is minimum when taken from the median. 
Coefficient of Mean Deviation: 
From Mean: Mean Deviation from mean / Mean 
From Median: Mean Deviation from median / Median 
From Mode: Mean Deviation from mode / Mode 
Standard Deviation/Root Mean Square Deviation: The term was introduced by Karl Pearson in 1893. 
Coefficient of Variation (CV): It was given by Karl Pearson.
CV = (σ / Mean) × 100
The higher the CV, the more variable and less consistent the series.
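For example (a minimal sketch with hypothetical series), two series with the same mean can have very different CVs; the one with the higher CV is the less consistent:

```python
import statistics as st

series_a = [48, 50, 52, 50, 50]      # hypothetical, tightly clustered (mean = 50)
series_b = [20, 80, 50, 10, 90]      # hypothetical, widely spread (mean = 50)

for name, data in [("A", series_a), ("B", series_b)]:
    mean = st.mean(data)
    sd = st.pstdev(data)             # population standard deviation
    cv = sd / mean * 100             # coefficient of variation (%)
    print(name, round(sd, 2), round(cv, 2))
```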
Variance: The concept of variance was given by R.A. Fisher in 1918. It is simply the square of the standard deviation.
Graphical Method of Measuring Dispersion: The Lorenz Curve
The Lorenz curve (given by Max O. Lorenz) is the graphical method of measuring dispersion. The straight diagonal line is the line of equality; the closer the curve lies to the line of equality, the smaller the dispersion.
SKEWNESS 
Skewness measures the direction in which the values are dispersed. It measures the degree of 
symmetry or asymmetry of the series.
Symmetric Distribution: A series is said to be symmetric when it has the same shape on both sides of the centre. A symmetric, bell-shaped distribution with a single peak is the normal distribution.
Positively Skewed Distribution: A distribution is positively skewed when it has a long tail extending to its right. In such a distribution the mean is greater than the median, and the mean is prone to large shifts when the sample drawn is small and contains extreme values.
Negatively Skewed Distribution: A series is negatively skewed when it has a long tail extending to its left.
Tests of Skewness
1. If mean = median = mode, then the series is symmetrical.
2. Shape of the curve also determines whether the series is skewed or not.
3. If Q3 – Median = Median – Q1 then the series is symmetrical.
4. For a series to be symmetrical, the sum of positive deviations from the median should be equal 
to the sum of negative deviations from the median.
5. In case of a symmetrical distribution, the frequencies are equally distributed at points of equal 
deviations from the mode.
MOMENTS
Moments can be calculated using three ways:
1. Central Moments or Moments about the actual mean (μ)
μr = Σ(X − X̄)^r / N
μ1 = Σ(X − X̄) / N = 0
μ2 = Σ(X − X̄)² / N = Variance
2. Non-Central Moments or Raw Moments (about an assumed mean A) (μ′) 
These are of least utility and are used for conversion purposes only. 
μ′r = Σ(X − A)^r / N 
3. Moments about the origin or zero (v) 
vr = Σ(X − 0)^r / N 
v1 = ΣX / N = Mean 
μ2 and μ3 measure skewness. 
μ2 and μ4 measure kurtosis.
KURTOSIS 
Kurtosis determines the shape of the curve, that is, degree of flatness or peakedness of the curve. 
It refers to how the items in a distribution are concentrated in the centre. 
Mesokurtic: A normal distribution is also known as mesokurtic. There is neither an excess nor a deficiency of items in the centre of a mesokurtic curve. 
Platykurtic: The items in a platykurtic curve are scattered around the shoulders and are not concentrated around the centre or at the tails, so the curve is relatively flat. 
Leptokurtic: The items are more concentrated in the centre of a leptokurtic curve; as a result, the curve has a sharp peak.
Measure of Kurtosis 
β2 = μ4 / μ2² 
γ2 = β2 − 3 
β2 = 3: Mesokurtic 
β2 > 3: Leptokurtic 
β2 < 3: Platykurtic 
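A minimal sketch (hypothetical data) that computes the central moments and from them β1 (skewness) and β2 (kurtosis), then classifies the curve:

```python
import statistics as st

def central_moment(data, r):
    """r-th central moment: sum((x - mean)**r) / N."""
    m = st.mean(data)
    return sum((x - m) ** r for x in data) / len(data)

data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical observations

mu2, mu3, mu4 = (central_moment(data, r) for r in (2, 3, 4))
beta1 = mu3 ** 2 / mu2 ** 3              # measure of skewness
beta2 = mu4 / mu2 ** 2                   # measure of kurtosis

shape = "leptokurtic" if beta2 > 3 else "platykurtic" if beta2 < 3 else "mesokurtic"
print(round(beta1, 3), round(beta2, 3), shape)
```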
CORRELATION
Correlation determines the relationship between two or more variables. It determines the degree 
and direction of relationship between two or more than two variables. 
Spurious Correlation is a correlation that appears between two variables even though there is no real (causal) relationship between them. 
Types of Correlation 
Depending upon direction of change 
1. Positive correlation: Both the variables move in the same direction. 
2. Negative correlation: Both variables change in the opposite directions.
 Ratio of changes 
1. Linear correlation: It reflects the same ratio of change. 
2. Curvilinear: Ratio of change between the variables is not the same.
 Degree of relationship 
1. Multiple correlation: Correlation between more than two variables. 
2. Simple Correlation: Correlation between two variables. 
3. Partial Correlation: Relationship is studied between two variables keeping other variables 
constant. 
Properties of Correlation (‘r’)
1. Its value lies between -1 and +1, that is, -1 ≤ r ≤ +1
2. r is independent of change in origin and change in scale.
3. r is the square root of the product of the two regression coefficients, that is, r = ±√(bxy × byx), taking the common sign of the regression coefficients.
4. It is symmetric, that is, rxy = ryx
5. If the two variables are independent of each other (not related), then r is equal to zero.
Probable Error: The probable error is used to test the value and significance of the correlation coefficient; it helps in testing the reliability of r. It also gives the limits within which r is expected to lie: P.E. = 0.6745 × (1 − r²)/√N, and the limits are r ± P.E.
Methods for Measuring Correlation
1. Graphical or Scatter Diagram Method
2. Karl Pearson’s Coefficient of Correlation: It is based on the idea of covariance. 
3. Spearman’s rank correlation method: The method was developed in 1904. It is useful for 
finding the correlation in case of qualitative data. Under this, no assumption is made regarding 
the distribution of data. 
Properties:
1. ΣD = Σ(R1 − R2) = 0 
2. R is distribution free and non-parametric. 
3. R=r when all values are different and no value is repeated.
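A minimal sketch (hypothetical paired data, no tied ranks assumed) computing Karl Pearson's r from the covariance and Spearman's R from rank differences:

```python
import statistics as st

x = [10, 20, 30, 40, 50]                 # hypothetical paired observations
y = [12, 19, 33, 38, 52]

# Karl Pearson's coefficient: r = cov(x, y) / (sd_x * sd_y)
n = len(x)
mx, my = st.mean(x), st.mean(y)
cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
r = cov / (st.pstdev(x) * st.pstdev(y))

# Spearman's rank correlation: R = 1 - 6*sum(D^2) / (n*(n^2 - 1))
def ranks(values):
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

d2 = sum((r1 - r2) ** 2 for r1, r2 in zip(ranks(x), ranks(y)))
R = 1 - 6 * d2 / (n * (n ** 2 - 1))

print(round(r, 3), R)                    # here the ranks agree fully, so R = 1
```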


QUESTIONS FOR CLARIFICATION

1. Which one of the following statistical measures is not affected by extremely large or small values?
A. Median B. Harmonic mean
C. Standard deviation D. Coefficient of variation
2. Which of the following is not a characteristic of a good average?
A. It should be easy and simple to understand
B. It should have a sampling stability
C. It should not be rigidly defined
D. It should be based on all the observations
3. In any set of numbers, the geometric mean exists only when all numbers are
A. Negative B. Positive, zero or negative
C. Positive D. Zero
4. Which of the following statements about the arithmetic mean are true?
1. It possesses a large number of characteristics of a good average
2. It is unduly affected by the presence of extreme values
3. In an extremely asymmetrical distribution it is not a good measure of central tendency
Codes:
A. 1 only B. 1,2 and 3
C. 2 and 3 only D. 3 only 
5. Assertion (A): Skewness measures regression.
Reason (R): Kurtosis measures flatness at the top of the frequency curve.
A. Both A and R are true and R is the correct explanation of A
B. Both A and R are true but R is not the correct explanation of A
C. A is false but R is true
D. A is true but R is false
6. Consider the following statements
1. Quartile deviation is more instructive than the range as it discards the dispersion of the extreme items
2.Coefficient of quartile deviation cannot be used to compare the degree of variation in different 
distributions
3.There are 10 deciles for a series
Codes:
A.1,2 and 3 B.2 only
C.3 only D.1 only

Answers:-
1) A 2) D 3) C 4) B 5) C 6) D


REGRESSION 

 Regression differs from correlation because the values of the dependent variable are estimated on the basis of given values of the explanatory variables.
Standard Error of the Regression Estimate: It measures the dispersion of the observed values around the regression line (the average relationship) and indicates the reliability of the regression estimates. 
Coefficient of Determination: It is measured using the following formula: 
Coefficient of Determination (r²) = Explained Variation / Total Variation
Correlation coefficient (r) will be equal to r² only when r = 0 or 1. Also, r tells the direction of correlation while r² does not provide any such information. 
Coefficient of Non-Determination (k²) = 1 − r², or 
k² = Unexplained Variation / Total Variation
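A minimal sketch (hypothetical data) fitting a simple least-squares line of Y on X and computing r² as explained variation over total variation:

```python
import statistics as st

x = [1, 2, 3, 4, 5]                          # hypothetical observations
y = [2.1, 3.9, 6.2, 8.1, 9.8]

mx, my = st.mean(x), st.mean(y)

# Least-squares regression of Y on X: y_hat = a + b*x
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
a = my - b * mx
y_hat = [a + b * xi for xi in x]

explained = sum((yh - my) ** 2 for yh in y_hat)   # explained variation
total = sum((yi - my) ** 2 for yi in y)           # total variation

r_squared = explained / total                     # coefficient of determination
k_squared = 1 - r_squared                         # coefficient of non-determination
print(round(r_squared, 4), round(k_squared, 4))
```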
TYPES OF DATA 
1. Primary Data: It is collected first-hand through direct personal investigation, indirect/telephonic oral investigation, information through local sources or correspondents, schedules filled in by the respondents, or schedules filled in by the enumerators. 
2. Secondary Data: It is second-hand information and is less costly to collect.
Census
Census covers each item of the population and is considered to be very authentic and 
reliable. However, it is time consuming and costly. 
Sampling 
 Law of Statistical Regularity: A reasonably large sample drawn at random from the population tends to possess the characteristics of the population. 
Inertia of Large Numbers: The larger the sample size, the more reliable and representative the sample. 
Sampling Methods 
1. Probability/Random/Chance Sampling: All units of the population have the same chance of 
being selected in the sample. 
Simple Unrestricted Probability Sampling: Under this method, the sample is drawn randomly, 
that is, each item has equal probability of getting selected in the sample. 
Stratified Sampling: Under stratified sampling, the population is divided into different strata. Strata are defined in such a way that the population inside each stratum is homogeneous in nature. A sample is then drawn from each stratum, and together these draws form the complete sample (a short code sketch contrasting the probability-based schemes follows the list of sampling methods).
Systematic Sampling: Under this, the population is arranged in an order. A sample is so 
constituted that every nth term is a part of the sample. For example, I choose every 7th individual 
to be a part of my sample.
Multi-Stage/Cluster sampling: Under this method, the sample is selected in stages, starting from 
an elementary stage.
2. Non-Probability Sampling: It is non-random. All items of the population don’t have the same 
chance of being selected.
Types:
Judgement Sampling: It is also known as ‘Sampling by Opinion’. Under this, the person, 
according to his own judgement, decides on who is to be included in the sample.
Quota Sampling: Under this method, the total sample size is fixed as a quota, and each investigator is assigned a quota of respondents to interview.
Convenience Sampling: Under this, the sample is formed according to the convenience of the investigator.
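A minimal sketch (hypothetical population of 100 units, using Python's random module) contrasting the three probability-based schemes described above:

```python
import random

population = list(range(1, 101))         # hypothetical population of 100 units
random.seed(42)

# Simple random sampling: every unit has an equal chance of selection
simple_sample = random.sample(population, 10)

# Systematic sampling: every k-th unit after a random start
k = 10
start = random.randrange(k)
systematic_sample = population[start::k]

# Stratified sampling: draw separately from each (homogeneous) stratum
strata = {"low": population[:50], "high": population[50:]}
stratified_sample = [u for s in strata.values() for u in random.sample(s, 5)]

print(simple_sample, systematic_sample, stratified_sample, sep="\n")
```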
Sampling Errors
Error is the difference between the sample statistic and the population parameter.
Types of errors are:
1. Sampling Error: It can occur because of any procedure involved in sampling.
Biased: Due to human element
Unbiased: Due to problem in procedure of selection. 
 With increase in sample size, the sampling errors (especially the unbiased errors) decrease. 
2. Non-Sampling Error: It occurs after the selection of a sample. For example, faulty printing, 
coding, tabulation of data, etc.
PROBABILITY DISTRIBUTION
Observed Frequency Distribution: Based on observations
Theoretical Probability Distributions:
1. Binomial
2. Normal
3. Poisson
Binomial Distribution: It was given by James Bernoulli and is therefore also known as the Bernoulli distribution. It is a discrete probability distribution in which each trial has only two outcomes: a success or a failure.
Graphically, the binomial distribution is symmetrical when p = ½. If p < ½ the distribution is positively skewed, and if p > ½ it is negatively skewed. 
The skewness of the binomial distribution becomes less pronounced as n is increased. If we increase p for a fixed n, the distribution shifts to the right. The mode, in this case, is the value of the number of successes that has the highest probability.
The normal distribution is the limiting case of the binomial distribution as n becomes large.
Mean = n*p 
Variance = npq
 Binomial distribution is usually used in business and social sciences as well as for quality 
control.
Poisson Distribution: It was given by Siméon Denis Poisson in 1837. It is used when the number of trials is very large but the probability of success is very small, so that the probability of failure tends towards one.
It is used for finding the number of defects, number of accidents, number of casualties, etc. 
Mean = np
Variance= np (Mean = Variance)
Normal Distribution: It was first derived by Abraham de Moivre in 1733 and later developed by Carl Friedrich Gauss and Laplace. 
Properties:
1. The distribution is symmetrical, hence it resembles a bell-shaped curve. 
2. The normal curve is symmetrical and is unimodal. 
3. The two tails of the curve do not touch the axis, that is, they continue to extend indefinitely. 
4. Total area under the normal curve is 1. 
5. Mean = Median = Mode 
6. Standard Normal Distribution is the random variable which has a normal distribution with 
mean equal to zero and standard deviation equal to one. 
 The Poisson distribution approaches the normal distribution when its mean m is large.
For the standard normal distribution:
Mean = 0
Standard Deviation = 1
Variance = 1
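These properties can be checked by simulation. A minimal sketch, assuming NumPy is available and using made-up parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 0.3                                  # illustrative binomial parameters
m = n * p                                       # Poisson mean

binom = rng.binomial(n, p, size=100_000)
poisson = rng.poisson(m, size=100_000)
std_normal = rng.normal(loc=0, scale=1, size=100_000)

print(binom.mean(), n * p)                      # mean ≈ np
print(binom.var(), n * p * (1 - p))             # variance ≈ npq
print(poisson.mean(), poisson.var())            # mean ≈ variance for Poisson
print(std_normal.mean(), std_normal.std())      # ≈ 0 and ≈ 1 for standard normal
```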
INDEX NUMBERS 
 First Price Index was given in Italy in 1754. It was used to compare price changes in the time 
period 1750 to 1760. 
1. Simple Index Numbers: It is an Unweighted Index Number. 
Simple Aggregative Method = (ΣP1 / ΣP0) × 100
Price Relatives: Firstly the price relatives are calculated using the formula: 
Price Relative = (P1 / P0) × 100
Then a simple average of price relatives is calculated, that is, 
(⅀Price relatives)/n
Laspeyres' Index (L): P01 = (ΣP1Q0 / ΣP0Q0) × 100 
 Under this, the base year quantities are taken as weights. The index has an upward bias as it overestimates the price changes. 
Paasche's Index (P): P01 = (ΣP1Q1 / ΣP0Q1) × 100 
 Under this, the current year quantities are taken as weights. It underestimates the price 
changes and as a result has a downward bias. 
Dorbish-Bowley Index = (L + P) / 2 
 It uses both the current and base year quantities as weights. It is the arithmetic mean of Laspeyres' and Paasche's indices. 
Fisher's Ideal Index = √(L × P)
It is the geometric mean of Laspeyres' and Paasche's indices. It is considered to be an ideal index because:
1. Both current year and base year quantities are taken. 
2. It cancels the upward and downward bias. 
3. It satisfies the time reversal and factor reversal test.
Marshall-Edgeworth Index: P01 = [ΣP1(Q0+Q1) / ΣP0(Q0+Q1)] × 100 
 It uses the sum of the current and base year quantities as weights. 
Kelly's Index: P01 = (ΣP1q / ΣP0q) × 100, where q refers to fixed quantity weights. 
 It is also known as the fixed weight aggregative method. 
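A minimal sketch (hypothetical prices and quantities) computing the main weighted indices and verifying that Fisher's index satisfies the time reversal test:

```python
# Hypothetical base-year (0) and current-year (1) prices and quantities
p0, q0 = [10, 8, 5], [30, 15, 20]
p1, q1 = [12, 10, 6], [25, 20, 30]

def sp(p, q):
    """Sum of price * quantity products."""
    return sum(pi * qi for pi, qi in zip(p, q))

laspeyres = sp(p1, q0) / sp(p0, q0) * 100
paasche = sp(p1, q1) / sp(p0, q1) * 100
dorbish_bowley = (laspeyres + paasche) / 2                 # arithmetic mean of L and P
fisher = (laspeyres * paasche) ** 0.5                      # geometric mean of L and P
q_sum = [a + b for a, b in zip(q0, q1)]
marshall_edgeworth = sp(p1, q_sum) / sp(p0, q_sum) * 100

print(round(laspeyres, 2), round(paasche, 2), round(dorbish_bowley, 2),
      round(fisher, 2), round(marshall_edgeworth, 2))

# Time reversal test on Fisher's index: P01 * P10 = 1 (as ratios, not percentages)
f01 = (sp(p1, q0) / sp(p0, q0) * sp(p1, q1) / sp(p0, q1)) ** 0.5
f10 = (sp(p0, q1) / sp(p1, q1) * sp(p0, q0) / sp(p1, q0)) ** 0.5
print(round(f01 * f10, 6))                                 # 1.0
```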
Tests of Adequacy of Index Numbers 
Unit Test 
Formula for index number construction should be independent of the units in which prices and 
quantities are quoted. 
Apart from simple unweighted aggregative index, this test is satisfied by all other indices. 
Time Reversal Test: P01 × P10 = 1 
It was given by Fisher. With the time periods reversed, the two indices should be reciprocals of each other, that is, interchanging the time periods should not lead to inconsistent results. It is satisfied by:
1. Fisher's index 2. Marshall-Edgeworth index 
3. Simple Geometric mean of price relatives 4. Aggregates with fixed weights. 
5. Weighted geometric mean
Factor Reversal Test: P01 × Q01 = V01. Given by Fisher. 
On interchanging the prices and quantities, the results should be consistent. 
This test is satisfied only by Fisher's Index. 
Circular Test: P01 × P12 × P20 = 1 
It is an extension of the time reversal test and also tests the shiftability of the base. The index should be able to adjust index values from period to period without referring back to the original base. 
This test is satisfied using 1. Simple Geometric mean of price relatives 
2. Weighted aggregative with fixed weights. 
Splicing of Index Numbers: Linking an old series of index numbers with a new series so that a single continuous series is obtained. 
 Deflating of Index Numbers: Adjusting a money (nominal) series for price changes so as to express it in real terms. 
TIME SERIES 
Time series analysis studies how the value of a variable changes over a period of time. It is used to forecast future values.
Components of Time Series are: 
1. Secular Trend (T): Persisting movement of any variable over a period of time. 
2. Seasonal Variations (S): Repetitive movements in every season. 
3. Cyclical Variation (C): Business Cycles 
4. Irregular Variations (I): Completely random 
STATISTICAL INFERENCE
Statistical Inference is that branch of statistics where we use probabilities to deal with 
uncertainties in decision making. It involves: 1. Hypothesis Testing
2. Estimation
Hypothesis: It is a general statement made about any relationship. In other words, it is a tentative
assumption made about any relationship/population parameter.
Null Hypothesis (H0): Null hypothesis is stated for the purpose of testing or verifying its validity.
It assumes that there is no difference between the population parameter and the sample statistic, and that any observed difference is due to chance.
Alternate Hypothesis (H1): It includes any other admissible hypothesis, other than null 
hypothesis. Alternate hypothesis is accepted when the null hypothesis is rejected.
Level of Significance: It is the probability of committing Type I error.
Power of a Test: It measures how well the test detects a false null hypothesis and depends mainly on the Type II error (β). 
Power of a test = 1 − β.
Two Tailed Test: In this, the critical region lies on both sides. It does not tell us whether the value 
is less than or greater than the desired value. The rejection region under this test is taken on both 
the sides of the distribution. For example,
H0: β = 100
H1: β ≠ 100
One Tailed Test: Under this, H1 can either be greater than or less than the desired value. The 
rejection region, under this test, is taken only on one side of the distribution.
H0: β = 100 
H1: β > 100
Any value which is based on the sample data is known to be a sample statistic and the value
calculated from the population is known as population parameter.
Standard error of sampling distribution tells us the standard deviation of a sampling distribution.
Standard error of the sample mean = σ / √n
Steps of Testing Hypothesis
1. Set your hypothesis.
2. Choose a level of significance
3. Choose a test
4. Make calculations or computations using the test.
5. Decision making stage.
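Following the steps above, a minimal sketch of a two-tailed one-sample z-test (population σ assumed known, all numbers hypothetical, 5% level with critical value 1.96):

```python
import math

# Hypothetical example: H0: mu = 100 vs H1: mu != 100 (two-tailed)
mu0 = 100
sigma = 15                                      # population SD assumed known (z-test)
sample_mean = 104
n = 36

standard_error = sigma / math.sqrt(n)           # sigma / sqrt(n)
z = (sample_mean - mu0) / standard_error

critical = 1.96                                 # 5% level of significance, two-tailed
decision = "reject H0" if abs(z) > critical else "accept H0"
print(round(z, 2), decision)                    # z = 1.6 -> accept H0
```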
Estimators
There are two kinds of estimators
1. Point estimators
2. Interval estimators: They are confidence intervals giving the lower and upper limits within which the parameter is expected to lie.
Properties of estimators
1. Unbiasedness
2. Consistency 
3. Efficiency 
4. Sufficiency: The estimator should include the maximum information about the population parameter. 
Testing of Attributes 
 To test attributes, we calculate the standard error and the difference between the observed and expected values. If the difference is greater than the standard error limit at a particular level of significance, we reject the null hypothesis; if the difference is smaller, we accept the null hypothesis.
T-Test: It was given by William Gosset (who published under the pen name 'Student') in 1908 and is therefore also known as Student's t-test.
Assumptions 
a. Number of observations is less than 30. 
b. Random sampling distribution is approximately normal. 
c. It is used when standard deviation of population is not known. 
 If calculated value is greater than the table value, then difference is significant and we reject 
the null hypothesis. If calculated value is less than the table value, then we accept the null 
hypothesis implying that there is no significant difference. 
Properties of the t-distribution 
1. The distribution is lower at the mean and flatter than the normal curve; the t-distribution has a greater area under its tails than the normal distribution. 
2. It is symmetrical about zero, but its variance is greater than one. As the degrees of freedom increase, the t-distribution approaches the normal distribution.
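A minimal sketch (hypothetical small sample, population σ unknown) computing the one-sample t-statistic from its definition; the computed value is then compared with the table value for n − 1 degrees of freedom:

```python
import math
import statistics as st

sample = [12.1, 11.8, 12.4, 12.0, 11.6, 12.3, 12.2, 11.9]   # hypothetical, n < 30
mu0 = 12.5                                                   # hypothesized mean

n = len(sample)
mean = st.mean(sample)
s = st.stdev(sample)                         # sample SD (divisor n - 1)

t = (mean - mu0) / (s / math.sqrt(n))        # t-statistic with n - 1 df
print(round(t, 2), "df =", n - 1)
# If |t| exceeds the table value for n - 1 df at the chosen level of
# significance, the difference is significant and H0 is rejected.
```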
Z-test: The test was given by Fisher and is used when the population standard deviation is known. 
The test is used when we need to identify whether the two samples are from the same population 
or not. 
Assumptions: 
1. Sample size is large, that is, n > 30. 
2. Population variance is known. 
3. Population is normally distributed. 
Applications of Z-Test 
 Z-test is used to compare the sample mean to a hypothesized mean of the population in case 
of large samples. 
 It is also used to test the difference between the mean of two samples, assuming that they 
have been drawn from the same population. 
 It is also used to test the difference between two sample means, when the samples are drawn from two different populations.
Uses of Z-Test are: 
1. To test the significance of ‘r’
2. To test the significance of the difference between two independent correlation coefficients derived from different samples. 
F-test: The test was given by Fisher in the 1920s and is closely related to ANOVA. It is also known as the Variance Ratio Test. 
Assumptions: 
1. Normality: Normal Distribution 
2. Homogeneity: Variance in each group should be equal for all groups. 
3. Independence of error: Variance of each value around its mean should be independent. 
 It is used to find out whether two independent estimates of the population variance differ significantly, or whether two samples may be regarded as drawn from normal populations having the same variance. 
Chi-Square Test 
It is a non-parametric test and does not make any assumptions about the population from which the samples are drawn. It was first used by Karl Pearson in 1900.
χ² = Σ(O − E)² / E 
Where: O is the observed value and E is the expected value.
Application of Chi-Square tests: 
1. It is used to test the discrepancies between the observed frequencies and the expected 
frequencies. 
2. It is also used to test the goodness of fit. 
3. It is used to determine the association between two or more attributes. 
Features of the test: 
1. It is a test of independence. 
2. It is a test of goodness of fit. 
3. It is a test of homogeneity, where two or more samples are drawn from same population or 
different population. 
4. Chi-Square distribution is skewed to the right and the skewness can be reduced by increasing 
the degrees of freedom. 
5. Value of χ² is always positive and its upper limit is infinity. 
6. Yates' correction (developed in 1934) is applied to reduce the inflated difference between the observed and theoretical frequencies.
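A minimal sketch (hypothetical observed and expected frequencies) of the goodness-of-fit statistic χ² = Σ(O − E)²/E:

```python
# Hypothetical observed vs expected frequencies (goodness of fit)
observed = [18, 22, 27, 20, 13]
expected = [20, 20, 20, 20, 20]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                       # degrees of freedom

print(round(chi_sq, 2), "df =", df)
# Compare the computed chi-square with the table value at the chosen level
# of significance; a larger computed value means the discrepancy between
# observed and expected frequencies is significant.
```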


QUESTIONS FOR CLARIFICATION

1. Consider the following statements
The coefficient of correlation
1.Is not affected by a change of origin and scale
2.lies between -1 and +1
3.Is a relative measure of linear association between two or more variables
Codes 
A. 1,2 and 3 B. 1 and 3
C. 2 and 3 D. 1 only
2. Which of the following is referred to as lack of peakedness?
A. Skewness B. Kurtosis
C. Moments D. Mode
3. Which of the following statements is true
1.When the margin of error is small, the confidence level is high
2.When the margin of error is small, the confidence level is low
3.A confidence interval is a type of point estimate
4.A population mean is an example of a point estimate
Codes:
A. 1 only B. 2 only
C. 4 only D. None of the above
4. In case of high income inequality, the income distribution is
A. A symmetric distribution
B. U shaped distribution
C. Inverted J-Shaped distribution
D. None of the above
5. The basic construction of price index number involves which of the following steps
1.Selection of a base year and price of a group of commodities in that year
2.Prices of a group of commodities for the given year that are to be compared
3.Changes in the price of the given year are shown as percentage variation for the base year
4. Index number of price of a given year is denoted as 100
Codes:
A. 1 ,2 and 3 B. 2,3 and 4
C. 1,3 and 4 D. 1,2 and 4

Answers:-
1) B 2) A 3) D 4) C 5) A

