Cricket averages, exam marks, rainfall figures — almost every number you read in the news is summarised using Statistics. For the NDA exam this is one of the friendliest chapters: the questions are short, formula-driven and rarely tricky. Master a handful of measures of central tendency and dispersion, and you can comfortably bag 2–4 sure-shot marks every year.
Why Statistics Is Easy Marks
Every NDA Maths paper carries questions from Statistics, and they tend to be direct and quick. Unlike calculus, you do not have to manipulate long expressions: you simply organise the data, plug numbers into a formula, and read off the answer. For a student racing through a 120-question paper, that speed is precious.
Statistics also overlaps neatly with Probability, which sits right next to it in the syllabus, and with everyday reasoning you already use. Once you are comfortable averaging marks or judging how scattered a set of scores is, most of this chapter will feel like common sense dressed up in symbols. That is exactly why Cavalier teachers treat it as a confidence-building topic early in the course.
The whole chapter rests on two big ideas: a measure of central tendency (where is the centre of the data?) and a measure of dispersion (how spread out is the data?). Learn the formulas for each, know when to use which, and you have mastered the topic. This lesson walks you through all of them with the kind of small, clear steps you can reproduce in the exam hall.
First decide whether the question is about the centre of the data or its spread. Picking the right family of formula is half the battle; the arithmetic that follows is easy.
Raw, Discrete and Grouped Data
Before any formula, identify what kind of data you have, because the same measure is computed slightly differently in each case.
- Raw (ungrouped) data: a plain list of values, like 5, 8, 6, 9, 7.
- Discrete frequency data: distinct values x with how often each occurs, the frequency f.
- Grouped (continuous) data: values bunched into class intervals such as 0–10, 10–20, each with a frequency.
For grouped data we use the class mark (the midpoint of a class) as the representative value of that class. The class mark equals (lower limit + upper limit) ÷ 2.
N (or Σf) always means the total frequency — the total number of observations. Many slips happen because students forget to add up all the frequencies first.
Arithmetic Mean (Average)
The mean is the most familiar measure of central tendency: add everything up and divide by how many there are.
Raw data: Mean = (Σxᵢ) ÷ n
Frequency data: Mean = (Σfᵢxᵢ) ÷ (Σfᵢ)
Grouped data: same formula, with xᵢ as the class mark.
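The three mean formulas above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample frequencies are made up for the example, and classes are assumed to be given as (lower limit, upper limit) pairs.

```python
def mean_raw(values):
    """Mean of a plain list: sum of the values divided by their count."""
    return sum(values) / len(values)


def mean_frequency(xs, fs):
    """Mean of discrete frequency data: sum(f * x) / sum(f)."""
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)


def mean_grouped(classes, fs):
    """Mean of grouped data, using each class mark (midpoint) as x."""
    marks = [(lower + upper) / 2 for lower, upper in classes]
    return mean_frequency(marks, fs)


print(mean_raw([5, 8, 6, 9, 7]))                               # 7.0
print(mean_frequency([1, 2, 3], [2, 3, 5]))                    # 2.3
print(mean_grouped([(0, 10), (10, 20), (20, 30)], [2, 5, 3]))  # 16.0
```

Note that the grouped version differs from the frequency version only in the step that turns each class into its class mark.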
The mean uses every single value, which is both its strength and its weakness. Because it accounts for all data, it is the most informative average. But because it accounts for all data, a single freak value — an outlier — can drag it badly off-centre. One billionaire walking into a room makes the average wealth of that room misleadingly huge.
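For large grouped data, use the assumed-mean method: pick an assumed mean a, let dᵢ = xᵢ − a, and Mean = a + (Σfᵢdᵢ) ÷ (Σfᵢ). When every class has the same width h, the step-deviation variant goes one step further: let uᵢ = (xᵢ − a) ÷ h, and Mean = a + h × (Σfᵢuᵢ) ÷ (Σfᵢ). Either version cuts the arithmetic considerably.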
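As a rough check that the assumed-mean shortcut gives the same answer as the direct formula, here is a small sketch. The class marks and frequencies are hypothetical, and the helper name mean_assumed is invented for this example.

```python
def mean_assumed(xs, fs, a):
    """Assumed-mean method: a + sum(f * d) / sum(f), where d = x - a."""
    total_fd = sum(f * (x - a) for x, f in zip(xs, fs))
    return a + total_fd / sum(fs)


# Hypothetical class marks and frequencies, for illustration only.
marks = [5, 15, 25, 35, 45]
freqs = [3, 7, 10, 6, 4]

direct = sum(f * x for x, f in zip(marks, freqs)) / sum(freqs)
shortcut = mean_assumed(marks, freqs, a=25)

# Both come to about 25.33; the shortcut only had to add up small deviations.
print(round(direct, 2), round(shortcut, 2))
```

Choosing a near the middle of the table keeps the deviations small, which is the whole point of the method when working by hand.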
Median (The Middle Value)
The median is the value that splits ordered data into two equal halves — half the observations lie below it and half above. You must arrange the data in ascending order first.
If n is odd: median is the value of the ((n + 1)/2)th term.
If n is even: median is the average of the (n/2)th and the (n/2 + 1)th terms.
Grouped data: Median = l + [(N/2 − cf) ÷ f] × h
In the grouped formula, l is the lower limit of the median class, cf is the cumulative frequency of the class just before it, f is the frequency of the median class, and h is the class width. The median class is the one in which the (N/2)th observation falls, that is, the first class whose cumulative frequency reaches N/2.
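The grouped-median formula can be traced step by step in code. The sketch below assumes classes are given as (lower limit, upper limit) pairs in ascending order; the function name and the sample table are hypothetical.

```python
def median_grouped(classes, fs):
    """Median = l + ((N/2 - cf) / f) * h, located by running cumulative totals."""
    n_total = sum(fs)
    if n_total <= 0:
        raise ValueError("total frequency must be positive")
    half = n_total / 2
    cf = 0  # cumulative frequency of all classes before the current one
    for (lower, upper), f in zip(classes, fs):
        if cf + f >= half:       # first class whose cumulative total reaches N/2
            h = upper - lower    # class width
            return lower + (half - cf) * h / f
        cf += f


# Hypothetical table: N = 50, so N/2 = 25 falls in the class 20-30.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [5, 8, 20, 10, 7]
print(median_grouped(classes, freqs))  # 26.0
```

Here cf = 13 and f = 20 for the median class, so the median is 20 + (25 − 13) × 10 ÷ 20 = 26.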
Because the median depends only on position and not on the actual size of extreme values, it is largely unaffected by outliers. That is why typical incomes and house prices are usually reported as medians rather than means: they give a fairer picture of the typical case.
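To see the effect of a single outlier on the mean and the median, here is a short sketch using Python's standard statistics module. The income figures are invented purely for illustration.

```python
from statistics import mean, median

incomes = [20, 22, 25, 27, 30]    # hypothetical incomes, in thousands
with_outlier = incomes + [1000]   # one very large value joins the group

print(mean(incomes), median(incomes))  # 24.8 25
# The mean jumps to about 187.3, while the median only moves to 26.0.
print(round(mean(with_outlier), 1), median(with_outlier))
```

One extreme value multiplies the mean several times over, while the median barely shifts.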
Mode (The Most Frequent Value)
The mode is simply the value that occurs most often. In the data 3, 5, 5, 6, 9 the mode is 5. A data set can have one mode, two modes (bimodal) or none at all.
Grouped data: Mode = l + [(f₁ − f₀) ÷ (2f₁ − f₀ − f₂)] × h
Here f₁ is the frequency of the modal class, f₀ and f₂ are the frequencies of the classes just before and after it, l is the lower limit of the modal class and h the class width.
The modal class is the class interval with the highest frequency. The mode is the only average that can be used for non-numerical (categorical) data — for example, the most popular shoe size or the most common blood group.
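The grouped-data mode formula can be sketched as follows. The function name and sample table are hypothetical, and treating a missing neighbouring class as frequency 0 is an assumption made for this example rather than something stated in the lesson.

```python
def mode_grouped(classes, fs):
    """Mode = l + ((f1 - f0) / (2*f1 - f0 - f2)) * h for the modal class."""
    i = max(range(len(fs)), key=lambda k: fs[k])  # index of the modal class
    f1 = fs[i]
    f0 = fs[i - 1] if i > 0 else 0                # assumption: no class before -> 0
    f2 = fs[i + 1] if i < len(fs) - 1 else 0      # assumption: no class after -> 0
    lower, upper = classes[i]
    h = upper - lower                             # class width
    # The denominator is zero only if f0 and f2 both equal f1 (no clear peak).
    return lower + (f1 - f0) * h / (2 * f1 - f0 - f2)


# Hypothetical table: the modal class is 20-30 with f1 = 16, f0 = 10, f2 = 14.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [6, 10, 16, 14, 4]
print(mode_grouped(classes, freqs))  # 27.5
```

The result, 20 + (6 ÷ 8) × 10 = 27.5, sits closer to the upper limit because the class after the modal class is heavier than the one before it.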
There is a neat empirical relationship linking all three averages for a moderately skewed distribution: Mode = 3 × Median − 2 × Mean. NDA loves to test this directly.
Measures of Dispersion: Range and Mean Deviation
Two classes can have the same average marks yet be completely different — one full of similar scorers, the other a mix of toppers and weak students. Dispersion measures how spread out the data is around its centre.
The simplest measure is the range = largest value − smallest value. It is quick but crude, because it ignores everything between the two extremes.
Mean deviation about the mean = (Σ|xᵢ − x̄|) ÷ n
Mean deviation about the median = (Σ|xᵢ − M|) ÷ n
Mean deviation takes the absolute distance of each value from a central point, so positive and negative gaps do not cancel out. A useful fact the exam likes: the mean deviation is least when measured about the median, so no other central value gives a smaller total absolute distance.
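Both mean deviations use the same calculation with a different centre, which a short sketch makes clear. The data set and the helper name mean_deviation are illustrative only.

```python
from statistics import mean, median


def mean_deviation(values, centre):
    """Average absolute distance of the values from the chosen centre."""
    return sum(abs(x - centre) for x in values) / len(values)


data = [2, 3, 5, 8, 12]                          # mean is 6, median is 5
md_mean = mean_deviation(data, mean(data))       # about the mean
md_median = mean_deviation(data, median(data))   # about the median

print(md_mean)    # 3.2
print(md_median)  # 3.0
```

For this data the value about the median (3.0) is smaller than the value about the mean (3.2), in line with the minimum property stated above.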
Variance and Standard Deviation
The most important measures of spread in the NDA syllabus are variance and its square root, the standard deviation (SD). Instead of absolute distances, they use squared distances from the mean, which makes the maths far more workable.
Variance: σ² = (Σ(xᵢ − x̄)²) ÷ n
Standard deviation: σ = √(variance)
Shortcut form: σ² = (Σxᵢ²)/n − (x̄)², i.e. mean of squares minus square of the mean.
The shortcut form is the one to memorise, because it lets you compute variance without first finding each deviation. You only need two running totals, Σx and Σx², and the count n.
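A quick sketch can confirm that the definition and the shortcut form agree. The function names and the data are made up for illustration. Both functions divide by n, as in the formulas above, which corresponds to what Python's statistics module calls the population variance (pvariance).

```python
from math import isclose, sqrt


def variance_definition(values):
    """Average squared distance from the mean (divides by n)."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / n


def variance_shortcut(values):
    """Mean of the squares minus the square of the mean."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return sum_x2 / n - (sum_x / n) ** 2


data = [3, 7, 8, 10, 12]
v1 = variance_definition(data)
v2 = variance_shortcut(data)
print(isclose(v1, v2))      # True: both are 9.2, up to floating-point rounding
print(round(sqrt(v1), 2))   # standard deviation, about 3.03
```

The shortcut needs only the running totals Σx = 40 and Σx² = 366, giving 366/5 − 8² = 73.2 − 64 = 9.2.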
The standard deviation carries the same units as the data (marks, kilograms, rupees), which makes it easy to interpret, while variance is in squared units. A small SD means the values huddle close to the mean; a large SD means they are widely scattered.
Do not confuse variance with standard deviation. Variance = σ²; standard deviation = σ. If a question asks for SD and you stop at variance, you lose the mark by forgetting the square root.
Coefficient of Variation
How do we compare the consistency of two groups measured in different units, or with very different averages — say the heights of soldiers versus their weights? Standard deviation alone cannot do this fairly, so we use a relative measure.
Coefficient of Variation (CV) = (σ ÷ x̄) × 100
It is a pure number (a percentage), free of units.
The series with the lower CV is more consistent (more stable, more uniform), while the one with the higher CV is more variable. This is a classic NDA question: two batsmen's scores are given and you are asked who is the more consistent player — the answer is always whoever has the smaller coefficient of variation.
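The batsman comparison can be sketched as below. The scores are invented for illustration, and the sketch assumes the divide-by-n standard deviation used in this lesson, which is pstdev in Python's statistics module.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """CV = (standard deviation / mean) * 100, a unit-free percentage."""
    return pstdev(values) / mean(values) * 100


# Hypothetical scores: B averages more runs, but is B more consistent?
batsman_a = [40, 45, 50, 55, 60]     # mean 50
batsman_b = [60, 70, 80, 90, 100]    # mean 80

cv_a = coefficient_of_variation(batsman_a)
cv_b = coefficient_of_variation(batsman_b)
print(round(cv_a, 2), round(cv_b, 2))  # 14.14 17.68
print("A" if cv_a < cv_b else "B", "is more consistent")  # A is more consistent
```

Batsman B has the higher average, yet A has the lower CV, so A is the more consistent player. Comparing the standard deviations alone would not account for the different averages.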
Some standard properties: adding a constant to every value leaves the variance and SD unchanged, but multiplying every value by a constant k multiplies the SD by |k| and the variance by k².
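These two properties are easy to check numerically. The sketch below uses an arbitrary shift of 7 and an arbitrary multiplier of −3, both chosen only for illustration, with the divide-by-n functions pvariance and pstdev from Python's statistics module.

```python
from math import isclose
from statistics import pstdev, pvariance

data = [2, 4, 6, 8, 10]
shifted = [x + 7 for x in data]    # add a constant to every value
scaled = [-3 * x for x in data]    # multiply every value by k = -3

# Shifting changes neither the variance nor the SD.
print(pvariance(data), pvariance(shifted))         # 8 8

# Scaling by k multiplies the variance by k**2 and the SD by |k|.
print(pvariance(scaled))                           # 72
print(isclose(pstdev(scaled), 3 * pstdev(data)))   # True
```

The negative multiplier shows why the rule uses |k|: a standard deviation can never be negative.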
Worked Example: Mean and Standard Deviation
Let us solve a typical NDA-style numerical the way you should in the exam hall, using the shortcut formula for variance.
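Find the mean and standard deviation of the data: 2, 4, 6, 8, 10.
Step 1: n = 5 and Σx = 2 + 4 + 6 + 8 + 10 = 30, so Mean x̄ = 30 ÷ 5 = 6.
Step 2: Σx² = 4 + 16 + 36 + 64 + 100 = 220.
Step 3: Variance σ² = (Σx²)/n − (x̄)² = 220/5 − 6² = 44 − 36 = 8.
Step 4: Standard deviation σ = √8 = 2√2 ≈ 2.83.
Notice we never computed individual deviations: two totals (Σx and Σx²) were enough. That is the move examiners reward, because it is fast and leaves little room for sign errors.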
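The two-total method for the data 2, 4, 6, 8, 10 can be cross-checked with a few lines of Python. The variable names are chosen for this sketch only.

```python
from math import sqrt

data = [2, 4, 6, 8, 10]
n = len(data)

sum_x = sum(data)                     # 30
sum_x2 = sum(x * x for x in data)     # 4 + 16 + 36 + 64 + 100 = 220

mean = sum_x / n                      # 30 / 5 = 6.0
variance = sum_x2 / n - mean ** 2     # 44.0 - 36.0 = 8.0
sd = sqrt(variance)                   # sqrt(8) = 2 * sqrt(2), about 2.83

print(mean, variance, round(sd, 2))   # 6.0 8.0 2.83
```

So the mean is 6, the variance is 8 and the standard deviation is 2√2, roughly 2.83.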
Common Mistakes to Avoid
Most lost marks in this chapter come from careless handling of data, not difficult ideas.
- Forgetting to arrange data in order before finding the median.
- Dividing by the wrong total — use n for raw data but Σf for frequency data.
- Stopping at variance when the question asked for standard deviation (forgetting the square root).
- Using class limits instead of class marks as xi in grouped-data formulas.
In the empirical relation, the order matters: it is Mode = 3 Median − 2 Mean, not the other way round. Mixing up the coefficients is a frequent and costly slip.
Previous-Year Style Practice
Here is a question modelled on the NDA exam pattern. Try it before reading the solution.
Q. The mean of a distribution is 60 and its mode is 48. Using the empirical relationship, what is the value of the median?
Answer: The empirical relation is Mode = 3 Median − 2 Mean. So 48 = 3M − 2(60), giving 48 = 3M − 120, hence 3M = 168 and Median M = 56.
Notice how the single empirical formula linked all three averages, letting us find the unknown median in one line. Whenever a question gives you any two of mean, median and mode, reach straight for this relation.
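The empirical relation can be rearranged to give whichever average is missing. The three helper names below are invented for this sketch, and the relation itself is only an approximation for moderately skewed data, so the outputs are estimates rather than exact values.

```python
def mode_from(mean, median):
    """Mode = 3 * Median - 2 * Mean."""
    return 3 * median - 2 * mean


def median_from(mean, mode):
    """Rearranged: Median = (Mode + 2 * Mean) / 3."""
    return (mode + 2 * mean) / 3


def mean_from(median, mode):
    """Rearranged: Mean = (3 * Median - Mode) / 2."""
    return (3 * median - mode) / 2


# The practice question above: mean 60, mode 48.
print(median_from(mean=60, mode=48))   # 56.0
print(mode_from(mean=60, median=56))   # 48
print(mean_from(median=56, mode=48))   # 60.0
```

Feeding the answer back into the other two forms recovers the original mean and mode, which is a quick way to check your arithmetic.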
Quick Revision Before the Exam
Glance over these the night before your paper and the morning of it.
- Mean = (Σfᵢxᵢ)/(Σfᵢ); uses all data, sensitive to outliers.
- Median = middle value; unaffected by outliers; grouped formula uses l, cf, f, h.
- Mode = most frequent value; empirical: Mode = 3 Median − 2 Mean.
- Variance σ² = (Σx²)/n − (x̄)²; SD σ = √variance.
- CV = (σ/x̄) × 100; lower CV means more consistent.
Practise 8–10 mixed numericals daily for a few days before the exam. Statistics rewards careful arithmetic and a steady hand far more than cleverness.
Frequently asked questions
What is the difference between mean, median and mode?
The mean is the arithmetic average of all values, the median is the middle value when data is arranged in order, and the mode is the value that occurs most often. The mean uses every observation, while the median and mode focus on position and frequency.
Why is the median preferred over the mean for skewed data?
The median is unaffected by extreme values, whereas a single very large or very small value can pull the mean far off centre. For incomes, house prices and other skewed data, the median gives a fairer picture of the typical value.
What is the relationship between variance and standard deviation?
Standard deviation is simply the positive square root of the variance. Variance is measured in squared units, while standard deviation shares the same units as the original data, which makes it easier to interpret.
How do I compare the consistency of two data sets?
Use the coefficient of variation, CV = (standard deviation / mean) times 100. The data set with the lower coefficient of variation is more consistent or stable, regardless of differences in units or averages.
What is the empirical relationship between mean, median and mode?
For a moderately skewed distribution, Mode = 3 times Median minus 2 times Mean. If any two of the three measures are known, this formula lets you quickly calculate the third, which is a frequent NDA exam shortcut.