Math · Ages 14-18 · Advanced · 10 min read

Grouped Data and Class Intervals

Organise large data sets into class intervals, estimate the mean using midpoints, find the modal and median classes, and read a frequency table, with worked examples.

Key takeaways

  • Grouping data into class intervals tames large data sets but loses exact values
  • Estimate the mean using midpoints: total of (midpoint × frequency) ÷ total frequency
  • The modal class has the highest frequency; the median class contains the middle value

When there is too much data

Imagine recording the height of 200 students. A frequency table listing every single height (161 cm, 161.5 cm, 162 cm, and so on) would be enormous and hard to read. Instead we sort the data into class intervals, such as 150–160 cm and 160–170 cm. This is grouped data: easier to handle and to chart, at the cost of losing the exact values.

Reading class intervals

A class interval is a range of values written with inequalities to avoid overlap. Consider the time, in minutes, that 30 students spent on homework.

Time, t (minutes) | Frequency
0 ≤ t < 10        | 4
10 ≤ t < 20       | 11
20 ≤ t < 30       | 9
30 ≤ t < 40       | 6

The notation 10 ≤ t < 20 means "10 or more, but less than 20". A student who spent exactly 20 minutes falls in the next class (20 ≤ t < 30), not this one. This careful boundary keeps every value in exactly one class. Total frequency = 4 + 11 + 9 + 6 = 30.
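The boundary rule can be checked with a short sketch. It assumes the class bounds from the homework table; the name `class_interval` is made up for this illustration.

```python
from bisect import bisect_right

# Class boundaries from the homework table: 0, 10, 20, 30 and the final upper bound 40.
BOUNDS = [0, 10, 20, 30, 40]

def class_interval(t):
    """Return (lower, upper) for the class with lower <= t < upper."""
    if not BOUNDS[0] <= t < BOUNDS[-1]:
        raise ValueError(f"{t} is outside the table's range")
    # bisect_right places a value equal to a boundary in the class that starts there.
    i = bisect_right(BOUNDS, t) - 1
    return BOUNDS[i], BOUNDS[i + 1]

print(class_interval(19.9))  # (10, 20)
print(class_interval(20))    # (20, 30)
```

A time of exactly 20 minutes lands in the 20 ≤ t < 30 class, matching the rule described above.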

The modal class

Because we no longer have individual values, we cannot give a single mode. Instead we give the modal class: the interval with the highest frequency.

The biggest frequency is 11, so the modal class is 10 ≤ t < 20 minutes.
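As a quick check, here is a minimal sketch that picks out the modal class from the homework table. The dictionary layout and the name `modal_class` are choices made for this example only.

```python
# Frequencies from the homework table, keyed by (lower, upper) class bounds.
frequencies = {(0, 10): 4, (10, 20): 11, (20, 30): 9, (30, 40): 6}

def modal_class(freqs):
    """Return the class with the highest frequency (the first one if there is a tie)."""
    return max(freqs, key=freqs.get)

lower, upper = modal_class(frequencies)
print(f"Modal class: {lower} <= t < {upper}")  # Modal class: 10 <= t < 20
```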

Estimating the mean with midpoints

We cannot find the exact mean, because we do not know each student's precise time. The standard method is to assume every value sits at the midpoint of its class. The midpoint is halfway between the class bounds.

Time, t     | Midpoint (x) | Frequency (f) | f × x
0 ≤ t < 10  | 5            | 4             | 20
10 ≤ t < 20 | 15           | 11            | 165
20 ≤ t < 30 | 25           | 9             | 225
30 ≤ t < 40 | 35           | 6             | 210
Total       |              | 30            | 620

Now apply the formula:

Estimated mean = total of (f × x) ÷ total frequency

Estimated mean = 620 ÷ 30 = 20.7 minutes (to 1 decimal place).

This is an estimate, because the midpoint assumption is rarely exactly true. We always write "estimated mean" for grouped data, never just "mean".
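The midpoint method can be sketched in a few lines. This is an illustration using the homework table, not a general statistics routine; `estimated_mean` and the list layout are assumptions made for the example.

```python
# Each entry is ((lower bound, upper bound), frequency) from the homework table.
classes = [((0, 10), 4), ((10, 20), 11), ((20, 30), 9), ((30, 40), 6)]

def estimated_mean(table):
    """Estimate the mean by treating every value as sitting at its class midpoint."""
    total_fx = 0
    total_f = 0
    for (lower, upper), f in table:
        midpoint = (lower + upper) / 2   # halfway between the class bounds
        total_fx += f * midpoint         # running total of f × x
        total_f += f                     # running total frequency
    return total_fx / total_f

print(round(estimated_mean(classes), 1))  # 20.7
```

The function reproduces the totals in the table: 620 for f × x and 30 for the frequency.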

The median class

With 30 values, the median sits halfway between the 15th and 16th values, at position (30 + 1) ÷ 2 = 15.5. We find which class it falls into using cumulative frequency (a running total).

Time, t     | Frequency | Cumulative frequency
0 ≤ t < 10  | 4         | 4
10 ≤ t < 20 | 11        | 15
20 ≤ t < 30 | 9         | 24
30 ≤ t < 40 | 6         | 30

  • The first 4 values are in class 1.
  • By the end of class 2 we have reached the 15th value.
  • The 16th value falls in the next class.

The 15th value is the last one in the second class and the 16th is the first in the third, so the median lies right at the 20-minute boundary between them. Because a time of exactly 20 minutes belongs to 20 ≤ t < 30, the median class is best given as 20 ≤ t < 30 minutes. When asked only for the median class, identify the interval containing the middle position.
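The running-total search can be sketched as follows. The sketch assumes the middle position is (n + 1) ÷ 2, which is 15.5 for 30 values and reproduces the answer above. Some courses use n ÷ 2 instead, which would give 10 ≤ t < 20 for this table, so check which convention your course expects. The name `median_class` is made up for this example.

```python
from itertools import accumulate

# Class bounds and frequencies from the homework table.
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [4, 11, 9, 6]

def median_class(classes, frequencies):
    """Return the first class whose cumulative frequency reaches the middle position."""
    cumulative = list(accumulate(frequencies))   # running totals: [4, 15, 24, 30]
    position = (cumulative[-1] + 1) / 2          # assumed convention: (n + 1) / 2
    for interval, running_total in zip(classes, cumulative):
        if running_total >= position:
            return interval

print(median_class(classes, frequencies))  # (20, 30)
```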

Activity: measure and group

  1. Collect 20–30 values, such as the length of each classmate's shoe in centimetres, or reaction times.
  2. Choose sensible equal class intervals (for example widths of 2 cm) and tally each value into one class.
  3. Identify the modal class.
  4. Build midpoint and f × x columns and calculate the estimated mean.
  5. Use cumulative frequency to find the median class, and discuss why your mean is only an estimate.
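If you would like to check your tally from step 2, a sketch like the one below may help. The shoe lengths are made-up values purely for illustration, and the function name `tally` and its `start` parameter are assumptions of this example.

```python
from collections import Counter

def tally(values, width, start=0):
    """Count values per class [lower, lower + width), with classes beginning at `start`."""
    counts = Counter()
    for v in values:
        # Floor division finds how many whole class widths lie below the value.
        lower = start + width * int((v - start) // width)
        counts[(lower, lower + width)] += 1
    return dict(sorted(counts.items()))

# Made-up shoe lengths in cm, for illustration only.
shoe_lengths = [23.5, 24.0, 25.9, 26.0, 24.8, 27.1, 22.0, 25.2]
print(tally(shoe_lengths, width=2, start=22))
# {(22, 24): 2, (24, 26): 4, (26, 28): 2}
```

A length of exactly 24.0 cm is counted in 24 ≤ x < 26, consistent with the boundary rule used throughout.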

Why this matters

Grouped data is how real surveys, censuses and scientific studies summarise thousands of measurements, and the midpoint method is the standard way to estimate an average from them. The trade-off, convenience for a small loss of accuracy, is a key idea in statistics. Strengthen the underlying skill with averages from a frequency table, and revisit the core averages in mean, median, mode and range.

Quick quiz

For the class interval 10 ≤ x < 20, what midpoint do you use to estimate the mean?

Why is the mean from grouped data only an ESTIMATE?

Which is the modal class if frequencies are: 0–10 has 4, 10–20 has 11, 20–30 has 6?

With 30 data values, which position do you locate to find the median class?

FAQ

Grouped data is data sorted into class intervals such as 0–10 and 10–20, rather than listing every individual value. It makes large data sets manageable.

Once data is grouped, the individual values are lost. We assume each value sits at the midpoint of its class, so the mean we calculate is an estimate.