
The Large Data Set

Edexcel provides a real weather Large Data Set (LDS) that exam questions draw on. You will not memorise the numbers, but you should know its structure, its quirks, and how to handle missing or anomalous values.

Video (20 min) by Zeeshan Zamurred: Edexcel AS Level Maths: The Large Data Set - Data Collection (Part 2). Watch the full walkthrough before the notes below.

What you'll be able to do

  • Understand what the Large Data Set is
  • Know the kinds of variables it contains
  • Handle missing data and anomalies
  • Use familiarity with the LDS in exam questions

1. What the LDS is

The Edexcel Large Data Set is real weather data from UK and overseas weather stations, recorded over set time periods. Exam questions assume you have worked with it and understand its layout and units.

2. Variables and units

It contains both quantitative variables (temperature, rainfall, wind speed, pressure) and qualitative ones (location). Knowing the units and typical ranges helps you spot impossible or anomalous values.

Tip — Know the units (e.g. wind speed in knots, rainfall in mm) — exam questions test whether a value is realistic.
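One way to practise judging whether a value is realistic is to write down a plausible range for each variable and flag anything outside it. The sketch below does this in Python. The variable names, the bounds, and the readings are all illustrative assumptions chosen for this example; none of them are taken from the LDS itself.

```python
# Illustrative plausibility bounds as (low, high) pairs.
# ASSUMPTION: these limits are rough guesses for demonstration only,
# not official ranges from the Large Data Set.
PLAUSIBLE_RANGES = {
    "daily_mean_temperature_c": (-30.0, 50.0),   # degrees Celsius
    "daily_total_rainfall_mm": (0.0, 500.0),     # millimetres
    "daily_mean_windspeed_kn": (0.0, 100.0),     # knots
}


def is_plausible(variable, value):
    """Return True if value lies inside the assumed range for variable."""
    low, high = PLAUSIBLE_RANGES[variable]
    return low <= value <= high


# Hypothetical readings, made up for this example.
readings = [
    ("daily_mean_temperature_c", 14.2),
    ("daily_mean_temperature_c", 142.0),  # looks like a misplaced decimal point
    ("daily_total_rainfall_mm", -3.0),    # rainfall cannot be negative
    ("daily_mean_windspeed_kn", 9.0),
]

# Keep only the readings that fall outside their assumed range.
flagged = [(name, value) for name, value in readings if not is_plausible(name, value)]

for name, value in flagged:
    print(f"Possible anomaly: {name} = {value}")
```

A flagged value is only a candidate anomaly. In an exam answer you would still justify excluding it by referring to the units and the context.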

3. Missing data and anomalies

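Some entries are not plain numbers. A blank or "n/a" means the reading is not available (data not recorded), and "tr" means trace rainfall, an amount too small to measure (less than 0.05 mm). None of these is a recorded value of 0, so they should not be treated as zeros. Anomalies (clearly wrong readings) may be identified and excluded, provided you can justify doing so.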

Blank/coded entries must not be read as zero.
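To see why this matters, the Python sketch below compares a mean calculated from the recorded values only with a mean that wrongly counts coded entries as 0. The rainfall list is made up for illustration, and the helper name `parse_reading` is hypothetical. Leaving "tr" out of the calculation entirely is a simplifying assumption of this sketch, not a rule from the LDS.

```python
from statistics import mean

# Hypothetical daily rainfall entries in mm, made up for this example.
raw_rainfall = ["0.0", "2.4", "n/a", "tr", "5.6", "n/a", "1.0"]

# Codes treated as "no usable number" in this sketch.
# ASSUMPTION: "tr" (trace) is skipped here for simplicity.
NON_NUMERIC_CODES = {"n/a", "tr", ""}


def parse_reading(entry):
    """Return the entry as a float, or None if it is blank or coded."""
    text = entry.strip().lower()
    if text in NON_NUMERIC_CODES:
        return None
    return float(text)


readings = [parse_reading(entry) for entry in raw_rainfall]

# Correct approach: use only the values that were actually recorded.
recorded = [r for r in readings if r is not None]
mean_recorded = mean(recorded)

# Common mistake: silently turning every coded entry into 0.
zero_filled = [0.0 if r is None else r for r in readings]
mean_if_zero = mean(zero_filled)

print(f"Mean of recorded values: {mean_recorded:.2f} mm")     # 2.25 mm
print(f"Mean with codes read as zero: {mean_if_zero:.2f} mm")  # 1.29 mm
```

Reading the three coded entries as zero pulls the mean down from 2.25 mm to about 1.29 mm, even though nothing was actually measured on those days.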

Formula recap

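  • What it is: real weather data that Edexcel provides for exam questions.
  • Missing data: blanks are not zeros.
  • For spotting anomalies: know the units and typical ranges.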

Common mistakes to avoid

  • Treating missing/blank entries as zero. Missing data means “not recorded”, which is different from a value of 0.
  • Ignoring units when judging whether a value is sensible. Knowing the units (and typical ranges) lets you spot anomalies.

Key takeaways

  • The LDS is real weather data Edexcel provides for exams.
  • It mixes quantitative (temperature, rainfall…) and qualitative (location) variables.
  • Missing entries are not zeros; know units and spot anomalies.
