Linear Regression

The regression line is the “line of best fit” through bivariate data, written y = a + bx. It lets you predict the response from the explanatory variable, but only safely within the range of the data.

Video (25 min) by Zeeshan Zamurred: Edexcel AS Level Maths: Chapter 4 Correlation (Part 2). Watch the full walkthrough before the notes below.

What you'll be able to do

  • Understand the regression line y = a + bx
  • Interpret the gradient and intercept in context
  • Use the line to make predictions
  • Distinguish interpolation from extrapolation

1. The regression line

The least-squares regression line of y on x has the form y = a + bx, where b is the gradient and a is the intercept. It is the line that best fits the data by minimising the sum of the squared vertical distances from the points to the line.

b = gradient (change in y per unit increase in x), a = intercept.
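
The least-squares idea can be sketched in a few lines of Python. This is only an illustration: the data values are invented, and the names `fit_line` and `sum_squared_residuals` are hypothetical helpers. It uses the standard formulas b = Sxy / Sxx and a = ȳ − b x̄, then checks that nudging the line makes the total squared vertical distance larger.

```python
# Invented data for illustration only (not from the notes).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]


def fit_line(xs, ys):
    """Return (a, b) for the least-squares regression line y = a + bx."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # gradient
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b


def sum_squared_residuals(a, b, xs, ys):
    """Total of the squared vertical distances from each point to the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


a, b = fit_line(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")
print("Fitted line:", sum_squared_residuals(a, b, xs, ys))
print("Shifted line:", sum_squared_residuals(a + 0.1, b, xs, ys))
```

Any other choice of a and b gives a larger total than the fitted line, which is what “best fit” means here.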

2. Interpreting the coefficients

In context, b is the predicted change in y for each one-unit increase in the explanatory variable, and a is the predicted response when x = 0. Always state units.

Example 1: interpret the gradient.
Answer: each extra £1000 of advertising predicts £2000 more sales.
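
To see where an interpretation like this comes from, here is a small sketch. The coefficients are assumptions made for illustration: the gradient b = 2 is chosen to match the answer above with both variables measured in £1000s, and the intercept a = 10 is invented. `predict` is a hypothetical helper name.

```python
# Assumed line: y = a + bx, with x = advertising (£1000s), y = sales (£1000s).
a = 10.0  # invented intercept, for illustration only
b = 2.0   # gradient chosen to match the worked answer


def predict(x):
    """Predicted sales (£1000s) for advertising spend x (£1000s)."""
    return a + b * x


# Increasing x by one unit always changes the prediction by b.
increase = predict(4.0) - predict(3.0)
print(f"Each extra £1000 of advertising predicts £{increase * 1000:.0f} more sales")

# The intercept is the prediction when x = 0.
print(f"With no advertising, predicted sales are £{predict(0.0) * 1000:.0f}")
```

The difference between two predictions one unit apart does not depend on where you start, which is why the gradient can be read as a “per unit” rate.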

3. Interpolation vs extrapolation

Predicting within the data range (interpolation) is reliable. Predicting outside it (extrapolation) is unreliable, because the linear pattern may not continue.

Tip — Trust predictions inside the data range; be wary of extrapolating beyond it.
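
One way to build this check into a calculation is to compare the new x value with the smallest and largest x values in the data before predicting. The sketch below uses invented data and a hypothetical helper name, `classify_prediction`.

```python
# Invented x values for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]


def classify_prediction(x_new, xs):
    """Say whether predicting at x_new is interpolation or extrapolation."""
    lowest, highest = min(xs), max(xs)
    if lowest <= x_new <= highest:
        return "interpolation"
    return "extrapolation"


for x_new in (3.5, 12.0):
    print(x_new, "->", classify_prediction(x_new, xs))
```

Here 3.5 lies between 1 and 5, so a prediction there is interpolation; 12 lies outside, so a prediction there is extrapolation and should be treated with caution.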

Formula recap

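Regression line of y on x: y = a + bx.
Gradient meaning: b = change in y per unit increase in x.
Prediction validity: reliable within the range of the data (interpolation); unreliable beyond it (extrapolation).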

Common mistakes to avoid

Trusting predictions far outside the data range.
Extrapolation is unreliable — the relationship may not hold there.
Using the line to predict x from y.
The y-on-x regression line is for predicting y from x, not the reverse.
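
The second mistake can be illustrated numerically. The sketch below, using invented data and a hypothetical helper `fit_line`, fits the line of y on x and then the line of x on y by swapping the roles of the variables. Rearranging the y-on-x line to make x the subject does not give the same line as fitting x on y directly.

```python
# Invented data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]


def fit_line(explanatory, response):
    """Return (intercept, gradient) of the least-squares line of response on explanatory."""
    n = len(explanatory)
    e_bar = sum(explanatory) / n
    r_bar = sum(response) / n
    s_er = sum((e - e_bar) * (r - r_bar) for e, r in zip(explanatory, response))
    s_ee = sum((e - e_bar) ** 2 for e in explanatory)
    gradient = s_er / s_ee
    return r_bar - gradient * e_bar, gradient


a, b = fit_line(xs, ys)  # y on x: y = a + bx
c, d = fit_line(ys, xs)  # x on y: x = c + dy

print(f"y on x: y = {a:.2f} + {b:.2f}x")
print(f"x on y: x = {c:.2f} + {d:.2f}y")
print(f"Rearranged y-on-x gradient for x: {1 / b:.2f}, x-on-y gradient: {d:.2f}")
```

The two gradients only agree when the points lie exactly on a straight line, so the y-on-x line should not be rearranged to predict x.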

Key takeaways

  • Regression line: y = a + bx (best fit).
  • b = change in y per unit x; a = y when x = 0.
  • Interpolation (within data) is reliable; extrapolation (beyond) is not.
