Getting started with linear regression using the ordinary least squares method

You often want to find the parameters that minimise some cost function. One approach is gradient descent. In this article, we will look at another approach, the normal equation, which solves for the parameters directly instead of iterating.


Prerequisites

  • Linear algebra basics
  • Vector basics

Getting started
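
For reference, the normal equation in its usual form. The notation is an assumption: X is the design matrix with one row per training example and a leading column of ones, y is the vector of target values, and θ is the vector of parameters we are solving for.

```latex
\theta = (X^{T} X)^{-1} X^{T} y
```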

Wow! One line. Also, with this method you don't need to do feature scaling.

Let's verify dimensions
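
A quick check of the shapes, assuming m training examples and n features, so that X has n + 1 columns once the column of ones is included. These symbols are assumed for illustration.

```latex
\begin{aligned}
X &\in \mathbb{R}^{m \times (n+1)}, \qquad y \in \mathbb{R}^{m \times 1} \\
X^{T} X &\in \mathbb{R}^{(n+1) \times m} \cdot \mathbb{R}^{m \times (n+1)} = \mathbb{R}^{(n+1) \times (n+1)} \\
(X^{T} X)^{-1} &\in \mathbb{R}^{(n+1) \times (n+1)} \\
X^{T} y &\in \mathbb{R}^{(n+1) \times m} \cdot \mathbb{R}^{m \times 1} = \mathbb{R}^{(n+1) \times 1} \\
\theta = (X^{T} X)^{-1} X^{T} y &\in \mathbb{R}^{(n+1) \times 1}
\end{aligned}
```

So θ ends up with one entry per column of X: one intercept plus one weight per feature.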

Octave implementation
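
A minimal sketch of how this might look in Octave. The toy data and the variable names (x, y, X, theta) are made up for illustration; only pinv and the normal equation itself come from the article.

```octave
% Hypothetical toy data: m = 4 examples, n = 1 feature (values are made up).
x = [1; 2; 3; 4];
y = [3; 5; 7; 9];            % generated from y = 1 + 2*x

m = length(y);
X = [ones(m, 1), x];         % prepend a column of ones for the intercept term

% Normal equation. pinv is used instead of inv so that a singular
% X' * X still produces a result.
theta = pinv(X' * X) * X' * y;

disp(theta);
```

Because the toy targets lie exactly on a line, theta should come out very close to the intercept and slope used to generate them.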


How does it work? (optional)

Please feel free to skip this section, as it is quite maths heavy. For brevity, we omit the full proof and only provide an outline.
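
One common way to outline the argument, assuming the usual squared-error cost for linear regression. The symbols J, m, X, y and θ are assumed notation here: write the cost in matrix form, set its gradient with respect to θ to zero, and solve.

```latex
\begin{aligned}
J(\theta) &= \frac{1}{2m} (X\theta - y)^{T} (X\theta - y) \\
\nabla_{\theta} J(\theta) &= \frac{1}{m} X^{T} (X\theta - y) = 0 \\
X^{T} X \theta &= X^{T} y \\
\theta &= (X^{T} X)^{-1} X^{T} y
\end{aligned}
```

The last step assumes that $X^{T} X$ is invertible.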


  • Some of you may have wondered what happens if $X^{T} X$ is non-invertible (singular). This is the reason we use pinv: it always gives us a value for $\theta$.
  • We need to compute $(X^{T} X)^{-1}$, which is slow if $n$ is very large, since most implementations have time complexity $O(n^{3})$ ($n$ is the number of dimensions).
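
To illustrate the first point, here is a small Octave sketch with a deliberately duplicated feature column, so that X' * X is singular. The data and variable names are made up for illustration.

```octave
% Made-up data: the second feature is an exact copy of the first,
% so X' * X is singular and has no ordinary inverse.
x1 = [1; 2; 3];
x2 = x1;
y  = [3; 5; 7];              % generated from y = 1 + 2*x1

X = [ones(3, 1), x1, x2];
A = X' * X;

r = rank(A);                 % smaller than 3, so A is singular
theta = pinv(A) * X' * y;    % pinv still returns a parameter vector

% The fitted values should still match y, even though theta is not unique.
residual = norm(X * theta - y);
disp(r);
disp(residual);
```

With a redundant feature there are many parameter vectors that fit equally well; pinv picks one of them rather than failing.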