Matrices & Vectors

In this section, we will cover:

  1. Scalars, which are single datapoints (x) – e.g. 6
  2. Vectors
  3. Matrices

Vectors

A vector is effectively a single-column table (an array). When we use it in data science, that column stores what would be a single row of data in a traditional table, for example all the details about one customer. We would use this notation to show that X is a vector:

In mathematics, a real number is any value on the continuous number line. This includes negative numbers such as -10, decimals such as 1.24562, and even irrational values such as the square root of 2. We use the capital ‘R’ notation to show when something is a real number.

The vector notation can be extended to show that a vector is composed of real numbers and how many features it contains.
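The original figure is not reproduced here. As a sketch of what the extended notation usually looks like, assuming the vector is written as a lowercase bold x (to keep it distinct from the matrix X used later) and has n features (both symbols are assumptions, not taken from the original):

```latex
\mathbf{x} \in \mathbb{R}^{n}
```

This reads as "x is a vector of n real numbers". A customer described by three numeric features would have n = 3.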

A vector should be written vertically, as a column. However, that can be very confusing because the vector represents a row of data. As such, we draw it horizontally (as per the third example) and add the superscript T, for ‘Transpose’, meaning to swap rows and columns.

We write it like this:
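The original figure is not reproduced here. A common way to write a transposed vector, assuming elements named x_1 to x_n (hypothetical names), is sketched below. The horizontal form with the superscript T is shorthand for the vertical column on the right:

```latex
\mathbf{x}
= \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^{T}
= \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
```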

We then have y as a function of the vector. This means our predicted value is computed from all of the values (features) within the vector.

This can also be denoted as:
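The original figures are not reproduced here, so the exact form used is unknown. One common pair of equivalent ways to write "y is a function of the vector" is sketched below, where f, y and n are placeholder symbols: first with the vector as a whole, then with its elements spelled out.

```latex
y = f(\mathbf{x}) = f(x_1, x_2, \ldots, x_n)
```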

When we draw vectors, each one produces its own line on a chart. This allows us to compare multiple vectors quite easily. The difference between two lines is the difference between the two vectors. We can also use a product, such as the dot product, to compare one vector with another.
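As a minimal sketch of comparing two vectors, the snippet below computes an element-wise difference and a dot product in plain Python. The values are made up for illustration, and the dot product is assumed here to be the kind of product meant in the text.

```python
# Two vectors with the same number of features (values are made up).
a = [2.0, 4.0, 6.0]
b = [1.0, 3.0, 8.0]

# Element-wise difference: how far apart the vectors are on each feature.
difference = [ai - bi for ai, bi in zip(a, b)]

# Dot product: multiply matching elements, then sum the results.
dot_product = sum(ai * bi for ai, bi in zip(a, b))

print(difference)   # [1.0, 1.0, -2.0]
print(dot_product)  # 62.0
```

The difference keeps one value per feature, while the dot product collapses the comparison into a single number.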

Matrices

A matrix is multiple vectors joined together into a more table-like structure. Unlike a vector, it has two dimensions: rows and columns.
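A small sketch of this idea in plain Python, where each customer vector becomes one row of the matrix. The feature names and values are hypothetical.

```python
# Each list is one vector: the details of one customer.
# Hypothetical features: [age, number_of_orders, total_spend].
customer_1 = [34.0, 5.0, 120.5]
customer_2 = [51.0, 2.0, 40.0]

# Joining the vectors gives a matrix with one row per customer.
matrix = [customer_1, customer_2]

n_rows = len(matrix)      # number of customers
n_cols = len(matrix[0])   # number of features per customer

print(n_rows, n_cols)     # 2 3
print(matrix[1][2])       # 40.0 (second customer, third feature)
```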

Here we have some matrix notation: 

First, we have X as a matrix of real numbers, with a defined number of rows and columns (shown in superscript).
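The original figure is not reproduced here. A common way to write this, assuming n rows and d columns (both symbols are assumptions), is:

```latex
X \in \mathbb{R}^{n \times d}
```

This reads as "X is a matrix of real numbers with n rows and d columns".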

Below, we have some matrix operations (addition, subtraction and multiplication).
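The original figures are not reproduced here. As a rough sketch of the three operations, the plain-Python helpers below (hypothetical names `add`, `subtract` and `multiply`) work on matrices stored as lists of rows. Addition and subtraction are element-wise and need matrices of the same shape; multiplication takes the dot product of each row of the first matrix with each column of the second.

```python
def add(m1, m2):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


def subtract(m1, m2):
    """Element-wise difference of two matrices of the same shape."""
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


def multiply(m1, m2):
    """Matrix product: rows of m1 combined with columns of m2."""
    if len(m1[0]) != len(m2):
        raise ValueError("columns of m1 must equal rows of m2")
    columns = list(zip(*m2))  # transpose m2 to iterate over its columns
    return [[sum(x * y for x, y in zip(row, col)) for col in columns]
            for row in m1]


m1 = [[1, 2], [3, 4]]
m2 = [[5, 6], [7, 8]]

print(add(m1, m2))       # [[6, 8], [10, 12]]
print(subtract(m1, m2))  # [[-4, -4], [-4, -4]]
print(multiply(m1, m2))  # [[19, 22], [43, 50]]
```

In practice a numerical library would normally handle this; the explicit loops are only there to show what each operation does.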

Other interesting notation

In the below example, we have single values within a function. A and B are the parameters. They are constants: the slope and y-intercept of a line.
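The original figure is not reproduced here. Assuming it shows the usual straight-line form, with A as the slope and B as the y-intercept, a minimal sketch looks like this (the function name `line` and the values are hypothetical):

```python
def line(x, A, B):
    """Return A * x + B, where A is the slope and B is the y-intercept."""
    return A * x + B


# With slope 2 and intercept 1:
print(line(0, 2, 1))  # 1 (at x = 0 the result is the intercept)
print(line(3, 2, 1))  # 7
```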

In the below example, we have two vectors: one is a vector of input values (features) and the other is a vector of parameters.
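The original figure is not reproduced here. One common way these two vectors are combined, assumed here rather than taken from the original, is to multiply each feature by its matching parameter and sum the results. The function name `predict` and all values are hypothetical.

```python
def predict(parameters, features):
    """Weighted sum of the features: one parameter per feature."""
    if len(parameters) != len(features):
        raise ValueError("both vectors must have the same length")
    return sum(p * x for p, x in zip(parameters, features))


parameters = [0.5, 2.0, -1.0]  # made-up parameter values
features = [4.0, 3.0, 1.0]     # made-up input values

print(predict(parameters, features))  # 7.0
```

This is the vector version of the single-value case: each feature gets its own parameter instead of one shared slope.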