# Stats & Math path.

Module: Statistics

### Statistical Sampling

When you’re conducting any kind of analysis, you need data. That data could be the results of a survey, for example. If you wanted to survey every student in a UK university, you would need to distribute, collect and analyse 15,000+ surveys. That is time-consuming, costly and impractical. Click here to read more.
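As a rough sketch of the alternative to surveying everyone, the snippet below draws a simple random sample from a population of 15,000 students. The student IDs, the sample size of 500 and the seed are assumptions made up for illustration.

```python
import random

# Hypothetical population: 15,000 student IDs (the figure used in the text).
population = list(range(1, 15_001))

# Fixed seed so the draw is repeatable; the value 42 is arbitrary.
rng = random.Random(42)

# Simple random sample without replacement: every student is equally
# likely to be picked and nobody is picked twice.
sample = rng.sample(population, k=500)

print(len(sample), "students sampled out of", len(population))
```

Surveying these 500 students is far cheaper than surveying all 15,000, at the cost of some sampling error.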

### Experimental Design

First, we need to state our hypothesis. For example, we may say ‘Air pollution causes asthma in children in urban settings’. This is the hypothesis that we wish to find evidence for or against as part of our statistical study. Click here to read more.

### Measures of Central Tendency

In this article, we’re going to cover the basic measures of central tendency. You’re probably familiar with mean, median and mode already, but it is worth revisiting them before continuing through our statistics articles. Click here to read more.
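A minimal sketch of the three measures, using Python's built-in `statistics` module on a small list of made-up scores:

```python
import statistics

# Hypothetical data, chosen so the three measures differ.
scores = [2, 3, 3, 5, 7, 10]

mean_score = statistics.mean(scores)      # sum of values / number of values
median_score = statistics.median(scores)  # middle value (average of the two middle values here)
mode_score = statistics.mode(scores)      # most frequent value

print(mean_score, median_score, mode_score)
```

Here the mean is 5, the median is 4 and the mode is 3, which shows that the three measures need not agree.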

### Measures of Variance

Variance is a key statistical topic. It tells us how tightly the data is clustered around the mean. Consider two classes that both have a mean exam score of 40%: does that mean all the students in both classes performed equally well or badly? Click here to read more.
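To make the two-classes question concrete, here is a small sketch with invented scores. Both classes average 40%, but the spread is very different. Population variance is used here as an assumption; a sample variance would divide by n - 1 instead.

```python
import statistics

# Hypothetical exam scores (in %) for two classes with the same mean.
class_a = [38, 39, 40, 41, 42]
class_b = [10, 20, 40, 60, 70]

mean_a = statistics.mean(class_a)
mean_b = statistics.mean(class_b)

# Population variance: the average squared distance from the mean.
var_a = statistics.pvariance(class_a)
var_b = statistics.pvariance(class_b)

print("means:", mean_a, mean_b)
print("variances:", var_a, var_b)
```

The means match, yet class B's variance is far larger, so its students did not all perform alike.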

### Percentiles, Quartiles & Box-Plots

Percentiles divide a population into 100 equal groups according to the distribution of values. A percentile can be any number between 1 and 99: whichever X you pick, X% of observations should fall below the Xth percentile. For example, if you’re in the 60th percentile, your value is greater than 60% of all observations. Click here to read more.
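The sketch below illustrates the idea on made-up data (the integers 1 to 100). `percentile_rank` is a hypothetical helper written for this example, and `statistics.quantiles` with `n=4` returns the three quartile cut points that a box plot is built from.

```python
import statistics

# Hypothetical observations: the integers 1..100.
data = list(range(1, 101))

def percentile_rank(values, x):
    """Percentage of observations strictly below x (illustrative helper)."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

rank_of_61 = percentile_rank(data, 61)

# Three cut points that split the data into four equal groups (Q1, median, Q3).
quartiles = statistics.quantiles(data, n=4)

print(rank_of_61)
print(quartiles)
```

In this data set the value 61 has 60 observations below it, so it sits at the 60th percentile.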

### Correlation

TechTarget defines correlation as: “a statistical measure that indicates the extent to which two or more variables fluctuate together”. There are two major components to correlation: strength and type of correlation. Click here to read more.
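One common way to put a number on this is the Pearson correlation coefficient, sketched below by hand on invented data. Reading "type" as the sign of the coefficient (positive or negative) and "strength" as its size is an assumption about what the full article means.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two variables move together around their means.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Scale by each variable's own spread so the result lies in [-1, 1].
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_movement / (spread_x * spread_y)

# Hypothetical data.
x = [1, 2, 3, 4, 5]
rises_with_x = [2, 4, 6, 8, 10]
falls_with_x = [10, 8, 6, 4, 2]

r_positive = pearson(x, rises_with_x)
r_negative = pearson(x, falls_with_x)
print(r_positive, r_negative)
```

A value near +1 or -1 indicates a strong relationship, a value near 0 a weak one, and the sign gives the direction.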

### Linear Regression

Linear regression provides a rule that enables us to make predictions of Y based on X. Effectively, it fits a line of best fit to a scatter plot, chosen so that the sum of the squared distances between the data points and the line is as small as possible. Click here to read more.
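For a single X variable, the least-squares line has a closed-form solution. The sketch below applies it to invented points. `fit_line` is a hypothetical helper, not a library function.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-movement of x and y divided by the spread of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
prediction = slope * 6 + intercept  # predict Y for a new X of 6
print(slope, intercept, prediction)
```

With real, noisy data the points would not sit exactly on the line, and the fitted line would be the one with the smallest total squared error.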

### Normal Distribution & The Empirical Rule

A normal distribution looks a little bit like the below. It’s where the mean, median and mode are on top of one another. Click here to read more.

### Standard Normal Distributions

We’ve discussed normal distributions in previous articles and today we’re going to talk about standard normal distributions. By standardising distributions, we can compare two different distributions directly. Remember, a normal distribution has no skew, is symmetrical and has the mean at its highest point. Click here to read more.

### Z-Scores & Probability

When we look at a standard normal distribution, we see something like the below. Here, the mean, median and mode sit on top of one another at the highest point, and the z-scores run along the bottom. Having a standard normal distribution helps us to compare two different normal distributions that have different means and standard deviations. Click here to read more.
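As a sketch of that comparison, the snippet below converts two scores from different (invented) exams into z-scores, then uses the standard normal distribution to turn each z-score into the probability of scoring below it. The exam means and standard deviations are assumptions for illustration.

```python
from statistics import NormalDist

def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd

# Hypothetical exams: A has mean 60 and sd 5, B has mean 75 and sd 10.
z_a = z_score(70, mean=60, sd=5)
z_b = z_score(80, mean=75, sd=10)

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist()
p_below_a = standard_normal.cdf(z_a)  # share of results below the exam A score
p_below_b = standard_normal.cdf(z_b)  # share of results below the exam B score

print(z_a, z_b)
print(round(p_below_a, 3), round(p_below_b, 3))
```

Although 80 is the higher raw score, the 70 on exam A is further above its own mean (z = 2.0 versus z = 0.5), so it is the stronger result in relative terms.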

### Confidence Intervals

A confidence interval defines a range of values that we’re fairly sure our true value lies in. Confidence intervals enable us to mitigate the impact of sampling error (where the sample mean is not equal to the true mean and each sample mean is different). Click here to read more.
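A rough sketch of a 95% confidence interval for a mean is shown below. It assumes the normal approximation (sample mean plus or minus about 1.96 standard errors) and uses invented measurements. For a sample this small, a t-distribution would usually be preferred, so treat the numbers as illustrative only.

```python
import math
import statistics

# Hypothetical sample of 10 measurements.
sample = [48, 52, 50, 47, 53, 49, 51, 50, 46, 54]

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)       # sample standard deviation (n - 1)
standard_error = sample_sd / math.sqrt(n)  # how much sample means vary

# z value leaving 2.5% in each tail of a standard normal (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)

lower = sample_mean - z * standard_error
upper = sample_mean + z * standard_error
print(f"95% interval: {lower:.2f} to {upper:.2f}")
```

A larger sample shrinks the standard error, which narrows the interval.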

### Null Hypothesis

Hypothesis testing enables us to test whether a given hypothesis is true. For example, let’s say that we have an engine component designed for a Formula One car. Click here to read more.

Module: Linear Algebra

### Matrices & Vectors

A vector is effectively a single-column table (an array). When we use it in data science, that column will store what would be a row of data in a traditional table: for example, all the details about one customer. We would use this notation to show that X is a vector: Click here to read more.