Central Tendency

In this article, we’re going to cover the basic measures of central tendency. You’re probably familiar with mean, median and mode already, but it is worth revisiting them before continuing through our statistics articles.

The mode is the value that occurs most frequently. As you can see below, it is possible for a dataset to have no mode, and it’s also possible for a dataset to have more than one mode.

The mode is not very stable or resistant. That means changing a single value in the dataset can remove the mode or shift it significantly. The mode is not very widely used for numeric data, as it often tells us little about where most of the values sit.
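To make the mode concrete, here is a minimal Python sketch. The helper name `modes` and the sample lists are made up for illustration, and the sketch assumes the convention that a dataset in which no value repeats has no mode.

```python
from collections import Counter


def modes(data):
    """Return the most frequent value(s), or [] if no value repeats."""
    counts = Counter(data)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Assumed convention: if every value occurs once, there is no mode.
        return []
    return [value for value, count in counts.items() if count == highest]


print(modes([1, 2, 2, 3, 4]))  # [2]     one mode
print(modes([1, 1, 2, 3, 3]))  # [1, 3]  two modes
print(modes([1, 2, 3, 4]))     # []      no mode

# Changing a single value (one 9 becomes a 2) moves the mode from 9 to 2.
print(modes([2, 2, 3, 9, 9, 9]))  # [9]
print(modes([2, 2, 3, 2, 9, 9]))  # [2]
```

The last two lines show the instability described above: one edited datapoint is enough to move the mode from one end of the data to the other.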

The median is much more useful than the mode. It tells us what the middle of the data is once the values are sorted. As below, you can see that when there is an odd number of datapoints, the median is equal to the point exactly in the centre. If there is an even number of datapoints, we need to take the two most central items, add them together and divide the result by 2.
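The odd and even cases can be sketched in a few lines of Python. The function name `middle_value` and the sample data are illustrative only; in practice Python’s built-in `statistics.median` does the same job.

```python
def middle_value(values):
    """Median: the centre item, or the average of the two centre items."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one item sits in the centre.
        return ordered[mid]
    # Even count: add the two most central items and divide by 2.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(middle_value([7, 1, 5, 3, 9]))      # 5    (sorted: 1 3 5 7 9)
print(middle_value([7, 1, 5, 3, 9, 11]))  # 6.0  (sorted: 1 3 5 7 9 11)
```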

The median tells us the 50th percentile of the data. That means half of the datapoints sit below the median and half sit above it.

The median is quite resistant (stable). A few outliers do not drastically change the median.

The mean is very widely used and is often referred to simply as the average. The symbols below are important to know when working with the mean.
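For reference, the notation most commonly used for the mean is sketched here. Which symbols the article has in mind is an assumption on our part: conventionally, x̄ is the mean of a sample of n datapoints, μ is the mean of a population of N datapoints, and Σ means "add up all of the values".

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\qquad\qquad
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i
```

In both cases the recipe is the same: sum every datapoint and divide by how many there are.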

The problem is that the mean is not resistant to outliers. A single outlier can skew the mean quite heavily. To deal with this, we can use a trimmed mean or a weighted average.
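A quick way to see the difference in resistance is to compare the mean and the median before and after one value is replaced by an outlier. The numbers below are invented purely for illustration.

```python
from statistics import median

typical = [30, 32, 35, 38, 40]
with_outlier = [30, 32, 35, 38, 400]  # the last value is now an outlier

# Mean: sum of the values divided by how many there are.
mean_typical = sum(typical) / len(typical)
mean_outlier = sum(with_outlier) / len(with_outlier)

print(mean_typical, median(typical))       # 35.0 35
print(mean_outlier, median(with_outlier))  # 107.0 35
```

One outlier roughly triples the mean here, while the median does not move at all.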

For a trimmed mean, we order all of the data, remove the top 5% and bottom 5% of datapoints, and take the mean of what remains. This should remove most outliers. We would call this the 5% trimmed mean.
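The trimming step can be sketched as follows. The function name `trimmed_mean`, its `percent` parameter and the sample data are hypothetical, and the sketch assumes that the number of datapoints dropped from each end is rounded down; other rounding conventions exist.

```python
def trimmed_mean(values, percent=5):
    """Mean after dropping `percent`% of datapoints from each end."""
    if not values:
        raise ValueError("trimmed mean of an empty dataset is undefined")
    if not 0 <= percent < 50:
        raise ValueError("percent must be at least 0 and below 50")
    ordered = sorted(values)
    # Number of datapoints to drop from each end (rounded down, by assumption).
    k = len(ordered) * percent // 100
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


data = list(range(1, 20)) + [1000]  # 1, 2, ..., 19 plus one large outlier

print(sum(data) / len(data))  # 59.5  plain mean, pulled up by the outlier
print(trimmed_mean(data))     # 10.5  5% trimmed mean (drops 1 and 1000)
```

With 20 datapoints, 5% is exactly one datapoint per end, so the outlier 1000 and the smallest value 1 are both removed before averaging.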

We could also use the weighted average. We do this when some datapoints carry more weight than others. For example, in the below table, the student got an A in their coursework, but their coursework is only worth 20% of their grade and the quiz is only worth 10%. As such, we weight each score by its share of the final grade.
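Here is a minimal sketch of the weighting calculation. Only the 20% coursework and 10% quiz weights come from the example above; the scores, and the "exam" component that takes the remaining 70%, are hypothetical values chosen for illustration.

```python
# Each entry is (score out of 100, weight as a percentage of the final grade).
# The scores and the "exam" entry are hypothetical.
components = {
    "coursework": (90, 20),
    "quiz": (80, 10),
    "exam": (60, 70),
}

total_weight = sum(weight for _, weight in components.values())
weighted_average = (
    sum(score * weight for score, weight in components.values()) / total_weight
)

scores = [score for score, _ in components.values()]
plain_average = sum(scores) / len(scores)

print(weighted_average)         # 68.0
print(round(plain_average, 2))  # 76.67
```

The strong coursework score lifts the plain average, but because coursework counts for only 20% of the grade, the weighted average ends up much closer to the exam score.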

Note: in a perfectly normal distribution, the mean, median and mode are equal. In real data that is approximately normal, they will be very close to one another.
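This can be checked roughly by drawing a sample from a normal distribution and comparing the two measures. The distribution parameters (centre 100, spread 15), the sample size and the seed below are arbitrary choices. The mode is left out because continuous values almost never repeat, so it would need binning to be meaningful.

```python
import random
from statistics import mean, median

random.seed(0)  # fixed seed so the run is repeatable

# 10,000 draws from a normal distribution centred on 100 with spread 15.
sample = [random.gauss(100, 15) for _ in range(10_000)]

sample_mean = mean(sample)
sample_median = median(sample)

print(round(sample_mean, 2), round(sample_median, 2))
print("difference:", round(abs(sample_mean - sample_median), 2))
```

Both values should land near 100, with only a small gap between them.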