
Online lesson from biological data analysis modules

What will you learn?

In this lesson you will learn to:

  1. Define a distribution
  2. Explain the concept of an empirical distribution
  3. Explain the concept of a theoretical distribution

What is a distribution?

A distribution describes a variable by communicating two important pieces of information.

  1. It specifies all the possible values that a variable may take (these can be called the possible outcomes).
  2. It quantifies the relative frequency of each possible value (this can be called the probability of an outcome).

A distribution is very good at describing uncertainty because it can describe events that have multiple possible outcomes.
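The two pieces of information above (the possible outcomes and their relative frequencies) can be sketched in a few lines of Python. This is only an illustration: the function name `empirical_distribution` and the list of observations are invented for the example and do not come from the lesson's data.

```python
from collections import Counter

def empirical_distribution(observations):
    """Return {outcome: relative frequency} for a list of observations."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {outcome: count / total for outcome, count in counts.items()}

# Hypothetical observations, invented purely for illustration.
observations = ["a", "b", "a", "c", "a", "b"]

distribution = empirical_distribution(observations)
for outcome, relative_frequency in distribution.items():
    print(f"{outcome}: {relative_frequency:.3f}")
```

The keys of the result are the possible outcomes that were seen, and the values are relative frequencies that add up to one.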

Below are four examples.

Rolling a Dice

Rolling a dice is a classic example of qualitative data (each face of the dice is labelled).

There are six possible outcomes from rolling a dice.

For an unbiased dice each outcome is equally likely, meaning the distribution of outcomes is:

Outcome:      One   Two   Three  Four  Five  Six
Probability:  1/6   1/6   1/6    1/6   1/6   1/6

The probability of each outcome is a sixth.

Later, we'll see that this is an example of a categorical distribution.
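As a rough check of the dice example, the sketch below simulates many rolls of an unbiased dice and compares the simulated relative frequencies with the probability of a sixth. The number of rolls, the random seed and the function name `simulate_rolls` are assumptions made for this illustration.

```python
import random
from collections import Counter

FACES = ["One", "Two", "Three", "Four", "Five", "Six"]

# For an unbiased dice every face has probability 1/6.
theoretical = {face: 1 / 6 for face in FACES}

def simulate_rolls(n_rolls, seed=1):
    """Roll an unbiased dice n_rolls times and return relative frequencies."""
    rng = random.Random(seed)
    counts = Counter(rng.choice(FACES) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in FACES}

empirical = simulate_rolls(60_000)
for face in FACES:
    print(f"{face:>5}: simulated {empirical[face]:.3f}, "
          f"theoretical {theoretical[face]:.3f}")
```

With this many rolls the simulated frequencies should sit close to 1/6, but they will not match it exactly.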

Counting Cats

For one week I recorded the number of times I saw my neighbours' cats in my garden and their coat colour.

Here are my data as a frequency distribution.

Black Ginger Tabby Tortoiseshell

There are four outcomes (the coat colours). The frequency distribution gives the number of times each outcome occurred (e.g. I saw the Tabby cat eight times).

Human Heights

Human height data is quantitative continuous: Every individual has a different height.

To display this continuous distribution we divide the x-axis (height) into bins (1 cm bins are used above) and count the number of data points within each bin (called FREQUENCY on the y-axis).
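The binning step can be sketched as follows. The heights here are simulated stand-ins (assumed to be roughly Normal with a mean of 170 cm and a standard deviation of 8 cm), not the lesson's data, and `bin_counts` is a hypothetical helper written for this example.

```python
import math
import random
from collections import Counter

def bin_counts(values, bin_width):
    """Count how many values fall in each bin [k * width, (k + 1) * width)."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    # Key each bin by its left edge, in increasing order.
    return {k * bin_width: counts[k] for k in sorted(counts)}

# Hypothetical heights in cm, simulated for illustration only.
rng = random.Random(42)
heights = [rng.gauss(170, 8) for _ in range(1000)]

frequency = bin_counts(heights, bin_width=1)
for left_edge, count in frequency.items():
    print(f"{left_edge:>4}-{left_edge + 1:<4} cm: {count}")
```

Changing `bin_width` changes how fine or coarse the displayed distribution looks, while the underlying data stay the same.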


Temperatures

Below is the distribution of mean daily temperatures for January across Ireland. The data are from 1995-2016.

This temperature data is quantitative continuous.

To display the distribution, a bin width of 0.5 °C has been used.

Histogram of mean daily temperature for January in Ireland

Empirical Distribution

Wolf Example

Below is the empirical distribution of cortisol concentrations measured from a sample of 103 wolves.

Empirical distribution of cortisol concentrations

This module uses these data on cortisol concentrations measured from hair samples of wolves in Canada (described here).

Theoretical Distributions

Normal Distribution

The Normal distribution is a bell-shaped curve that has a well defined mathematical description.

An example of the normal distribution

Above is a Normal distribution (blue curve) being used to mimic human height data (described here).
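The mathematical description of the Normal distribution is its density function, which depends on two parameters: the mean and the standard deviation. Below is a minimal sketch of that formula. The mean of 170 cm and standard deviation of 8 cm are assumed values chosen for illustration, not estimates from the height data.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a Normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

# Assumed, illustrative parameters (not fitted to any real data).
mean, sd = 170.0, 8.0

for x in (154, 162, 170, 178, 186):
    print(f"height {x} cm: density {normal_pdf(x, mean, sd):.5f}")
```

The density is highest at the mean and takes the same value at equal distances either side of it, which is what gives the curve its bell shape.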

Categorical Distribution

The categorical distribution describes the probabilities of a finite number of outcomes. The mathematical description of the categorical distribution is the probabilities for each outcome.

An example of a categorical distribution for a six sided dice

Above is a categorical distribution for the outcomes of rolling an unbiased dice. Each outcome has a probability of 1/6.

Gamma Distribution

The Gamma distribution is a skewed theoretical distribution. It has a well defined mathematical description.

An example of five gamma distributions with differing amounts of skew

Above are five Gamma distributions with differing amounts of (right) skew.
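One common way to write the Gamma distribution uses a shape parameter and a scale parameter, and in that form the amount of skew depends only on the shape. The sketch below assumes this shape-scale form. The five shape values are arbitrary choices for illustration and are not necessarily those used in the figure.

```python
import math

def gamma_pdf(x, shape, scale):
    """Density of a Gamma distribution (shape-scale form), defined for x > 0."""
    if x <= 0:
        return 0.0
    return (x ** (shape - 1) * math.exp(-x / scale)) / (
        math.gamma(shape) * scale ** shape
    )

def gamma_skewness(shape):
    """Skewness of a Gamma distribution: 2 / sqrt(shape)."""
    return 2 / math.sqrt(shape)

# Arbitrary shape values chosen for illustration.
for shape in (1, 2, 4, 8, 16):
    print(f"shape={shape:>2}: skewness={gamma_skewness(shape):.2f}")
```

Larger shape values give less skew, so the curve looks more and more like a symmetrical bell.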

Log-normal Distribution

The log-normal distribution is another skewed theoretical distribution. It has a well defined mathematical description.

An example of a log-normal distribution

Above are four log-normal distributions with differing amounts of (right) skew.
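A variable has a log-normal distribution when its logarithm has a Normal distribution. The sketch below illustrates this by simulation. The parameter values, sample size and seed are assumptions chosen for the example.

```python
import math
import random
import statistics

# Assumed parameters of the underlying Normal distribution.
mu, sigma = 0.0, 0.5

rng = random.Random(7)
# A log-normal value is the exponential of a Normal value.
samples = [math.exp(rng.gauss(mu, sigma)) for _ in range(50_000)]

# Taking logs should recover values that look Normal with mean mu and sd sigma.
logs = [math.log(x) for x in samples]
print("mean of log(values):", round(statistics.fmean(logs), 3))
print("sd of log(values):  ", round(statistics.stdev(logs), 3))
```

All simulated values are positive, and the mean and standard deviation of their logarithms should be close to `mu` and `sigma`.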

Binomial Distribution

The binomial distribution describes the number of successes and failures from repeatedly performing a task with a constant probability of success. It has a well defined mathematical description.

An example of a binomial distribution

Above is the binomial distribution for tossing a coin 10 times. The outcomes are the number of heads (ranging from zero to ten) and the distribution gives the probability of each outcome.
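The probabilities for the coin example can be computed directly from the binomial formula. This sketch assumes a fair coin (probability of heads 0.5), and `binomial_pmf` is a helper name introduced here for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Ten tosses of a coin assumed to be fair.
n_tosses, p_heads = 10, 0.5

distribution = {k: binomial_pmf(k, n_tosses, p_heads) for k in range(n_tosses + 1)}
for heads, probability in distribution.items():
    print(f"{heads:>2} heads: {probability:.4f}")
```

The eleven probabilities (for zero to ten heads) add up to one, with five heads being the most likely outcome.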

Chi-squared Distribution

The Chi-squared distribution describes the distribution of outcomes from squaring independent values drawn from a standard Normal distribution (mean of zero, standard deviation of one) and then adding them up. It has a well defined mathematical description.

An example of a chi-squared distribution

Above is the distribution of outcomes from taking three values drawn from a Normal distribution (with a mean of zero and a standard deviation of one), squaring each and adding up the results. This is a Chi-squared distribution with three degrees of freedom.
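The recipe above can be followed literally in a short simulation. The number of repeats and the seed are assumptions. The comparison values in the comments are the known mean (the degrees of freedom) and variance (twice the degrees of freedom) of a Chi-squared distribution.

```python
import random
import statistics

def chi_squared_draw(rng, degrees_of_freedom):
    """Sum of squared draws from a standard Normal distribution."""
    return sum(rng.gauss(0, 1) ** 2 for _ in range(degrees_of_freedom))

rng = random.Random(3)
draws = [chi_squared_draw(rng, 3) for _ in range(50_000)]

# Theory for three degrees of freedom: mean 3, variance 6.
print("simulated mean:    ", round(statistics.fmean(draws), 2))
print("simulated variance:", round(statistics.variance(draws), 2))
```

Because each term is a square, every draw is non-negative, which is one reason the distribution is right-skewed.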

Poisson Distribution

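The Poisson distribution describes a count of events (a quantitative discrete variable). It is related to the binomial distribution: it approximates a binomial distribution with a large number of trials, each having a small probability of success. It has a well defined mathematical description.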

An example of a Poisson distribution

Above is the Poisson distribution for describing the number of times a person will be hit by lightning in their lifetime, assuming the probability that a person is struck by lightning in their lifetime is about 1 in 15,000.
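Below is a minimal sketch of the Poisson probabilities for the lightning example, together with a binomial calculation to show how the two distributions are related. Two assumptions are made here that the lesson does not state: the quoted lifetime figure is used as the mean number of strikes (1/15,000), and a lifetime is split into 10,000 hypothetical trials for the binomial comparison.

```python
import math

def poisson_pmf(k, rate):
    """Probability of exactly k events when the mean number of events is rate."""
    return rate**k * math.exp(-rate) / math.factorial(k)

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Assumption: mean number of lightning strikes per lifetime.
rate = 1 / 15_000
# Assumption: number of hypothetical trials used for the binomial comparison.
n_trials = 10_000

for strikes in range(3):
    p_poisson = poisson_pmf(strikes, rate)
    p_binomial = binomial_pmf(strikes, n_trials, rate / n_trials)
    print(f"{strikes} strikes: Poisson {p_poisson:.3e}, binomial {p_binomial:.3e}")
```

The two sets of probabilities should agree very closely, because the binomial here has many trials and a tiny probability of success on each.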

Wolf Example

Either the Gamma or the log-normal distribution could mimic the empirical distribution of the wolf cortisol data.

The empirical distribution of cortisol concentration from a sample of wolves and two theoretical distributions that model these empirical data.

Above is the empirical distribution of cortisol (grey bars), a Gamma distribution (red) and a log-normal distribution (blue).
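One simple way to choose the parameters of a theoretical distribution so that it mimics a sample is to match the sample mean and variance (the method of moments). The lesson does not say how the curves in the figure were fitted, so this is only a sketch of one possible approach. The data below are simulated stand-ins, not the wolf cortisol measurements.

```python
import math
import random
import statistics

def fit_gamma_by_moments(data):
    """Gamma shape and scale that match the sample mean and variance."""
    mean = statistics.fmean(data)
    variance = statistics.variance(data)
    return {"shape": mean**2 / variance, "scale": variance / mean}

def fit_lognormal_by_moments(data):
    """Log-normal mu and sigma that match the sample mean and variance."""
    mean = statistics.fmean(data)
    variance = statistics.variance(data)
    sigma_squared = math.log(1 + variance / mean**2)
    return {"mu": math.log(mean) - sigma_squared / 2,
            "sigma": math.sqrt(sigma_squared)}

# Hypothetical right-skewed sample of 103 values (NOT the wolf data).
rng = random.Random(11)
cortisol_like = [rng.gammavariate(4.0, 3.0) for _ in range(103)]

print("Gamma fit:     ", fit_gamma_by_moments(cortisol_like))
print("Log-normal fit:", fit_lognormal_by_moments(cortisol_like))
```

Both fitted distributions share the sample's mean and variance but differ in the details of their shape, which is why more than one theoretical distribution can plausibly describe the same data.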

Distribution Shapes

A distribution can be broadly described by its shape.

Below are some words used to describe a distribution's shape

Some words describing the shapes of a distribution


Symmetrical

A symmetrical distribution looks identical when it is reflected around its centre.

Below is a Normal distribution with mean=12 and standard deviation=5.

Example of a symmetric distribution

The Normal distribution is symmetrical about its mean.

Skewed (1)

A skewed distribution is asymmetrical.

Below right is a Gamma distribution with mean=12 and standard deviation=5. The symmetrical Normal distribution is shown in grey.

An example of a left-skewed distribution (left) and an example of a right-skewed distribution (right)

A right-skewed (positively skewed) distribution has an extended tail on the right.

A left-skewed (negatively skewed) distribution is the opposite: its extended tail is on the left.

Right-skew commonly makes the mean larger than the median, and left-skew commonly makes it smaller, because the mean is influenced by extreme values.
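The effect of skew on the mean and median can be checked by simulation. The sketch below draws from a Gamma distribution with mean=12 and standard deviation=5 (the values used in the figure above) and then reflects the values around 12 to create a left-skewed version. The conversion from mean and standard deviation to shape and scale, the sample size and the seed are assumptions of this example.

```python
import random
import statistics

mean_target, sd_target = 12.0, 5.0
# Shape-scale parameters that give the target mean and standard deviation.
shape = (mean_target / sd_target) ** 2   # 5.76
scale = sd_target**2 / mean_target       # about 2.08

rng = random.Random(5)
right_skewed = [rng.gammavariate(shape, scale) for _ in range(50_000)]
# Reflecting every value around 12 turns right-skew into left-skew.
left_skewed = [2 * mean_target - x for x in right_skewed]

for name, data in (("right-skewed", right_skewed), ("left-skewed", left_skewed)):
    print(f"{name}: mean={statistics.fmean(data):.2f}, "
          f"median={statistics.median(data):.2f}")
```

In the right-skewed sample the mean should come out above the median, and in the reflected sample the order should be reversed.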

Over-dispersed

An over-dispersed distribution has an excess of extreme values (i.e. it has fat tails).

Below left is a t-distribution, shifted and scaled to have mean=12 and standard deviation=5. The equivalent Normal distribution (no over-dispersion) is shown in grey. On the right is a zoom into the tail of the distribution showing the 'fat tail'.

An example of an over-dispersed distribution (left) and a zoom of the tail from an over-dispersed distribution (right)

Over-dispersed is also known as leptokurtic (lepto- means slender, which refers to the narrow central peak that accompanies the fat tails).


Under-dispersed

An under-dispersed distribution has a deficit of extreme values (i.e. it has thin tails).

Below is a Uniform distribution with mean=12 and standard deviation=5. The equivalent Normal distribution (no under-dispersion) is shown in grey.

An example of an under-dispersed distribution

Under-dispersed is also known as platykurtic (platy- means broad).
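Fat and thin tails can be compared by counting how often values fall more than three standard deviations from the mean. The sketch below does this for a t-distribution, a Uniform distribution and a Normal distribution, all with mean=12 and standard deviation=5. The lesson does not give the degrees of freedom of its t-distribution, so the value of 5 used here is an assumption, as are the sample size and seed.

```python
import math
import random
from statistics import NormalDist

MEAN, SD = 12.0, 5.0
N = 100_000
rng = random.Random(9)

def t_draw(df):
    """One t-distributed draw rescaled to mean MEAN and sd SD (needs df > 2)."""
    z = rng.gauss(0, 1)
    v = sum(rng.gauss(0, 1) ** 2 for _ in range(df))  # chi-squared, df degrees
    t = z / math.sqrt(v / df)
    # A t-distribution has variance df / (df - 2), so rescale to unit variance.
    return MEAN + SD * t * math.sqrt((df - 2) / df)

def uniform_draw():
    """One Uniform draw with mean MEAN and sd SD."""
    half_width = SD * math.sqrt(3)  # a Uniform on [a, b] has sd (b - a) / sqrt(12)
    return rng.uniform(MEAN - half_width, MEAN + half_width)

def tail_fraction(data, n_sd=3):
    """Fraction of values more than n_sd standard deviations from MEAN."""
    return sum(abs(x - MEAN) > n_sd * SD for x in data) / len(data)

t_tail = tail_fraction([t_draw(5) for _ in range(N)])  # df=5 is an assumption
uniform_tail = tail_fraction([uniform_draw() for _ in range(N)])
normal_tail = 2 * (1 - NormalDist(MEAN, SD).cdf(MEAN + 3 * SD))  # exact value

print(f"t-distribution: {t_tail:.4f}")
print(f"Normal:         {normal_tail:.4f}")
print(f"Uniform:        {uniform_tail:.4f}")
```

The t-distribution should show more extreme values than the Normal, while the Uniform distribution cannot produce any, because its whole range lies within about 1.73 standard deviations of the mean.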

Have a go ...

The section at the start called 'What is a distribution?' gives four examples of distributions (dice, cats, heights and temperatures).

For each of these four examples:

  1. Specify whether the example is of an empirical or a theoretical distribution
  2. Describe the shape of the distribution using the terms discussed in this lesson (symmetric, right-skewed, ...)

Key Points

  • The distribution of a variable describes the relative frequency of outcomes
  • Data in a sample has an empirical distribution
  • Different samples will generally have different empirical distributions
  • A theoretical distribution has a precise mathematical description
  • A theoretical distribution has parameters that modify the shape of the distribution
  • Theoretical distributions can be used to 'mimic' empirical distributions
  • Empirical and theoretical distributions can be described by their shape