Sample and Population

Online lesson from biological data analysis modules

What will you learn?

In this lesson you will learn to:

  1. Describe the data analysis meaning of "a population"
  2. Describe the data analysis meaning of "a sample"
  3. Explain how a sample can be used to infer properties of a population

What is a population?

(Video 36 sec)

Example (1)

A population can be something tangible.

For example:

  • Question: what is the average height of adults in the USA?
  • Population: all adults currently in the USA

Note: we must clearly define what we mean by an adult (e.g. all individuals older than 20 years)

Example (2)

A population can also be more abstract.

For example:

  • Question: what is the half-life of uranium 238?
  • Population: all atoms of uranium 238 that exist

In this example it is impossible to collect data from the entire population.

Example (3)

Collecting data from the entire population may destroy the population.

For example:

  • Question: what is the biomass productivity during May of an undisturbed natural grassland from a 1 m² experimental plot?
  • Population: the biomass from a 1 m² experimental plot for May

In this example, as soon as we collect the data the grassland will no longer be undisturbed.

Have a go...

Describe one possible population that would be appropriate for the following questions. There may be several possibilities for each question.

  • What is the average yield of wheat (per hectare) grown using a new brand of fertiliser (brand X)?
  • What is the yearly survival probability of an adult Eurasian blackbird (Turdus merula)?
  • What is the present concentration of CO₂ in the Earth's atmosphere?
  • What is the average top running speed of a cheetah (Acinonyx jubatus)?
  • By how much does the lac operon in Escherichia coli become upregulated in the presence of lactose?

What is a sample?

(Video 1 min 14 sec)

Example (1)

A sample is always something tangible because it represents data that has been collected from the population.

For example:

Example (2)

Another example of a sample:

  • Question: What is the concentration of red blood cells in patient X (we'll call the patient Bob)?
  • Sample: A sample of blood collected from Bob at one point in time.

For this question the population is Bob's entire blood volume. For ethical and logistical reasons the sample is smaller than the population.
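
The relationship between a sample and its population can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "population" below is a simulated stand-in for many tiny portions of Bob's blood, the cell counts are invented, and the names `population` and `sample` are hypothetical.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: red blood cell counts (millions of cells per
# microlitre) in 100,000 tiny portions of blood. All values are invented.
population = [random.gauss(5.0, 0.3) for _ in range(100_000)]

# A sample: 50 portions chosen at random, without replacement.
sample = random.sample(population, 50)

population_mean = statistics.mean(population)  # normally unknown to us
sample_mean = statistics.mean(sample)          # our estimate of it

print(f"population mean (normally unknown): {population_mean:.2f}")
print(f"sample mean (our estimate):         {sample_mean:.2f}")
```

In practice only the sample is available, so the sample mean is the value we would report. The simulated population is included here purely so the two numbers can be compared.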

Have a go...

Suggest samples that could be appropriate to provide an approximate answer to the following questions.

  • What is the average yield of wheat (per hectare) grown using a new brand of fertiliser (brand X)?
  • What is the yearly survival probability of an adult Eurasian blackbird (Turdus merula)?
  • What is the present concentration of CO₂ in the Earth's atmosphere?
  • What is the average top running speed of a cheetah (Acinonyx jubatus)?
  • By how much does the expression of the LacZ gene in Escherichia coli (a component of the lac operon) increase in the presence of lactose?

Population Inference

(Video 1 min 24 sec)

Population inference (2)

  • Using a sample to estimate properties of a population always introduces some uncertainty into the estimate.
  • It is important to quantify the uncertainty in an estimate, as well as the estimate itself.
  • In general, as the size of a sample increases, the uncertainty in an estimate decreases.
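
The last point can be illustrated with a small simulation. The sketch below is not part of the lesson's own material. It assumes a made-up population of adult heights, and it uses the standard error of the mean (sample standard deviation divided by the square root of the sample size) as one common way to quantify uncertainty. The helper `standard_error` and all numbers are hypothetical.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of adult heights in cm (invented values).
population = [random.gauss(170.0, 10.0) for _ in range(100_000)]


def standard_error(sample):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return statistics.stdev(sample) / len(sample) ** 0.5


# Draw samples of increasing size and record (estimate, uncertainty).
results = {}
for n in (10, 100, 1000, 10_000):
    sample = random.sample(population, n)
    results[n] = (statistics.mean(sample), standard_error(sample))
    estimate, uncertainty = results[n]
    print(f"n = {n:>6}: estimate = {estimate:.1f} cm, "
          f"standard error = {uncertainty:.2f} cm")
```

Each tenfold increase in sample size should shrink the standard error by roughly a factor of three (the square root of ten), so large gains in precision need much larger samples.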

Key Points

  • A population relates to the question being asked
  • A sample is the data collected from the population
  • We rarely have data for an entire population
  • A sample gives partial information about the population
  • Estimating properties of a population from a sample is called population inference