Information collected via birth and death registers, national censuses and other sources – from insurance companies to online social networks – means that there is a wealth of data available for studying human populations that isn’t available for other species. Studying human populations might therefore be a case of locating data that have already been collected and analysing them in the correct way. In this context, the information used is secondary data, because it was not collected for the purposes of that particular study.

For example, scientists wanting to study driving behaviour across Germany used data from a transport survey, combined them with satellite data and plotted them on a map. They showed that people who lived further away from built-up urban areas were more likely to own a car and to drive further each week.

However, when data are collected for other reasons, it is difficult to firmly link cause and effect. In a study like the driving example, you would have to account for factors such as wealth and family size, rather than just assuming that differences were due to distance from local amenities. These are sometimes called confounding factors.

In a medical trial it is important to eliminate such confounding factors, which may influence both the risk of contracting the disease being studied and its outcome.

Imagine researchers want to study the safety of a commonly prescribed blood pressure drug. They could look at death rates among people taking the drug as well as the death rates among all people taking blood pressure drugs over the last five years. But a new study may be preferable, so that the experimental design could control for confounding factors, such as other medical conditions and age.
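
To see why age matters in a comparison like this, here is a minimal Python sketch. The counts are invented purely for illustration and do not come from any real study; the names `drug_x`, `all_bp_drugs`, `crude_rate` and `rates_by_stratum` are hypothetical. It compares the overall (crude) death rate in each group with the death rate within each age band.

```python
# Hypothetical counts, invented for illustration only.
# Each entry is (number of people, number of deaths) in an age band.
drug_x = {"under_65": (200, 2), "65_plus": (800, 40)}
all_bp_drugs = {"under_65": (1000, 10), "65_plus": (1000, 50)}


def crude_rate(groups):
    """Overall death rate, ignoring age."""
    people = sum(n for n, _ in groups.values())
    deaths = sum(d for _, d in groups.values())
    return deaths / people


def rates_by_stratum(groups):
    """Death rate within each age band."""
    return {band: d / n for band, (n, d) in groups.items()}


print("Crude rates:", crude_rate(drug_x), crude_rate(all_bp_drugs))
print("Drug X by age band:", rates_by_stratum(drug_x))
print("All drugs by age band:", rates_by_stratum(all_bp_drugs))
```

With these made-up numbers the crude death rate looks higher for the drug (0.042 against 0.03), yet within each age band the two rates are identical. The apparent difference arises only because the people taking the drug are older on average, which is exactly the kind of confounding a purpose-designed study would try to control for.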

