How do researchers tease apart the interactions of lifestyle and genes in the development of common diseases?
The causes of some diseases are straightforward. A relatively simple DNA test can immediately predict whether someone will develop cystic fibrosis, for example. But the most common chronic diseases – cancer, diabetes, heart disease, stroke and dementia – are far more complex. Genes do influence their risks, but so too do lifestyle factors such as diet and smoking habits, and the environments in which we live and work.
The problem for researchers studying such diseases is that many different factors may be involved, and any one factor may have only a small effect. To untangle this complexity, large numbers of people with each disease need to be studied in great detail – and this is where biobanks come in. By collecting DNA, medical and lifestyle data from hundreds of thousands of people and following their health long-term, researchers can work out why some people develop a particular disease and others do not.
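To give a rough sense of why small effects demand large studies, the sketch below applies a standard textbook approximation for comparing disease rates in two equal-sized groups. The baseline risk, the relative risks, the significance level and the power are all hypothetical values chosen for illustration, not figures from any biobank, and the function name is invented for this example.

```python
from math import ceil
from statistics import NormalDist


def people_needed_per_group(baseline_risk, relative_risk, alpha=0.05, power=0.8):
    """Approximate group size needed to detect a difference in disease risk.

    Uses the usual normal approximation for comparing two proportions.
    All inputs here are illustrative assumptions, not study data.
    """
    exposed_risk = baseline_risk * relative_risk
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = (baseline_risk * (1 - baseline_risk)
                + exposed_risk * (1 - exposed_risk))
    difference = exposed_risk - baseline_risk
    return ceil((z_alpha + z_beta) ** 2 * variance / difference ** 2)


# Hypothetical: 5 per cent of unexposed people develop the disease.
large_effect = people_needed_per_group(0.05, relative_risk=2.0)
small_effect = people_needed_per_group(0.05, relative_risk=1.1)
print(f"Risk doubled:       about {large_effect:,} people per group")
print(f"Risk raised by 10%: about {small_effect:,} people per group")
```

Under these assumptions, a factor that doubles risk can be detected with a few hundred people per group, whereas one that raises risk by a tenth needs tens of thousands, before allowing for several factors acting at once.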
Although the term ‘biobank’ is often applied to any database of health-related information and clinical samples (typically blood and urine) from individuals, the projects can be set up in markedly different ways. ‘Prospective’ biobanks assess and take samples from participants at the start of the study, and then follow their health over subsequent years, even decades. ‘Retrospective’ studies collect information and samples from people who have already developed a particular disease. Others are family-based genetic studies, which aim to track down genes associated with diseases or other traits.
UK Biobank, a long-term prospective project, was given the full go-ahead in August 2006. Funded by the Wellcome Trust, the Medical Research Council, the Department of Health, the Scottish Executive and the Northwest Regional Development Agency, and hosted by the University of Manchester (with scientific input from more than 20 other British universities), this £61 million project aims to provide the richest source of health-related data and samples for researchers from around the world.
Following a successful test run in March 2006, with 3,800 people taking part from the Altrincham area near Manchester, UK Biobank is now expanding its efforts across the country. Over the next three to four years, it will recruit 500,000 adults aged 40 to 69, nearly 1 per cent of the UK population. (Update: By 2011 this goal had been met, with more than 503,000 people recruited.)
With their fully informed consent, the participants will complete a detailed lifestyle questionnaire, be interviewed about their medical history, have several standard physical measurements taken (such as blood pressure, body size and lung function), and donate blood and urine samples. The samples – about 15 million in all – will be stored for decades at ultra-low temperatures, with a purpose-designed robotic system in Cheadle handling samples from up to 1,000 participants every day.
Information about participants’ health will then be obtained, with their permission, from medical and other health-related records. As follow-up continues, medical researchers will be able to compare lifestyle, genes and other factors in participants who develop a particular disease with those in participants who do not. For common conditions, such as heart disease and diabetes, this will be possible within five to ten years of starting the project, whereas for less common diseases it is likely to take much longer to accumulate enough cases for reliable analysis.
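The arithmetic behind that waiting time can be sketched very simply. In the example below, the cohort size is the UK Biobank recruitment target, but the incidence rates and the number of cases treated as "enough" are hypothetical round numbers chosen for illustration. The calculation also ignores deaths, withdrawals and the rise in disease rates as participants age.

```python
def years_to_reach_cases(cohort_size, new_cases_per_100k_per_year, cases_needed):
    """Rough years of follow-up before `cases_needed` new cases accumulate.

    A deliberately crude estimate: assumes a constant incidence rate and
    a cohort that does not shrink over time.
    """
    cases_per_year = cohort_size * new_cases_per_100k_per_year / 100_000
    return cases_needed / cases_per_year


COHORT_SIZE = 500_000   # UK Biobank recruitment target
CASES_NEEDED = 5_000    # hypothetical threshold for a reliable analysis

# Hypothetical incidence rates, per 100,000 people per year.
common_disease = years_to_reach_cases(COHORT_SIZE, 200, CASES_NEEDED)
rarer_disease = years_to_reach_cases(COHORT_SIZE, 20, CASES_NEEDED)

print(f"Common disease: about {common_disease:.0f} years")
print(f"Rarer disease:  about {rarer_disease:.0f} years")
```

With these assumed rates, a tenfold drop in incidence means a tenfold longer wait, which is why a cohort of this size can answer questions about common conditions long before it can do so for rarer ones.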
By measuring many different exposures (not just genes) in large numbers of people, this prospective study will be able to assess the impact of a wide range of factors, alone or in combination, on many different conditions.
As with any project that is built upon the trust and confidence of those who take part, consent, data security and privacy are crucial. Participants will be given detailed information about UK Biobank’s aims and what is required to be involved. In particular, they will be asked to give broad consent for their records and samples to be used for any medical or other health-related research. The data will be stored securely on computer so that all information about participants is well protected. Researchers who use the resource will need to be approved by UK Biobank but can apply from anywhere in the world and from academia or industry.
To protect participants further, the data and samples used by researchers will not include personal identifiers, so that genetic, lifestyle and other factors cannot be traced back to any individual. An independent Ethics and Governance Council has been set up to help monitor, and advise on, the way in which the project is conducted.
There are several other major prospective biobanks worldwide. The largest to date is the European Prospective Investigation into Cancer and Nutrition (EPIC), which was set up to look specifically at the relationship between cancer, genetics and nutrition. Since 1992, the study has recruited 520,000 people in ten European countries (Denmark, France, Germany, Greece, Italy, the Netherlands, Norway, Spain, Sweden and the UK) and has found, for example, that a diet high in fibre reduces the risk of colorectal cancer, as does eating fish, whereas red and processed meat increase the risk.
On a similar scale, the Chinese Kadoorie Study of Chronic Disease is investigating the roles of genetic and environmental factors, such as tobacco, infections and diet, in premature death and disability. Half a million adults aged 35 and over – 50,000 from each of ten rural and urban areas throughout China – will be taking part in the study, and more than 300,000 have already been recruited.
The Mexico City Prospective Study, which began in 1999, has recruited 160,000 men and women aged over 40 from the city’s Coyoacán and neighbouring districts and is looking at the main avoidable causes of chronic diseases. It has collected medical and lifestyle data such as smoking habits, alcohol consumption and diet, as well as blood pressure and blood samples, and is repeating these assessments in subsamples of the group every five years.
In Estonia, a biobank project has been running since 2001 as part of the Estonian Genome Project Foundation. The aim is to create a database of health, genealogy and genome data from a large part of the Estonian population; at present, the biobank contains information from over 10,000 contributors. The project did suffer financial problems when venture capital funding for the scheme ran out in 2005, but the Estonian government subsequently injected €8m (around £5.5m) into the project over four years, enough to raise the number of participants to 100,000.
With other prospective biobanks being discussed or established in several countries – including the USA, Mexico, Singapore, Canada, Norway and Sweden – it may be possible in the future to combine the data from many of these studies into a massive epidemiological meta-database. The more people who can be studied, and the more data that can be analysed, the more robust the statistics; however, as discussed at the ‘From Biobanks to Biomarkers’ conference held in September 2005 (see reference below), there are many issues that need to be addressed before such a plan comes to fruition.
The type of data being collected can vary markedly between studies, as can the type of consent gained from volunteers (which may restrict the routine sharing of data unless it is anonymised). To help address such problems, the Public Population Project in Genomics (P3G), a not-for-profit international group, is working to standardise methodologies and improve coordination across biobanks.
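If comparable estimates from several biobanks could be combined, one common approach is an inverse-variance (fixed-effect) pooled estimate, in which more precise studies carry more weight. The sketch below illustrates that general technique only. It is not a method described by P3G or any of the studies above, and the three estimates and their standard errors are invented for the example.

```python
from math import sqrt


def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance weighted average of study estimates.

    Returns the pooled estimate and its standard error. Assumes the
    studies measure the same underlying effect in comparable ways.
    """
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    return pooled, pooled_se


# Hypothetical log relative risks from three studies of the same exposure.
log_relative_risks = [0.10, 0.14, 0.08]
standard_errors = [0.05, 0.06, 0.08]

pooled, pooled_se = pool_fixed_effect(log_relative_risks, standard_errors)
print(f"Pooled estimate: {pooled:.3f} (standard error {pooled_se:.3f})")
```

The pooled standard error is smaller than that of any single study, which is the statistical gain from combining data. It holds only if the studies have measured the exposure and the outcome in compatible ways, hence the emphasis on standardisation.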
Large-scale prospective biobank projects
Estonian Genome Project
Country: Estonia
Target number of participants: 100,000
Kadoorie Study of Chronic Disease
Country: China
Target number of participants: 500,000
Mexico City Prospective Study
Country: Mexico
Target number of participants: 160,000
European Prospective Investigation into Cancer and Nutrition
Countries: Denmark, France, Germany, Greece, Italy, the Netherlands, Norway, Spain, Sweden and the UK
Target number of participants: 520,000
UK Biobank
Country: the UK
Target number of participants: 500,000
A catalogue of large population-based studies around the world can be found at the P3G Observatory. The Mexico City Prospective Study, EPIC and UK Biobank are part-funded by the Wellcome Trust.
A version of this article first appeared in ‘Wellcome Science’ (February 2007).
Lead image: Wellcome Library, London