The draft human genome held many surprises for researchers
Printing all 3 billion letters of the human DNA sequence would fill more than 6,000 tomes as hefty as the entire Lord of the Rings trilogy. When the ‘book of life’ was fully decoded, it turned out to be full of surprises.
Most scientists had assumed that humans had around 100,000 genes. Wrong: the true number is nearly 21,000 coding genes. So does this mean that we’re much simpler genetically than – for example – rice, which has nearly 36,000 coding genes? Not necessarily – the human genome harbours some organisational secrets.
Human cells make many times more proteins than they have protein-coding genes. This is achieved through ‘alternative splicing’: the messenger RNA copied from a gene is cut into pieces, and those pieces are joined together in different combinations. The result is a range of proteins with different structures, all made from the same gene.
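To see how one gene can give rise to several products, here is a rough sketch in Python. It is a toy model, not a description of how real cells choose splice forms: the exon names (`E1` to `E5`), the choice of which exons are optional, and the function `splice_isoforms` are all invented for illustration. It assumes the simplest case, where optional pieces are either kept or skipped and the order of the pieces never changes.

```python
from itertools import combinations


def splice_isoforms(exons, optional):
    """List every toy isoform made by skipping any subset of the optional exons.

    Exon order is always preserved; exons not listed as optional are always kept.
    """
    optional_idx = [i for i, name in enumerate(exons) if name in optional]
    isoforms = []
    # Skip 0 optional exons, then 1, then 2, and so on.
    for k in range(len(optional_idx) + 1):
        for skipped in combinations(optional_idx, k):
            kept = [name for i, name in enumerate(exons) if i not in skipped]
            isoforms.append("-".join(kept))
    return isoforms


# Hypothetical gene with five exons, three of which may be skipped.
exons = ["E1", "E2", "E3", "E4", "E5"]
optional = {"E2", "E3", "E4"}

isoforms = splice_isoforms(exons, optional)
for isoform in isoforms:
    print(isoform)
print(len(isoforms), "isoforms from one gene")
```

In this toy case, three optional exons are enough to give eight different messages (2 × 2 × 2) from a single gene, which hints at how around 21,000 coding genes can yield a far larger number of proteins.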
This makes it harder to define what a gene is. It is no longer a piece of DNA that always does the same thing – ‘one gene, one enzyme’, as biologists used to say. Many genes are shape-shifters, making one product then another. What’s more, many genes can ‘multi-task’ – having different roles in different tissues or at different life stages.
The versatility of genomes is still being revealed. Findings indicate that some ‘jumping genes’, which can move around the genome, are especially active during brain development, for example – which might allow the brain to make new types of neuron (nerve cell). Nor is the human genome sequence quite complete. Certain regions of chromosomes, usually those with highly repetitive sequences, are hard to sequence accurately with current technologies.
Newer, so-called ‘second-generation’ sequencing techniques are cheaper per base of DNA than the first-generation techniques used to sequence organisms including human and mouse. Although they are more affordable, second-generation techniques can leave gaps in genome sequences.
An example given in one PLOS One paper is the Rhesus macaque genome, which has gaps in up to 20 per cent of its gene models. Gene models are proposed descriptions of the gene products and the basis of most research in this area. These gaps are significant because the macaque is an important animal for research. The paper’s authors note: “The scale of the unfinished genome problem will be compounded by new initiatives to sequence 10,000 vertebrate genomes, 5,000 arthropod genomes and 1,000 additional plant and animal genomes.”