
In the year 2000 the draft human genome sequence was announced by Tony Blair and Bill Clinton. It was said to be complete in 2003, in time for the 50th anniversary of the discovery of the structure of DNA. Well actually it wasn’t quite finished. Actually it’s still not finished. Besides the tweaking that still goes on here and there, there are still big gaps. And what’s in these gaps sometimes has a significant role in cancer.

The biggest gaps are centromeres. Centromeres and telomeres are made of what is known as repetitive DNA, and this is hard to sequence. There’s more detail below.

Cancer chromosomes often have centromere or telomere abnormalities. In fact these abnormalities can cause cancer or make it progress faster.

In cancer research there’s a big push to sequence the genomes of different types of cancer to try and understand the many different DNA changes that can cause cancer. Some researchers try to understand telomeres and centromeres and their role in cancer, but in cancer sequencing projects, and also in diagnostics, centromeres and telomeres are pretty much ignored. Although they’re difficult to sequence, the repetitive DNA does make them easy to study by some other techniques.

One of the goals of personalised medicine is to be able to read a person’s complete genome. For cancer this would include the abnormal cancer genome. But at the moment these gaps mean that we can’t describe the abnormal cancer chromosomes from end to end by sequencing them. The approach I use allows me to work out what’s in each chromosome and discover telomere and centromere abnormalities.


By looking at things that most people don’t worry about I’ve overturned a few assumptions and made some unexpected discoveries, particularly about centromeres in leukaemia.

A normal chromosome has one centromere. I found that chromosomes with two centromeres are more common in acute myeloid leukaemia (AML) and myelodysplastic syndromes (MDS) than was thought. I found that most of these chromosomes with two centromeres were probably made by two chromosomes joining together.

Telomeres are at the ends of chromosomes and stop them from sticking to each other, so when the telomeres are eroded, chromosomes can join together. They can be eroded by exposure to chemical toxins, cancer drugs and radiation. So it’s interesting that the leukaemias that are caused by these exposures have more of these two-centromered chromosomes than other leukaemias.

centromeres - dicentric

Fluorescent DNA probes can label up centromeres (blue) and genes (red). In this image there is also a chromosome 20 paint – the green regions are from chromosome 20.


Telomeres and centromeres are made of highly repetitive DNA and make up some of the gaps in the human genome sequence.

Sequencing a genome is like reading a story. But first we cut the book up into tiny fragments. We read them piece by piece, then try to join the pieces together to make the story, by matching the overlapping parts. Where this approach falls apart is that some sections are repetitive. Some pages are made up of a single word or phrase repeated over and over and over and over and over and over (I won’t repeat that hundreds of thousands of times, but you should get the picture). So if a lot of fragments just say the same thing “over and over and over”, it’s very hard to put them together meaningfully.
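To make the analogy concrete, here is a minimal sketch in Python. It is a toy illustration, not how real assembly software works: the two example sentences, the three-word fragment size and the function names are all assumptions made up for this example. It cuts a "story" into overlapping fragments and counts, for each fragment, how many other fragments could plausibly come next.

```python
def fragments(words, size=3):
    """Cut a list of words into overlapping windows of `size` words."""
    return [tuple(words[i:i + size]) for i in range(len(words) - size + 1)]


def possible_successors(frags):
    """For each fragment, count the other fragments whose start matches its end."""
    counts = []
    for i, a in enumerate(frags):
        tail = a[1:]  # everything except the first word
        counts.append(
            sum(1 for j, b in enumerate(frags) if j != i and b[:-1] == tail)
        )
    return counts


# Hypothetical example texts: one ordinary sentence, one repetitive "page".
unique = "the quick brown fox jumps over the lazy dog".split()
repeat = "over and over and over and over and".split()

unique_counts = possible_successors(fragments(unique))
repeat_counts = possible_successors(fragments(repeat))

print(unique_counts)  # at most one candidate follows each fragment
print(repeat_counts)  # several candidates follow every fragment
```

In the ordinary sentence each fragment has at most one possible successor, so the story can be rebuilt in only one way. In the repetitive text every fragment has several equally good successors, so there is no way to tell how many repeats there were or where any particular fragment belongs.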

The centromere guides the chromosome to the two daughter cells during cell division. A normal chromosome has one centromere. When a chromosome has two centromeres (we call this a dicentric chromosome), the chromosome can be pulled in opposite directions, breaking the chromosome and causing more chromosome disorganisation.

Telomeres cap the ends of normal healthy chromosomes. One of their functions is to stop the chromosomes from sticking to each other. So when the telomeres are lost or eroded the chromosomes can join together. That’s one way of creating a chromosome with two centromeres.

Telomere loss is a natural part of ageing. There are also many environmental and lifestyle factors that are thought to affect telomere length. Short telomeres are thought to be a cancer risk because dicentric chromosomes are more likely to arise.

Telomeres and centromeres are very important parts of a normal chromosome. You could say they hold the chromosome together. They have a lot of influence on whether chromosomes are normal and stay normal.


Murnane JP. 2012. Telomere dysfunction and chromosome instability. Mutat Res. 730(1-2):28-36. (Open access)

MacKinnon RN and Campbell LJ. 2011. The role of dicentric chromosome formation and secondary centromere deletion in the evolution of myeloid malignancy. Genetics Research International. Article ID 643628. (Open access)

MacKinnon RN, Duivenvoorden HM and Campbell LJ. 2011. Unbalanced translocation of 20q in AML and MDS often involves interstitial rather than terminal deletion of 20q. Cancer Genet. 204(3):153-61.


When it was initiated by the US Department of Energy in 1987, the Human Genome Project was an ambitious, some said impossible, endeavour. It was all about producing a representative readout of the human genome – that is, the whole set of human DNA.

At the time DNA was read (sequenced) manually – scientists read each letter of the code off an X-ray film. One by one genes were laboriously found and sequenced. Finding and characterising a gene in this way was a whole PhD project, if you got lucky and actually found the gene (finding a gene by family studies was harder than starting with a known protein).

Humans have some 3,000,000,000 letters (base pairs) in their genome, so a faster approach was needed to get the project finished in the planned 15 years.
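A rough back-of-the-envelope calculation shows the scale of the problem. The sketch below uses only the two figures above (3,000,000,000 base pairs and 15 years). It assumes, purely for simplicity, that every letter is read exactly once and that work continues 365 days a year. Real projects read each region many times over, so the true workload was higher.

```python
GENOME_BASE_PAIRS = 3_000_000_000  # letters in the human genome (from the text)
PROJECT_YEARS = 15                 # planned project length (from the text)
DAYS_PER_YEAR = 365                # simplifying assumption: no days off

# Average throughput needed if each letter is read exactly once (assumption).
per_year = GENOME_BASE_PAIRS / PROJECT_YEARS
per_day = per_year / DAYS_PER_YEAR

print(f"{per_year:,.0f} letters per year")
print(f"{per_day:,.0f} letters per day")
```

That works out to 200 million letters a year, or more than half a million a day, every day, for 15 years. Reading letters one at a time off X-ray film was never going to get there.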

The Human Genome Project encouraged the development of much faster automated sequencing. I attended the 1991 Cold Spring Harbor Genome Mapping and Sequencing Meeting and there were several examples of exciting new automated sequencing prototypes, which used a range of different approaches.


Abstract book from the 1991 Genome Mapping and Sequencing Cold Spring Harbor meeting.
Sample abstracts: “Capillary gel electrophoresis for DNA sequencing – comparison of three different approaches” (HP Swedlow et al); “Library of 256 hexamers, degenerate at two positions (5′-NNXXXX-3′), can create all possible 12-mer primers for applications in high-volume DNA-sequencing strategies” (D. Shoemaker et al)

Now we can take it for granted that we can look up a gene on the internet. Having a human genome sequence was going to make a big contribution to health care. It is already helping, and will play a bigger role as we learn more about what roles the various genes play. For example working out what genes are playing a role in cancer will become more routine.

We can already do a lot for some cancer patients by doing genetic tests on their cancer. There are many categories of leukaemia that have a very specific type of DNA abnormality, and knowing what gene is involved can help diagnose and treat the disease appropriately.

Chromosome abnormalities helped make some of the earliest cancer gene discoveries. That’s because the gene abnormalities that cause some cancers are caused by microscopically visible changes to the chromosomes, which pinpoint the cancer gene. The poster child for this is chronic myeloid leukaemia. Most cases of CML have a chromosome abnormality known as the Philadelphia translocation. In fact this was the first cancer chromosome abnormality to be discovered. Imatinib (Glivec/Gleevec/STI-571) was one of the first targeted cancer drugs. Designed to lock onto the molecule produced by the cancer gene, it targets the leukaemia cells containing the Philadelphia chromosome. It’s made a huge improvement to the outlook for CML patients.

But for most cancers we’re not so lucky – the cancer-causing genes are not usually so obvious or easy to identify. Most cancers have their own individual combination of genetic errors, and what’s more, the genome changes as the cancer grows more aggressive and spreads. Sequencing of whole cancer genomes could become standard practice in cancer treatment, as a way of understanding each cancer and selecting treatment that targets its specific genetic changes. First we will need to be able to read a complete genome quickly and cheaply. We’re not there yet. But we’re on the way. Compared to the 15 years planned for one representative genome, the progress so far is impressive.

Next time: The Human Genome Project was said to be complete in 2003, in time for the 50th anniversary of the discovery of the structure of DNA. Actually it’s still not finished. Most of the gaps are regions that are very relevant to cancer.


The first printout of the human genome to be presented as a series of books, displayed in the ‘Medicine Now’ room at the Wellcome Collection, London. The 3.4 billion units of DNA code are transcribed into more than a hundred volumes, each a thousand pages long, in type so small as to be barely legible. From Ross London et al en.wikipedia.