Researchers have successfully deciphered a whole human genetic instruction manual from beginning to end. The completion of the human genome has been declared a few times in the past, but those were only rough versions. “This time, we’re serious,” says Evan Eichler, a human geneticist and Howard Hughes Medical Institute investigator at the University of Washington in Seattle.
A worldwide team of researchers, including Eichler, used modern DNA sequencing technology to untangle repetitive stretches of DNA that had been left out of an earlier version of the genome, which is widely used as a reference for biomedical research.
Researchers report in Science that deciphering these difficult stretches adds about 200 million DNA bases to the instruction book, roughly 8% of the genome. That’s the equivalent of a whole chapter. And it’s a juicy one, offering the first-ever glimpse of some chromosomes’ short arms, long-lost genes, and crucial chromosome sections called centromeres, where the machinery responsible for dividing up DNA grips the chromosome.
“Some of the missing parts turn out to be the most intriguing,” says Rajiv McCoy, a human geneticist at Johns Hopkins University who was part of the Telomere-to-Telomere (T2T) Consortium, which assembled the entire genome. “It’s thrilling because we get to explore these places for the first time to see what we can uncover.” Telomeres are repetitive stretches of DNA found at the ends of chromosomes. Like aglets on shoelaces, they may help keep chromosomes from unraveling.
Data from the effort are already available for other researchers to explore. And some, like geneticist Ting Wang of Washington University School of Medicine in St. Louis, have already delved in. “Having a complete genome reference definitely improves biomedical studies.… It’s an extremely useful resource,” he says. “There’s no question that this is an important achievement.”
But, Wang says, “the human genome isn’t quite complete yet.”
To understand why and what this new volume of the human genetic encyclopedia tells us, here’s a closer look at the milestone.
Eichler emphasizes that this is the completion of a human genome, not of the human genome; no single, universal human genome exists. Large sections of any two people’s genomes will range from very similar to essentially identical, with “smaller portions that are dramatically different.” A reference genome helps researchers pinpoint where people differ, which can lead to the identification of genes that may be involved in disorders. Having a complete picture of the genome, with no gaps or hidden DNA, may help scientists better understand human health, disease, and evolution.
Unlike the previous human reference genome, the newly completed genome has no gaps. However, Wang points out that it still has limits. The old reference genome is a mash-up of more than 60 different people’s DNA. “Not a single person or cell on this planet has that genome.” That also applies to the new, complete genome. “It’s a quote-unquote bogus genome,” says Wang, who was not involved in the effort.
The new genome does not come from a person either. It is the complete genome of a hydatidiform mole, a type of tumor that forms when a sperm fertilizes an empty egg and the father’s chromosomes are duplicated. The researchers chose to decipher the whole genome from a cell line called CHM13, which was derived from one of these unusual tumors.
According to geneticist Karen Miga of the University of California, Santa Cruz, the decision was taken for technical reasons. People typically inherit one set of chromosomes from their mother and another set from their father. As a result, “we all have two genomes in every cell.”
If building a genome is like putting together a puzzle, “you effectively have two puzzles in the same box that seem quite similar to each other,” Miga says, borrowing an analogy from a colleague. Researchers would have to sort the two puzzles apart before piecing either one together. “Hydatidiform mole genomes do not face the same issue. It’s only one of the puzzles in the box.”
Because the sperm that created the hydatidiform mole carried an X chromosome, the researchers had to add the Y chromosome from another person.
Even putting together a single puzzle is a Herculean undertaking. But recent technologies for reading the order of DNA bases (represented by the letters A, T, C, and G) can produce continuous stretches more than 100,000 bases long. Just as fewer, larger pieces make children’s puzzles easier to solve, these “long reads” made assembling the genome easier, especially in repetitive areas where only a few bases may distinguish one copy from another. The longer reads also enabled researchers to fix several errors in the previous reference genome.
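For readers who like to tinker, the puzzle analogy can be tried out in a few lines of Python. This is only a toy sketch, not the consortium's assembly method: the 700-base "genome", the 200-base repeat, the read lengths and the helper name unique_fraction are all invented for illustration. It measures what share of reads of a given length could only have come from one place in the sequence.

```python
from collections import Counter
import random

def unique_fraction(genome, read_len):
    """Fraction of all possible reads of read_len that occur exactly once."""
    reads = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(reads)
    return sum(1 for r in reads if counts[r] == 1) / len(reads)

rng = random.Random(1)  # fixed seed so the toy genome is reproducible

def rand_dna(n):
    """Random string of n bases (hypothetical helper for this sketch)."""
    return "".join(rng.choice("ACGT") for _ in range(n))

# Two copies of a 200-base repeat that differ at a single base,
# separated and flanked by 100-base stretches of unique sequence.
repeat = rand_dna(200)
variant = repeat[:100] + ("A" if repeat[100] != "A" else "C") + repeat[101:]
genome = rand_dna(100) + repeat + rand_dna(100) + variant + rand_dna(100)

short_frac = unique_fraction(genome, 20)   # reads much shorter than the repeat
long_frac = unique_fraction(genome, 250)   # reads longer than the repeat

print(f"20-base reads placed unambiguously:  {short_frac:.2f}")
print(f"250-base reads placed unambiguously: {long_frac:.2f}")
```

In this toy setup, most short reads that fall inside the repeat look identical in both copies, while reads longer than the repeat always reach into unique sequence and so can be placed in only one spot.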
What did they find?
To begin, the newly deciphered DNA includes the short arms of chromosomes 13, 14, 15, 21, and 22. These “acrocentric chromosomes” do not look like neat X’s, as the rest of the chromosomes do. Instead, each has a pair of long arms and a pair of nubby short arms.
The short arms’ modest length belies their importance. These arms contain rDNA genes, which encode rRNAs, essential components of complex molecular machines known as ribosomes. Ribosomes read genetic instructions and construct all of the proteins required for cells and bodies to function. Every person’s genome has hundreds of copies of these rDNA regions, 315 on average, though some people have more and some have fewer. They’re important for making sure cells have protein-building factories at the ready.
“We had no idea what to expect in these areas,” Miga explains. “We discovered that every acrocentric chromosome and every rDNA on that acrocentric chromosome contained variations, modifications to the repeat unit that was exclusive to that chromosome.”
Using fluorescent tags, Eichler and colleagues showed that repetitive DNA adjacent to rDNA regions, and perhaps the rDNA itself, occasionally switches places to land on another chromosome, as the team reports in Science. “It’s like a game of musical chairs,” he says. Why and how this occurs remains a mystery.
The complete genome also contains 3,604 genes, including 140 that encode proteins, that weren’t present in the old, incomplete genome. Many of those genes are slightly different copies of previously known genes, including some that have been implicated in brain evolution and development, autism, immune responses, cancer and cardiovascular disease. Having a map of where all these genes lie may lead to a better understanding of what they do, and perhaps even of what makes humans human.
One of the most significant discoveries may be the structure of all human centromeres. Centromeres, the pinched sections of most chromosomes that give them their typical X shape, serve as assembly sites for kinetochores, the cellular machinery that divides up DNA during cell division. That is one of the most crucial jobs in a cell. When it goes wrong, the result can be birth defects, cancer, or death. Researchers had previously deciphered the centromeres of fruit flies and of human chromosomes 8, X, and Y, but this is the first time they have seen the rest of the human centromeres.
The structures are predominantly alpha satellites, which are head-to-tail repeats of around 171 base pairs of DNA. However, those repeats are nested within other repeats, resulting in complicated patterns that distinguish each chromosome’s individual centromere, as described by Miga and colleagues in Science. Knowing the structures will help researchers learn more about how chromosomes are divided and what can go wrong.
In addition, researchers now have a more complete map of epigenetic marks, which are chemical tags on DNA or associated proteins that can change how genes are regulated. Winston Timp, a biomedical engineer at Johns Hopkins University, and colleagues report in Science that one type of epigenetic mark, known as DNA methylation, is fairly abundant across centromeres except for one location on each chromosome, termed the centromeric dip region.
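To get a feel for what “head-to-tail repeats” means, here is a small Python sketch. It is an illustration only, not the analysis used in the Science papers: the repeat unit is random made-up sequence rather than real alpha satellite DNA, and the function name best_period, the 12 copies, the 20 scattered changes and the 300-base search limit are assumptions chosen for the example. The code slides the sequence against itself and reports the offset at which it lines up best.

```python
import random

def best_period(seq, min_period=2, max_period=300):
    """Return (shift, score) for the shift at which seq best matches itself.

    Multiples of the true repeat length also match well, so the search is
    capped (here at 300) below twice the repeat length expected in this toy.
    """
    best_shift, best_score = None, -1.0
    for shift in range(min_period, min(max_period, len(seq) // 2) + 1):
        compared = len(seq) - shift
        matches = sum(1 for a, b in zip(seq, seq[shift:]) if a == b)
        score = matches / compared
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score

rng = random.Random(0)  # fixed seed so the example is reproducible

# A made-up 171-base unit, copied head to tail 12 times.
unit = "".join(rng.choice("ACGT") for _ in range(171))
bases = list(unit * 12)

# Sprinkle in a few changes so the copies are similar but not identical.
for pos in rng.sample(range(len(bases)), 20):
    bases[pos] = rng.choice("ACGT")
seq = "".join(bases)

period, score = best_period(seq)
print(f"best-matching shift: {period} bases (match fraction {score:.2f})")
```

Real centromeres are far messier, since the repeats sit inside larger repeats, which is part of why they resisted sequencing for so long.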
The researchers observed that such dips are where kinetochores grasp the DNA. However, it is unclear whether the drop in methylation causes the cellular machinery to assemble in that location or whether the assembly of the machinery results in lower levels of methylation.
Examining DNA methylation patterns in multiple people’s DNA and comparing them with the new reference revealed that the dips occur at different spots in each person’s centromeres, though the consequences of that aren’t known.
Approximately half of the genes implicated in the evolution of humans’ big, wrinkled brains are found in multiple copies in the genome’s newly uncovered repetitive regions. According to Ariel Gershman, a geneticist at Johns Hopkins University School of Medicine, overlaying the epigenetic maps on the reference allowed researchers to determine which of the many copies of those genes were turned on and which were turned off.
“That offers us a little bit more insight into which of them are actually crucial and playing a functional role in the evolution of the human brain,” adds Gershman. “That was intriguing for us because there had never been a reference that was exact enough in these [repetitive] regions to distinguish which gene was which and which ones were turned on or off.”