Twenty years after scientists from the Human Genome Project produced the first draft human genome sequence, the Telomere to Telomere (T2T) Consortium has now unveiled a complete version. According to the T2T researchers, having a complete, gap-free sequence of the roughly three billion base pairs in our DNA is critical for understanding the full spectrum of human genomic variation and for understanding the genetic contributions to certain diseases. They have already discovered more than two million additional variants in the human genome and provided more accurate information about the genomic variants within 622 medically relevant genes.
Counting both inherited copies, a person's genome is made up of just over 6 billion individual letters of DNA (roughly 3 billion per copy, about the same number as in other primates such as chimps), spread among 23 pairs of chromosomes.
To read a genome, scientists first chop up all that DNA into pieces hundreds to thousands of letters long.
Sequencing machines then read the individual letters in each piece, and researchers try to assemble the pieces in the right order, like putting together an intricate puzzle.
One challenge is that some regions of the genome repeat the same letters over and over again. Repetitive regions include the centromeres, the parts that hold the two strands of chromosomes together and that play crucial roles in cell division, and ribosomal DNA, which provides instructions for the cell’s protein factories. Still other repetitive parts include new genes that may help species adapt.
In the past, all that repetition made it impossible to assemble some chopped-up pieces in the correct order. It’s like having identical puzzle pieces — scientists didn’t know which went where, leaving big gaps in the genomic picture.
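The identical-puzzle-piece problem can be shown with a toy example. The sketch below is an illustration only, not the consortium's assembly method: the sequences, the 12-letter repeat, and the helper names `read_counts` and `distinguishable` are all invented for the demonstration, and it assumes error-free reads starting at every position. Two toy genomes contain the same pieces, but the two segments between the repeat copies trade places.

```python
from collections import Counter

# Toy sequences, invented for illustration (not real genomic data).
REPEAT = "ATATTATAATTA"  # 12-letter repeat, present three times in each genome
START, SEG_X, SEG_Y, END = "GCCG", "CGGC", "GGCC", "CCGG"

# Same pieces in both genomes, but SEG_X and SEG_Y are swapped.
genome_1 = START + REPEAT + SEG_X + REPEAT + SEG_Y + REPEAT + END
genome_2 = START + REPEAT + SEG_Y + REPEAT + SEG_X + REPEAT + END


def read_counts(genome, read_len):
    """Count every error-free read of length read_len, one per start position."""
    last_start = len(genome) - read_len
    return Counter(genome[i:i + read_len] for i in range(last_start + 1))


def distinguishable(read_len):
    """True if the reads of this length differ between the two toy genomes."""
    return read_counts(genome_1, read_len) != read_counts(genome_2, read_len)


for read_len in (6, 13, 14, 20):
    print(read_len, distinguishable(read_len))
```

In this toy setup, reads of 13 letters or fewer never cover the whole repeat plus a letter on each side, so both genomes produce exactly the same set of reads and the order of the swapped segments cannot be recovered. Reads of 14 letters or more bridge the repeat, and the two genomes become distinguishable.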
Another snag: most cells contain two genomes — one from the father and one from the mother. When scientists try to assemble all the pieces, sequences from each parent can mix together, obscuring the actual variation within each individual genome.
“Generating a truly complete human genome sequence represents an incredible scientific achievement, providing the first comprehensive view of our DNA blueprint,” said Dr. Eric Green, director of the National Human Genome Research Institute.
“This foundational information will strengthen the many ongoing efforts to understand all the functional nuances of the human genome, which in turn will empower genetic studies of human disease.”
The now-complete human genome sequence will be particularly valuable for studies that aim to establish comprehensive views of human genomic variation, or how people’s DNA differs.
Such insights are vital for understanding the genetic contributions to certain diseases and for using genome sequence as a routine part of clinical care in the future.
The full sequencing builds upon the work of the Human Genome Project, which mapped about 92% of the genome, and on research undertaken since then.
Thousands of researchers have developed better laboratory tools, computational methods and strategic approaches to decipher the complex sequence.
The remaining 8%, which the Human Genome Project left unresolved, includes numerous genes and repetitive DNA and is comparable in size to an entire chromosome.
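As a rough check on that comparison, the percentages can be converted into base pairs. The sketch below uses the article's round figure of three billion base pairs. The chromosome 1 length of about 248 million base pairs is an approximate outside figure added for scale, not a number taken from the T2T papers, and the variable names are illustrative.

```python
GENOME_BP = 3_000_000_000     # "roughly three billion base pairs" (the article's round figure)
HGP_PERCENT = 92              # share mapped by the Human Genome Project, per the article
CHR1_BP_APPROX = 248_000_000  # assumption: approximate length of chromosome 1, for scale only

# Integer arithmetic keeps the base-pair counts exact.
mapped_bp = GENOME_BP * HGP_PERCENT // 100
remaining_bp = GENOME_BP - mapped_bp

print(f"mapped:    {mapped_bp:,} bp")
print(f"remaining: {remaining_bp:,} bp")
print(f"remaining / chromosome 1: {remaining_bp / CHR1_BP_APPROX:.2f}")
```

With these round numbers, the unresolved portion comes to about 240 million base pairs, close to the length of the largest human chromosome.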
The T2T researchers generated the complete genome sequence using a human cell line with only one copy of each chromosome, unlike most human cells, which carry two copies of each chromosome.
They noted that most of the newly added DNA sequences were near the repetitive telomeres (long, trailing ends of each chromosome) and centromeres (dense middle sections of each chromosome).
“Ever since we had the first draft human genome sequence, determining the exact sequence of complex genomic regions has been challenging,” said Dr. Evan Eichler, a researcher at the University of Washington and T2T consortium co-chair.
“I am thrilled that we got the job done. The complete blueprint is going to revolutionize the way we think about human genomic variation, disease and evolution.”
The cost of sequencing a human genome using short-read technologies, which provide several hundred bases of DNA sequence at a time, is only a few hundred dollars, having fallen significantly since the end of the Human Genome Project.
However, using these short-read methods alone still leaves some gaps in assembled genome sequences.
The massive drop in DNA sequencing costs has come hand in hand with increased investment in new sequencing technologies that generate longer reads without compromising accuracy.
Over the past decade, two new DNA sequencing technologies have emerged that produce much longer sequence reads.
The Oxford Nanopore DNA sequencing method can read up to 1 million DNA letters in a single read with modest accuracy, while the PacBio HiFi DNA sequencing method can read about 20,000 letters with nearly perfect accuracy.
The T2T team used both DNA sequencing methods to generate the complete human genome sequence.
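Why read length matters can be sketched with a simple rule of thumb: a single read can place a repeat only if it covers the whole repeat plus some unique sequence on each side. The read lengths below come from the figures quoted above, with 300 standing in for "several hundred". The repeat lengths, the 100-letter flank, and the function name `spans_repeat` are hypothetical values chosen for illustration, and accuracy differences between the technologies are ignored.

```python
# Read lengths taken from the article's figures; 300 is an assumed stand-in.
READ_LENGTHS = {
    "short read": 300,             # assumption: "several hundred bases"
    "PacBio HiFi": 20_000,         # "about 20,000 letters"
    "Oxford Nanopore": 1_000_000,  # "up to 1 million DNA letters"
}

# Hypothetical repeat lengths, chosen only to show the effect.
HYPOTHETICAL_REPEATS = (50, 5_000, 100_000)


def spans_repeat(read_len, repeat_len, flank=100):
    """True if one read can cover the repeat plus `flank` unique letters per side."""
    return read_len >= repeat_len + 2 * flank


for name, read_len in READ_LENGTHS.items():
    spanned = [r for r in HYPOTHETICAL_REPEATS if spans_repeat(read_len, r)]
    print(f"{name}: spans repeats of length {spanned}")
```

Under these assumptions, the short read bridges only the smallest repeat, the 20,000-letter read also bridges the 5,000-letter repeat, and only the longest read bridges all three.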
“Using long-read methods, we have made breakthroughs in our understanding of the most difficult, repeat-rich parts of the human genome,” said Dr. Karen Miga, a researcher at the University of California, Santa Cruz and T2T consortium co-chair.
“This complete human genome sequence has already provided new insight into genome biology, and I look forward to the next decade of discoveries about these newly revealed regions.”
“In the future, when someone has their genome sequenced, we will be able to identify all of the variants in their DNA and use that information to better guide their healthcare,” said Dr. Adam Phillippy, a researcher at the National Human Genome Research Institute and T2T consortium co-chair.
“Truly finishing the human genome sequence was like putting on a new pair of glasses. Now that we can clearly see everything, we are one step closer to understanding what it all means.”
The package of six papers reporting the accomplishment appears this week in the journal Science, along with companion papers in several other journals.
Sergey Nurk et al. 2022. The complete sequence of a human genome. Science 376 (6588): 44-53; doi: 10.1126/science.abj6987
Ariel Gershman et al. 2022. Epigenetic patterns in a complete human genome. Science 376 (6588); doi: 10.1126/science.abj5089
Mitchell R. Vollger et al. 2022. Segmental duplications and their variation in a complete human genome. Science 376 (6588); doi: 10.1126/science.abj6965
Savannah J. Hoyt et al. 2022. From telomere to telomere: The transcriptional and epigenetic state of human repeat elements. Science 376 (6588); doi: 10.1126/science.abk3112
Sergey Aganezov et al. 2022. A complete reference genome improves analysis of human genetic variation. Science 376 (6588); doi: 10.1126/science.abl3533
Nicolas Altemose et al. 2022. Complete genomic and epigenetic maps of human centromeres. Science 376 (6588); doi: 10.1126/science.abl4178
Source link: https://www.sci.news/genetics/complete-human-genome-10676.html