Nearly 20 years after the Human Genome Project was declared complete, scientists at the Telomere-to-Telomere consortium, or T2T, with help from UC Berkeley researchers, have finished sequencing the human genome.
UC Berkeley postdoctoral fellow Nicolas Altemose, who co-authored four new papers about the completed genome, said the human genome is the complete set of genetic information inside each human cell, encoded in a four-letter language. The genome itself contains roughly 3 billion letters.
Altemose added that while scientists had successfully read the genome's unique sequences by 2003, technological limitations left about 8% of it, made up of repetitive sequences, unfinished.
“These repetitive regions are akin to puzzle pieces that are all the same color—they lack distinguishing features that help us to identify their exact location,” Altemose said in an email.
For campus assistant professor of bioengineering Aaron Streets, the completion of the project marks the beginning of a genomics era in which scientists can understand the information stored in DNA. Streets said this could lead to answers about how mutations cause certain diseases, how cells properly duplicate and distribute their genomes and how human evolution occurred.
Sequencing the genome involves breaking it into smaller pieces, sequencing those pieces and then piecing them back together, according to Altemose. He said limitations in DNA sequencing technologies made the process difficult, especially in repetitive regions.
Technological advancements, paired with new computational methods, have made the process of sequencing large pieces of DNA more efficient, according to Altemose. He added that this has made completing the project possible.
“(We can now) sequence much larger fragments of DNA at a time, greatly simplifying the jigsaw puzzle by reducing the number of pieces that the puzzle is split into,” Altemose said in the email.
According to Streets, improved technology was not the only factor: collaboration among T2T scientists around the world was also essential to finishing the project.
Streets’ lab at UC Berkeley created a new technology called DiMeLo-seq, which maps protein-DNA interactions in repetitive regions, according to Altemose.
Moreover, the project’s completion revealed information about the centromere, the region of each chromosome that is grabbed and pulled apart during cell division, Altemose added.
“Moving forward, we hope to assemble these repetitive centromeric regions from many different individuals, to get a better grasp on the high genetic diversity that we know exists in these regions,” Altemose said in the email.
Additionally, Altemose noted that researchers aim to sequence genomes from genetically diverse individuals, with the goal of building a reference sequence that is representative of the human population.
Altemose said this milestone in the field of genetics will accelerate scientists' understanding of human health and disease.
“Now that we can see the genome more clearly, we can start to answer questions that we never could before,” Streets said. “What’s in there, you know?”