DNA fragments from humans, animals, plants, bacteria and viruses that lived many thousands of years ago are not very useful unless advanced algorithms compile the genetic information from the fragments into a format that archaeologists or geologists can understand. Gabriel Renaud, an Associate Professor at the Technical University of Denmark, develops these algorithms.
Within the past decade, the world has been treated to one great discovery after another about the lives and movements of our ancestors.
Only recently, researchers discovered that people lived in the Americas much earlier than previously thought; that the Vikings had a mishmash of genes from all over Europe; and that smallpox epidemics were already rampant in the Viking Age.
Although all these discoveries sound archaeological in nature, behind each is a strong foundation of advanced algorithms and solid computational power that can analyse millions of tiny fragments of genetic material and that enable the genetics of the past to be examined in greater detail.
One researcher who has helped to pioneer some of the algorithms on which archaeologists rely today is Gabriel Renaud, an Associate Professor at the Department of Health Technology of the Technical University of Denmark.
The Novo Nordisk Foundation recently awarded Gabriel Renaud a Data Science Investigator Grant of almost DKK 8 million to improve the algorithms behind the revolutionary discoveries even more over the next 5 years.
“Archaeologists have a lot of questions that cannot be answered by simply examining bones. They want to know who people were, how they lived, what the environment was like at the time and what diseases they had. My research aims to develop the computer tools that enable archaeologists to answer these questions,” explains Gabriel Renaud.
New algorithms answer age-old questions
One way to illustrate Gabriel Renaud’s research is to imagine that a very old bone has been found in a cave somewhere in southern Europe.
To learn more about the person the bone belonged to, researchers can extract genetic material from it and perform next-generation sequencing.
However, if the bone is 30,000 years old, the genetic material has degraded into millions of tiny fragments of DNA of varying lengths. Decoding the genetic information is therefore like reading a book that has been run through a shredder a few times. It cannot be decoded unless you reconstruct the book – which is much easier said than done.
Perhaps the shredded fragments will contain whole sentences or merely words, but that does not tell you how the words or sentences fit together. Do they belong on page 3 or on page 256?
This is where Gabriel Renaud and other computer scientists are stepping in to develop advanced algorithms that can compile the many millions of fragments of genetic information into meaningful information.
In the book analogy, computer scientists compile the fragmented book pages from several hundred books into a single or perhaps a handful of copies that archaeologists can browse through to advance their knowledge about the past.
Computers handle everything, and researchers either reconstruct the entire genome from scratch or recreate it by relying on a reference genome, such as a human genome, if the bone comes from a human.
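The reference-based route can be illustrated with a toy sketch. It is not Gabriel Renaud's algorithm, and real read mappers use indexed, probabilistic methods on millions of fragments. The function names, the short sequences and the `max_mismatches` threshold are all invented for illustration. Each fragment is placed where it differs least from the reference, and the overlapping fragments then vote on each base.

```python
from collections import Counter


def place_fragment(fragment, reference, max_mismatches=1):
    """Return the start position where the fragment fits the reference best.

    Returns None if every position needs more than max_mismatches mismatches.
    """
    best_pos, best_mm = None, max_mismatches + 1
    for start in range(len(reference) - len(fragment) + 1):
        window = reference[start:start + len(fragment)]
        mm = sum(1 for a, b in zip(fragment, window) if a != b)
        if mm < best_mm:  # strictly better, so the leftmost best position wins
            best_pos, best_mm = start, mm
    return best_pos


def consensus(fragments, reference, max_mismatches=1):
    """Majority-vote base at each reference position; 'N' where nothing maps."""
    votes = [Counter() for _ in reference]
    for fragment in fragments:
        pos = place_fragment(fragment, reference, max_mismatches)
        if pos is None:
            continue  # fragment does not resemble the reference anywhere
        for offset, base in enumerate(fragment):
            votes[pos + offset][base] += 1
    return "".join(v.most_common(1)[0][0] if v else "N" for v in votes)


# Made-up example: the sample differs from the reference at one position.
reference = "ACGTTGCAAGGCTA"
fragments = ["ACGTT", "TTGCG", "GCGAGG", "AGGCTA"]
print(consensus(fragments, reference))  # prints ACGTTGCGAGGCTA
```

The reconstructed sequence follows the fragments rather than the reference at the one position where the two disagree, which is the point of the exercise: the reference only tells the computer where each scrap of the shredded book belongs.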
Once Gabriel Renaud’s algorithm has processed the genome, researchers can, for example, determine whether the bone belonged to a man or a woman, or work out the person’s genetic ancestry.
This is how researchers discovered that Vikings had a mishmash of genes from all over Europe: the reconstructed genomes showed it.
“Not more than a decade ago, researchers used these methods on an ad hoc basis and not very effectively. But there is considerable potential in developing the algorithms and making them more robust and accurate. This will enable researchers to find even more answers based on the DNA that is extracted from a wide variety of sources,” says Gabriel Renaud.
Algorithms need to manage many scenarios
Gabriel Renaud is working on developing algorithms that consider many situations in which current algorithms fall completely or partly short.
For example, algorithms can be improved to filter out contamination from samples.
If a bone has been lying in a cave for 30,000 years, it may be contaminated with DNA from other organisms, and this must be filtered out through computational means. Gabriel Renaud has developed an algorithm that can do this.
Gabriel Renaud has also developed an algorithm that can identify whether, for example, an individual living 30,000 years ago was a product of an excessively close relationship between family members.
“For example, if cousins had a child together, the child’s chromosomes may not vary, since they both come from the same grandparents. This is an evolutionary dead end because it dramatically increases the risk of genetic disorders and defects. An algorithm can identify this trait, and this enables archaeologists to be more knowledgeable about how people might have lived 30,000 years ago. What was normal and abnormal?” explains Gabriel Renaud.
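The idea in this quote can be sketched in a few lines, assuming a simplified input in which each site along a chromosome is a pair of alleles, one from each parent. This is an illustrative sketch, not the algorithm Gabriel Renaud has developed. The function names and the `min_run` threshold are hypothetical, and real ancient DNA would also require handling missing data and sequencing errors. Long unbroken stretches where both copies are identical hint that the two chromosomes trace back to the same recent ancestor.

```python
def longest_homozygous_run(genotypes):
    """Length of the longest unbroken stretch of sites with identical alleles."""
    longest = current = 0
    for allele_a, allele_b in genotypes:
        current = current + 1 if allele_a == allele_b else 0
        longest = max(longest, current)
    return longest


def fraction_in_long_runs(genotypes, min_run=5):
    """Fraction of sites lying inside homozygous runs of at least min_run sites."""
    if not genotypes:
        return 0.0
    covered = run = 0
    # A sentinel heterozygous site at the end closes the final run.
    for allele_a, allele_b in list(genotypes) + [("x", "y")]:
        if allele_a == allele_b:
            run += 1
        else:
            if run >= min_run:
                covered += run
            run = 0
    return covered / len(genotypes)


# Made-up genotypes along one chromosome.
sites = [("A", "A"), ("C", "T"), ("G", "G"), ("G", "G"), ("T", "T"),
         ("A", "A"), ("C", "C"), ("A", "G"), ("T", "T"), ("T", "T")]
print(longest_homozygous_run(sites))   # prints 5
print(fraction_in_long_runs(sites))    # prints 0.5
```

A high fraction of the genome in long runs would be the kind of signal that points to closely related parents; where to draw the line is a modelling decision this sketch does not attempt.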
Examining DNA fragments in a soil sample
Archaeologists, geologists and molecular biologists are now testing the algorithms developed by Gabriel Renaud.
In the past, researchers extracted ancient DNA primarily from bones and teeth, but they have increasingly begun to seek other ways to obtain more information. For example, they have begun sequencing all genetic material from soil samples taken from ancient cave floors.
So when they ask Gabriel Renaud to identify what lived in the cave, he receives data on DNA fragments from not just one species but from many species that lived in the cave, including humans, animals, plants, bacteria and viruses.
This is like stuffing a blender with wheat, strawberries, almonds, a finger from a distant uncle, half a wild boar, two field mice, 12 bacteria species, six viruses and a beech tree, then pressing the start button and letting it run for 30,000 years.
“My task is to discover what lived in the cave at a given time, which is not simple. Nor does the task become easier because DNA is damaged over time, and the algorithms must also consider this. All these aspects need to be coded into the algorithms for them to provide the answers that archaeologists or geologists want,” says Gabriel Renaud.
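One well-known form of this damage is that cytosine in old DNA tends to be read as thymine, which shows up as guanine read as adenine on the opposite strand. A minimal sketch of how a comparison could take this into account follows. It is an assumption-laden illustration, not the method described in the article: the names are hypothetical, and real tools weight such substitutions by their position in the fragment rather than simply ignoring them.

```python
# (reference base, read base) pairs that chemical damage can explain.
DAMAGE_COMPATIBLE = {("C", "T"), ("G", "A")}


def count_mismatches(read, ref_window, tolerate_damage=True):
    """Count differences between a read and the reference window it is compared to.

    With tolerate_damage=True, substitutions that damage could have caused
    are not counted as evidence against the match.
    """
    if len(read) != len(ref_window):
        raise ValueError("read and reference window must have the same length")
    count = 0
    for ref_base, read_base in zip(ref_window, read):
        if ref_base == read_base:
            continue
        if tolerate_damage and (ref_base, read_base) in DAMAGE_COMPATIBLE:
            continue
        count += 1
    return count


# Made-up example: one difference is damage-like (C read as T), one is not (C read as A).
print(count_mismatches("TTGAA", "CTGCA"))                         # prints 1
print(count_mismatches("TTGAA", "CTGCA", tolerate_damage=False))  # prints 2
```

Without this allowance, a genuinely ancient fragment could be rejected for looking too different from its own species' genome.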
Predicting how climate change will affect animals
Gabriel Renaud’s constantly evolving algorithms enable answers to some very specific questions that are not only of retrospective interest but are also interesting for the future.
For example, the algorithms can identify the composition of animals and plants in a given place in Europe 100,000, 75,000, 50,000 and 25,000 years ago.
During this period, the climate alternated between being hot and cold, so the researchers can compare climate data from, for example, ice cores from the Greenland ice sheet with the genetic discoveries in the caves and compile a picture of how animals and plants reacted and responded to climate change.
Some species may do well, and these are present in DNA fragments in all the samples processed by Gabriel Renaud’s algorithms, whereas others are only found in samples that correspond to cold or hot periods.
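The comparison described above amounts to a presence table across dated samples. The sketch below assumes each sample has already been reduced to a climate label and a set of detected species. The species lists are invented purely for illustration and are not results from the research, and the function name and category labels are hypothetical.

```python
def classify_species(samples):
    """Classify each species by the climate periods in which it was detected.

    samples: list of (climate_label, set_of_species) tuples, one per dated sample.
    """
    climates_seen = {}
    sample_counts = {}
    for climate, species in samples:
        for name in species:
            climates_seen.setdefault(name, set()).add(climate)
            sample_counts[name] = sample_counts.get(name, 0) + 1

    result = {}
    for name, climates in climates_seen.items():
        if sample_counts[name] == len(samples):
            result[name] = "present throughout"
        elif len(climates) == 1:
            result[name] = f"{next(iter(climates))} periods only"
        else:
            result[name] = "intermittent"
    return result


# Invented detections for four dated samples.
samples = [
    ("cold", {"wolf", "mammoth", "hare"}),
    ("hot", {"wolf", "beech", "hare"}),
    ("cold", {"wolf", "mammoth"}),
    ("hot", {"wolf", "beech"}),
]
for name, label in sorted(classify_species(samples).items()):
    print(name, "->", label)
```

In this made-up data, the wolf appears in every sample, the mammoth only in cold periods and the beech only in hot ones, which is the kind of pattern the researchers would then set against the ice-core climate record.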
“We can monitor trends in the composition of fauna and flora in real time during another epoch in history. Compared with current climate trends, this can help us to determine which animals may be most severely affected by climate change and thereby on which animals we will need to focus most intensely in conservation terms,” explains Gabriel Renaud.
Seeking causes for the extinction of large mammals
A completely different perspective is that the algorithms can also help to answer some of the great unanswered questions related to human influence on the rest of the planet.
For example, researchers are not totally certain whether the driving factor in the extinction of many of the great mammals from Europe and Asia was humans or climate. What happened to the mammoths and the woolly rhinos? And what about the sabre-toothed tigers?
Their disappearance coincides with humans arriving in Europe from Africa, but is this the cause?
“We need more data to answer this question, and we can obtain this by analysing the presence of DNA from some of these animals in caves and other locations. Then we can discover when they lived, when they disappeared again and whether this coincides with the fact that there were people in a specific area,” explains Gabriel Renaud.
Gabriel Renaud also explains that caves are an especially good source of genetic material because they often maintain a relatively constant temperature. Heat is the worst enemy of DNA, which degrades in warm conditions faster than ice melts on a summer’s day.
“Million-year-old DNA has been discovered in ice, which is really good for preserving DNA, but the time limit for DNA that has not been frozen is about 100,000 years,” says Gabriel Renaud.
Re-emergence of bacteria in permafrost
Another aspect of Gabriel Renaud’s research on developing algorithms relates to identifying prehistoric bacteria and viruses.
According to Gabriel Renaud, bacteria and viruses are not just part of our history and prehistory but have been instrumental in shaping it. This applies to the bacteria and viruses that coexist peacefully with us but also to pathogenic bacteria and viruses.
We have not been in contact with some of these pathogenic bacteria and viruses for thousands of years, and that means we have no defence against them if they re-emerge one day. The obvious question is whether this can happen. The answer is yes, and that puts climate change back in focus.
For example, under the permafrost in northern Siberia are grasses, bacteria and viruses that have not seen the light of day for 30,000 years. They just lie there waiting. Now the ice has begun to thaw, and bacteria and viruses have begun to re-emerge like prehistoric zombies rising from an icy tomb. Russian reindeer breeders have had to cull entire herds because they have been infected with the very lethal anthrax bacterium.
Researchers think that anthrax has re-emerged from its slumber after being dormant in a permanently frozen state for virtually the whole time that people have lived outside Africa. Now it is re-emerging in a world that has no protection at all against it.
“When you think about COVID-19 and how a tiny fragment of DNA can ruin global economies, we should be very interested in determining what lies hidden under the ice in various parts of the world. The bacteria and viruses are now waking up, and there is no guarantee that plants, animals or humans can defend against them. This is a very dangerous cocktail,” says Gabriel Renaud.
Developing algorithms to identify what is under the ice
Part of Gabriel Renaud’s research involves developing algorithms that can process the enormous quantity of genetic data that can be extracted from ice core samples from the Russian permafrost.
If the bacteria and viruses resemble some that we already know, he can use the known genomes as references and reconstruct complete bacterial and viral genomes.
If they are unlike anything we know, his algorithms must be adapted to build the genomes from scratch.
Whichever method is used, the aim is to discover the properties of these genomes, because only then can we find out how potentially infectious and how dangerous the organisms can be.
“We want to discover what these bacteria and viruses are capable of, because it’s not a question of whether they will thaw but when,” concludes Gabriel Renaud.