Pioneering research finds missing pieces in the genomic puzzle

Tech Science, 6 July 2018, 2 min. Postdoc Jonas Andreas Sibbesen. Written by Morten Busch.

Today, we can get our genome sequenced for less than DKK 4000 and find out how small changes in our genome might affect the risk of various diseases. Computer programs that compare genomes have focused primarily on these small changes, while larger changes in the genome have often been overlooked. Now Danish researchers have developed a new algorithm that finds the pieces that are often missed in the enormous genomic puzzle. The new method is expected to have important applications in the personalized medicine of the future.

Determining the sequence of a person’s genome is like solving a jigsaw puzzle. Current technology cannot actually read the entire genome in one piece. Instead, it produces a gigantic puzzle comprising billions of small pieces, and advanced algorithms must assemble them before the genetic profile can be decoded.

“Analysing genome sequencing data requires laying each individual piece on top of a set of known pieces, called the reference genome. New pieces, such as genomic insertions, are therefore easily overlooked because placing them correctly on the reference genome is difficult. We have developed a new computer algorithm that creates this genomic reference in 3D. This offers greater opportunities for discovering the complex and often overlooked genomic changes and thus provides a clearer image of the genomic landscape,” explains one of the main authors, Jonas Andreas Sibbesen of the Section for Computational and RNA Biology, Department of Biology, University of Copenhagen.

Hard to process the extra pieces

Genome sequencing has become affordable for almost anyone. For a few thousand Danish kroner, people can have their entire genome sequenced and thus obtain information on variants in their genome and how these might affect their risk of developing various diseases such as cancer and metabolic diseases.

“Providing these answers requires advanced computer algorithms that can assemble the genomes and compare them with a standard genome. Paradoxically, the algorithms used so far have primarily discovered the smaller genetic variants in the genome, but the major variants such as genomic insertions have remained a blind spot for researchers.”

One approach to solving the genomic puzzle, known as assembly, involves fitting the pieces together from scratch without knowing the picture the puzzle portrays. With billions of pieces, this task is incredibly time-consuming and laborious, which is why assembly is seldom used. Mapping is therefore often the preferred method: here the tiny pieces are instead laid on top of a reference genome, a known puzzle. This makes the analysis much easier. However, in regions where the genome of the sequenced individual differs greatly from the reference genome, this technique can result in variants being overlooked.
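To make the weakness of mapping concrete, here is a minimal Python sketch. It is a toy illustration, not the software used in real genome analysis: the sequences are invented, and the function names and the limit of one mismatch are assumptions made for this example. A piece with a single changed letter still finds its place on the reference, while a piece that carries an insertion does not.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, max_mismatches=1):
    """Slide the read along the reference and return the best placement
    as (position, mismatches), or None if no placement is within the limit."""
    best = None
    for pos in range(len(reference) - len(read) + 1):
        mismatches = hamming(read, reference[pos:pos + len(read)])
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (pos, mismatches)
    return best


# Invented toy sequences (assumptions for illustration only).
reference = "ACGTTGCAAGGCTTACCGATAGGCTA"
snv_read = "GCAAGGCTAACC"    # differs from the reference by a single letter
ins_read = "GGCTCAGACATACC"  # carries six inserted letters (CAGACA)

print(map_read(snv_read, reference))  # (5, 1)
print(map_read(ins_read, reference))  # None
```

The small change is found and placed, but the piece with the insertion simply has nowhere to go on the known puzzle, so the variant it carries goes unseen.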

“For example, we know that there are many variants in the HLA region, which contains genes that play key roles in our immune system. The pieces there can differ so greatly from the reference genome that embedding them is almost impossible, resulting in many variants in this region not being visible.”

The researchers’ new algorithm takes a different approach: instead of working with a single, arbitrarily chosen reference genome, it can use genetic variants from many individuals simultaneously.

“This trick provides much greater opportunities to use genetic variants known from previous studies in analysing new individuals, which increases the sensitivity for more complex forms of genetic variation. You could say that, instead of embedding the pieces in a single individual, we embed them in thousands of individuals simultaneously.”
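The idea of embedding the pieces in many individuals at once can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the researchers’ implementation: the sequences are invented, the helper names (`allele_paths`, `allele_support`) are hypothetical, and support is counted here with short substrings of fixed length (k-mers). A known insertion is added as an alternative path next to the reference path, and the pieces are then compared with both paths.

```python
from collections import Counter


def kmers(seq, k):
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def allele_paths(reference, pos, ref_allele, alt_alleles, flank):
    """Spell out each path through a one-variant graph:
    left flank + allele + right flank."""
    left = reference[max(0, pos - flank):pos]
    end = pos + len(ref_allele)
    right = reference[end:end + flank]
    paths = {"ref": left + ref_allele + right}
    for name, seq in alt_alleles.items():
        paths[name] = left + seq + right
    return paths


def allele_support(reads, paths, k):
    """Count read k-mers that occur on exactly one allele path."""
    path_kmers = {name: kmers(seq, k) for name, seq in paths.items()}
    support = Counter()
    for name, own in path_kmers.items():
        others = set().union(*(v for n, v in path_kmers.items() if n != name))
        unique = own - others
        for read in reads:
            support[name] += len(kmers(read, k) & unique)
    return support


# Invented toy data (assumptions for illustration only).
reference = "ACGTTGCAAGGCTTACCGATAGGCTA"
k = 5
# A previously observed insertion of CAGACA at position 13.
paths = allele_paths(reference, 13, "", {"ins": "CAGACA"}, flank=k - 1)
reads = ["AGGCTCAGAC", "AGACATACCG"]  # pieces from a person carrying the insertion

support = allele_support(reads, paths, k)
print(support["ref"], support["ins"])  # 0 10
```

Because the insertion is already part of the graph, the same pieces that would be hard to place on a single reference now clearly support the insertion path. A real method would also need to handle sequencing errors, many variants at once and the two copies of each chromosome.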

Revealing the dark patches

Genome sequencing data have already revolutionized the opportunities for researchers and doctors to investigate the human genome, and this trend will increase in the future. In Denmark, the GenomeDenmark project has mapped the Danish reference genome, and this work was the basis on which a research group from the Section for Computational and RNA Biology at the Department of Biology of the University of Copenhagen developed the pioneering new algorithm.

“In the GenomeDenmark project, we used our algorithm to significantly enlarge the spectrum of genetic variants that can be identified from such data. This especially applied to the more complex variations such as large deletions and insertions in the genome, where we discovered many new and previously unseen variations.”

Being able to see the previously dark patches on the genetic map more clearly is expected to have important applications in personalized medicine, in which charting an individual’s genetic profile will play a role in choosing treatment.

“As more and more countries launch these large-scale national genome projects, having algorithms that can give doctors a more complete genetic picture is increasingly essential. The goal is therefore to continually become better at discovering new variations in our genomes because this will probably help in providing more answers as to why we become ill and how we need to be treated.”

“Accurate genotyping across variant classes and lengths using variant graphs” has been published in Nature Genetics. Lasse Maretty and Anders Krogh from the Bioinformatics Centre, Department of Biology, University of Copenhagen are co-authors. The Novo Nordisk Foundation and Innovation Fund Denmark funded the project.
