A reference-free solution to the viral quasispecies assembly problem

Seminar

Start date

Thu 18/02/2021 - 14:30

End date

Thu 18/02/2021 - 16:00

Location

Webinar

Speaker

Jasmijn Baaijens (Harvard Medical School)

Main department

D7 - Data and Knowledge Management

The viral quasispecies assembly problem is to assemble sequenced fragments, derived from the RNA of the ensemble of virus strains populating an individual patient, into full-length, strain-resolved genomes. The challenge is that RNA viruses have high mutation rates, so strain-specific genomes often diverge substantially from existing reference genomes. We developed a collection of tools that together provide a reference-free solution to the viral quasispecies assembly problem. We show how to assemble strain-specific contigs using overlap graphs; we then construct variation graphs from these contigs and define a flow-like optimization problem to build full-length, strain-specific genomes, along with estimates of their relative abundances. Benchmarking experiments show that our workflow outperforms state-of-the-art approaches on mixed samples of viral genomes, in terms of both assembly accuracy and abundance estimation. Experiments on longer, bacterial-sized genomes point to future applications in bacterial genomics.
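
To make the notion of an overlap graph concrete, here is a minimal toy sketch in Python. It is not the speaker's implementation: the function names, the `min_overlap` parameter, and the example fragments are hypothetical, and it assumes exact suffix-prefix matches, whereas real sequencing data would call for error-tolerant overlaps.

```python
from itertools import permutations


def overlap_length(a, b, min_overlap):
    """Return the length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no such overlap of at least `min_overlap` characters exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_overlap - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def build_overlap_graph(fragments, min_overlap=3):
    """Build a directed overlap graph as a dict: fragment -> {successor: overlap length}."""
    graph = {f: {} for f in fragments}
    for a, b in permutations(fragments, 2):
        k = overlap_length(a, b, min_overlap)
        if k:
            graph[a][b] = k
    return graph


# Hypothetical toy fragments, chosen only for illustration.
fragments = ["ACGTAC", "GTACGG", "ACGGTT"]
graph = build_overlap_graph(fragments)
for source, successors in graph.items():
    print(source, "->", successors)
```

In this toy graph, each edge links a fragment to one it can be extended by, so following edges chains fragments into a longer contig. The all-pairs comparison is quadratic in the number of fragments and is only meant for illustration.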

If you wish to be kept informed of our seminars and to attend them, please contact us: seminaire-irisa-d7-contact-request /at/ inria.fr