GENSCALE : Scalable, Optimized and Parallel Algorithms for Genomics

The GenScale team works closely with biologists to design and implement algorithms for processing the large genomic datasets generated by DNA sequencing technologies. These data are error-prone, scattered, and massive (terabytes of sequences generated within a few days). In this context, GenScale members focus on three main axes:

Analyzing complex features

The team proposes novel approaches to detect genomic variants and to precisely assemble genome or chromosome sequences. The ultimate goal is to obtain one sequence per sequenced chromosome or species, together with its associated variations. Techniques are based on string algorithms, graph analysis, data representation, linear programming, and ASP (Answer Set Programming) solvers.

Exploring and querying

To scale with the growing amount of data to be processed, the team proposes new methodological solutions based on advanced data structures to index and screen large genomic databanks. These solutions enable the detection of specific disease-associated markers, the genomic analysis of thousands of complete genomes, and the analysis of gut microbiomes. Techniques are based on data indexing, data correction, and, again, string and graph algorithms.

Storing data on DNA molecules

The team explores the problem of archiving large volumes of data on DNA molecules, which involves issues such as the development of a dedicated DNA file system, error-correcting codes, information security, DNA synthesis, DNA sequencing, and genomic data processing.

Creation date

01/01/2013

Reporting institution

Université de Rennes 1, Inria, CNRS

Location

Campus de Beaulieu, RENNES (35)

Department

D7 - Data and knowledge management

Activity reports

Attachment	Size
GENSCALE-RA-2022.pdf	539.59 KB
GENSCALE-RA-2021.pdf	531.96 KB
genscale2019.pdf	449.53 KB
genscale2018.pdf	450.23 KB
genscale2017.pdf	453.04 KB