Scientific axes
The term bioinformatics covers a broad range of activities, so we first delimit the restricted sense in which we use it: research at the interface between computer science and molecular biology (also called computational biology), as opposed to the "standard" informatics needed to manage biological data on a daily basis. Note however that our experience, shared by many bioinformaticians, is that it is hard to carry out in-depth research in this domain without "biocomputing", that is, without taking part in services of the second kind together with biologists.
The scientific axes on which the project focuses derive from
our choice of modeling complex biological systems in a linguistic and
logical framework.
More precisely, the project links together three main directions.
Analysis of structures in sequences
This track concerns the search for relevant (e.g. functional) spatial or logical structures in macromolecules, with the intent of modeling either specific spatial structures (secondary and tertiary structures, disulfide bonds, ...) or general biological mechanisms (transposition, ...). In the framework of language theory and combinatorial optimization, we address three types of problems:
the design of grammatical models on biological sequences;
efficient model matching in data banks;
machine learning of grammatical models from sequences.
We are interested in both theoretical questions (language representations, search space) and practical questions (how to implement efficient parsers, how to infer language representations from a sample of sequences). We follow a combinatorial approach. The corresponding disciplinary fields are algorithmics on words, machine learning, data analysis and combinatorial optimization.
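As an illustration of model matching on sequences, the following minimal sketch searches a DNA string for stem-loop (hairpin) candidates, a structure that pairs a word with its reverse complement and therefore goes beyond simple regular patterns. It is not the project's own parser: the function names, the parameters (`stem_len`, `min_loop`, `max_loop`) and the exact-match criterion are assumptions chosen for brevity.

```python
# Hypothetical sketch: exhaustive search for perfect stem-loops in DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Return the reverse complement of a DNA word over A, C, G, T."""
    return word.translate(COMPLEMENT)[::-1]


def find_hairpins(seq, stem_len=4, min_loop=3, max_loop=8):
    """List (start, left_stem, loop, right_stem) for every exact hairpin.

    A hit is a word of length stem_len followed, after a loop of
    min_loop..max_loop letters, by its reverse complement.
    """
    hits = []
    n = len(seq)
    for i in range(n - 2 * stem_len - min_loop + 1):
        left = seq[i:i + stem_len]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop  # start of the candidate right stem
            if j + stem_len > n:
                break
            if seq[j:j + stem_len] == target:
                hits.append((i, left, seq[i + stem_len:j], seq[j:j + stem_len]))
    return hits


if __name__ == "__main__":
    for hit in find_hairpins("AAGACTTTTAGTCAA"):
        print(hit)
```

Scanning a whole data bank with such a model amounts to calling `find_hairpins` on each entry; a practical parser would also tolerate mismatches and use indexing rather than this quadratic scan.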
Parallelism for bioinformatics
Fast access to millions of genomic
objects has become a central scientific challenge.
We investigate the use of parallelism to speed up computations in
genomics. Topics of interest
range from intensive sequence comparisons to pattern or model matching,
including structure
prediction. We work on the design of hardware architectures tailored to
such applications. This work is mainly based on the study of
reconfigurable machines employing Field-Programmable Gate Arrays
(FPGAs). Other activities concern GRID computing and the parallelization of
optimization algorithms.
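Intensive sequence comparison is naturally data-parallel: each bank entry can be scored against the query independently. The sketch below shows that decomposition in software only; it says nothing about the hardware designs mentioned above. The scoring scheme, the function names and the use of a thread pool are all assumptions, and in CPython a process pool or dedicated hardware would be needed for an actual speed-up on this CPU-bound task.

```python
# Hypothetical sketch: scoring a query against a bank, one task per entry.
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


def scan_bank(query, bank, workers=4):
    """Score every bank entry against the query; return entries by decreasing score."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(partial(local_alignment_score, query), bank))
    return sorted(zip(bank, scores), key=lambda pair: -pair[1])


if __name__ == "__main__":
    for entry, score in scan_bank("ACGT", ["ACGT", "TTTT", "ACGA"]):
        print(entry, score)
```

Because the tasks share no state, the same decomposition carries over to a process pool, a grid of machines, or an array of hardware comparison units.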
Gene expression data: analysis and network modeling
The first purpose of biological sequence analysis is to characterize
each gene individually and to
explore gene regulation by identifying regulatory
cis-elements. But the ultimate goal, for the biologist, is to explain
how the combination of genetic and metabolic interactions
determines the phenotype observed at the molecular level, particularly
in the case of disease.
The scarcity of quantitative data on biological phenomena implies the
use of qualitative models. Our approach is based on the definition
of graph models of biological networks and the derivation
of discrete or differential models for explaining
and predicting (in a broad sense) the behavior of
the biological system.
This research is rooted in various fields: data analysis, graph
theory, discrete event systems and the qualitative theory of differential systems.
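To make the passage from a graph model to a discrete model concrete, here is a minimal sketch that turns a signed interaction graph into a synchronous Boolean network and looks for the attractor reached from a given state. The three-gene network, the update rule (a gene is on when it is not inhibited and either has no activator or has an active one) and all names are hypothetical choices for illustration, not the models used in the project.

```python
# Hypothetical sketch: synchronous Boolean dynamics on a signed graph.
def step(state, edges):
    """Compute the next state; edges are (source, target, sign) triples."""
    new_state = {}
    for gene in state:
        activators = [s for s, t, sign in edges if t == gene and sign > 0]
        inhibitors = [s for s, t, sign in edges if t == gene and sign < 0]
        # Assumed rule: no activator needed if the gene has none.
        activated = not activators or any(state[s] for s in activators)
        inhibited = any(state[s] for s in inhibitors)
        new_state[gene] = activated and not inhibited
    return new_state


def find_attractor(state, edges, max_steps=1000):
    """Iterate until a state repeats; return the cycle of states reached."""
    seen = {}
    trajectory = []
    for t in range(max_steps):
        key = tuple(sorted(state.items()))
        if key in seen:
            return trajectory[seen[key]:]
        seen[key] = t
        trajectory.append(dict(state))
        state = step(state, edges)
    raise RuntimeError("no repeated state within max_steps")


if __name__ == "__main__":
    # Negative feedback loop: A activates B, B activates C, C inhibits A.
    edges = [("A", "B", 1), ("B", "C", 1), ("C", "A", -1)]
    cycle = find_attractor({"A": True, "B": False, "C": False}, edges)
    for s in cycle:
        print(s)
```

Only the signs of the interactions are used, which is the point of a qualitative model: the negative feedback loop above yields a sustained oscillation without any kinetic parameter.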