ReMIX Project     

A Reconfigurable Memory for Indexing Mass of Data      

ARC INRIA

Comparing 700,000 bacterial proteins against the human genome

The Inserm U694 laboratory studies mitochondrial diseases. Its strategy is to perform an in-silico study to locate potential mitochondrial proteins on the human genome. Because mitochondria may have originated from ancestral bacteria, this requires a systematic comparison with the proteomes of all available bacteria.

From a computational point of view, this is equivalent to running a tblastn search of 700,000 proteins against the human genome. The computation time was estimated at about 1 year on the Inserm U694 server.
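A tblastn search compares protein queries against a nucleotide sequence translated in all six reading frames. The sketch below illustrates only that translation step, using the standard genetic code. The function names are hypothetical and this is not the ReMIX code; codons containing characters other than A, C, G, T are mapped to "X" as a simplifying assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Map frame id (+1..+3 forward, -1..-3 reverse) to its protein string."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
print(frames[1])   # forward frame 1
print(frames[-1])  # reverse frame 1
```

Each of the six protein strings can then be searched for matches with the query proteins, which is why the genome side of the comparison is so costly.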

Results
A tblastn-like program has been implemented. Its indexing scheme is based on blast-like seeds and serves as a reference. The index is about 40 times the size of the raw human genome data (about 90 Gbytes).
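To make the idea of seed-based indexing concrete, here is a minimal sketch assuming exact-match seeds of 3 amino acids. The seed length, the function names, and the in-memory dictionary are illustrative assumptions, not the ReMIX index layout, and the neighborhood words used by blast are left out.

```python
from collections import defaultdict


def build_seed_index(text, seed_len=3):
    """Map every word of length seed_len to the list of positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(text) - seed_len + 1):
        index[text[pos:pos + seed_len]].append(pos)
    return index


def find_seed_hits(query, index, seed_len=3):
    """Return (query_pos, text_pos) pairs where a query seed occurs in the text."""
    hits = []
    for qpos in range(len(query) - seed_len + 1):
        for tpos in index.get(query[qpos:qpos + seed_len], []):
            hits.append((qpos, tpos))
    return hits


# Toy translated sequence and toy protein query (made-up data).
translated = "MKVLAAGIMKV"
index = build_seed_index(translated)
hits = find_seed_hits("GIMK", index)
print(hits)
```

Because such an index stores one entry per position rather than one character, it can easily grow to many times the size of the raw sequence. Hits that share the same diagonal (text position minus query position) are the usual candidates for extension into an alignment.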

A reconfigurable operator implementing the time-consuming part of the tblastn process has been designed. It houses 160 small dedicated processors working in parallel. With a single ReMIX board, the complete human genome was compared against the bacterial proteome (700,000 proteins) in 10 days.

Based on the algorithmic enhancements provided by the design of new seeds, we now expect a 25% reduction in both the computation time and the ReMIX FLASH occupancy. These results can also be generalized to standard computers, especially multicore architectures.
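The figures quoted above can be put side by side with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes "about 1 year" means 365 days, and the variable and function names are illustrative.

```python
# Figures taken from the text; the 365-day year is an assumption.
SERVER_DAYS = 365.0      # estimated time on the Inserm U694 server
REMIX_DAYS = 10.0        # measured time with a single ReMIX board
EXPECTED_REDUCTION = 0.25
PROTEINS = 700_000


def speedup(baseline_days, accelerated_days):
    """Ratio of the baseline run time to the accelerated run time."""
    return baseline_days / accelerated_days


def projected_days(current_days, reduction):
    """Run time after applying a fractional reduction."""
    return current_days * (1.0 - reduction)


def seconds_per_protein(days, proteins):
    """Average wall-clock seconds spent per query protein."""
    return days * 86_400 / proteins


print(speedup(SERVER_DAYS, REMIX_DAYS))              # about 36.5x
print(projected_days(REMIX_DAYS, EXPECTED_REDUCTION))  # 7.5 days
print(round(seconds_per_protein(REMIX_DAYS, PROTEINS), 2))
```

Under these assumptions the single board is roughly 36 times faster than the server estimate, and the new seeds would bring the run down to about 7.5 days.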

  contact:

  Dominique Lavenier
  lavenier@irisa.fr

  http://www.irisa.fr/remix