
Rayan CHIKHI

    Mail chikhi[at]irisa.fr




This page is no longer up to date. My new home page is: http://rayan.chikhi.name





I completed my PhD in Computer Science at ENS Cachan, Brittany campus, from 2008 to 2012, under the supervision of Dominique Lavenier.

Research Interests:

Computational theory for de novo assembly of short DNA sequencing reads.

  • Algorithms and data structures
  • Graph theory
  • High-performance computing, parallelism
  • DNA sequencing




  •  MAPPI project (ANR): Mapping and assembly of metagenomic and metatranscriptomic data, linked with the Tara Oceans expedition.
  •  Alcovna project (ARC): ALgorithms for COmparing and Visualizing Non Assembled data


[15] K. R. Bradnam et al. (incl. R. Chikhi), Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species, GigaScience (2013) [PDF]

[14] Chikhi R., Medvedev P. Informed and Automated k-Mer Size Selection for Genome Assembly, Bioinformatics (2013), Proceedings of HiTSeq 2013, Best Paper Award [PDF] [Webpage]

[13] G. Rizk, D. Lavenier, R. Chikhi. DSK: k-mer counting with very low memory usage, Bioinformatics (2013) [PDF] [Webpage]



[12] N. Maillet, C. Lemaitre, R. Chikhi, D. Lavenier, P. Peterlongo. Compareads: comparing huge metagenomic experiments, RECOMB Comparative Genomics (2012) [PDF] [Webpage]

[11] R. Chikhi, G. Rizk. Space-efficient and exact de Bruijn graph representation based on a Bloom filter, WABI (2012) [PDF] [Webpage]

[10] P. Peterlongo, R. Chikhi. Mapsembler, targeted and micro assembly of large NGS datasets on a desktop computer, to appear in BMC Bioinformatics (2012) [PDF] [Webpage]


[9] G. Sacomoto, J. Kielbassa, R. Chikhi, R. Uricaru, P. Antoniou, M-F. Sagot, P. Peterlongo and V. Lacroix, KisSplice: de-novo calling alternative splicing events from RNA-seq data, in the proceedings of RECOMB-seq, BMC Bioinformatics (2012) [PDF] [Webpage]

[8] D. A. Earl et al. (incl. R. Chikhi), Assemblathon 1: A competitive assessment of de novo short read assembly methods, Genome Research (2011) [PDF]

[7] G. Chapuis, R. Chikhi, D. Lavenier. Parallel and memory-efficient reads indexing for genome assembly, PPAM Parallel Bio-Computing Workshop (2011) [PDF]



[6] R. Chikhi, D. Lavenier. Localized genome assembly from reads to scaffolds: practical traversal of the paired string graph, Algorithms in Bioinformatics, LNCS 6833 (2011) [PDF]


[5] R. Chikhi, L. Sael, & D. Kihara, Protein binding ligand prediction using moment-based methods, Protein function prediction for omics era, D. Kihara ed., Chapter 8, pp. 145-163, Springer. (2011) [PDF]


[4] D. Kihara, L. Sael, R. Chikhi, & J. Esquivel-Rodriguez, Molecular surface representation using 3D Zernike descriptors for protein shape comparison and docking, Curr. Protein and Peptide Science, 12: 520-530. (2010) [PDF]



[3] R. Chikhi, L. Sael, D. Kihara. Real-time ligand binding pocket database search using local surface descriptors. Proteins: Structure, Function, and Bioinformatics, Volume 78 Issue 9, Pages 2007 - 2028. (2010) [PDF]


[2] R. Chikhi, D. Lavenier. Paired-end read length lower bounds for genome re-sequencing. (Meeting Abstract) BMC Bioinformatics, 10(Suppl 13):O2 (2009) [PDF]


[1] R. Chikhi, S. Derrien, A. Noumsi, P. Quinton. Combining flash memory and FPGAs to efficiently implement a massively parallel algorithm for content-based image retrieval. International Journal of Electronics, Volume 95, Number 7, pp. 621-635(15) (2008)  [PDF]






Informed and Automated k-Mer Size Selection for Genome Assembly, ISMB/HiTSeq, 2013. [PDF]

de novo assembly (introduction), Evomics Workshop on Genomics, 2013. [PDF]

Space-efficient and exact de Bruijn graph representation based on a Bloom filter, WABI, 2012. [PDF]

Computational methods for de novo assembly of NGS data, Thesis slides, 2012. [PDF]

Localized genome assembly from reads to scaffolds: practical traversal of the paired string graph, WABI, 2011. [PDF]

de novo assembly tools, Monument, Mapsembler, IBL, Lille, 2011. [PDF]

Paired-end read length lower bounds for genome re-sequencing, ISCB Student Council Symposium, 2009. [PDF]



R. Chikhi. Computational Methods for de novo Assembly of Next-Generation Genome Sequencing Data. PhD Thesis, 2008-2012.

Summary:  We discuss computational methods (theoretical models and algorithms) to perform the reconstruction (de novo assembly) of DNA sequences produced by high-throughput sequencers. This thesis introduces the following contributions:

  • quantification of the maximum theoretical genome coverage achievable by sequencing data (paired reads) (Chapter 2)
  • a set of computational problems that are related to paired assembly (Chapter 3)
  • two novel concepts for practical assembly: localized assembly and memory-efficient reads indexing (Chapter 4)
  • implementation details of a de novo assembly software package, the Monument assembler (Chapter 5)
  • an algorithm that reconstructs variants of a known sequence in Mapsembler (Chapter 6)


R. Chikhi. Study of Unentanglement in Quantum Computing. Manuscript, research internship at MIT, Spring 2008.  [PDF]

Summary: We investigate the conjecture that one cannot simulate QMA(2) protocols in QMA using a quantum operation called a disentangler. Our results show that, when exponential precision is required, this conjecture holds unless P = NP. Moreover, also in the exponential precision case, we show that one only needs a stronger hypothesis to prove the conjecture.


R. Chikhi. Protein surface descriptors for binding sites comparison and ligand prediction. Manuscript, research internship at Purdue University, Summer 2007. [PDF]

Summary: We present a model for a two-dimensional representation of ligand binding pockets, and we apply it to pocket-pocket matching and binding ligand prediction.




Minia assembler

Whole genome de novo assembler with very low memory usage, described in [11].

Kmergenie: http://kmergenie.bx.psu.edu/

Automatic detection of the k-mer size for de novo assembly, described in [14].

DSK: http://minia.genouest.org/dsk

K-mer counting software, low-memory, low disk usage, supports large values of k, described in [13].
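For readers unfamiliar with the task, k-mer counting can be illustrated with a minimal in-memory sketch. This is only the naive formulation of the problem, not DSK's low-memory method described in [13]; the function name `count_kmers` and the toy reads are hypothetical, and reverse complements are not merged here.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) over all reads, in memory."""
    counts = Counter()
    for read in reads:
        # A read of length n contains n - k + 1 overlapping k-mers.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


# Toy input (assumption: two short reads, k = 3).
reads = ["ACGTAC", "CGTACG"]
counts = count_kmers(reads, 3)
for kmer in sorted(counts):
    print(kmer, counts[kmer])
```

Holding every distinct k-mer in one table is what becomes prohibitive on large datasets, which is the problem low-memory counters address.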

KisSplice: http://alcovna.genouest.org/kissplice/

Alternative splicing calling from 1, 2 or more NGS RNA-seq datasets, see reference [9].

Mapsembler: http://alcovna.genouest.org/mapsembler/

Targeted assembly on a desktop computer, see reference [10].

Monument: http://www.irisa.fr/symbiose/people/rchikhi/monument.html

Whole genome de novo assembler, described in [6], [7] and the PhD thesis.

Pocket-Surfer:  http://dragon.bio.purdue.edu/pocket-surfer/index.php

Protein ligand binding pocket type prediction using a database of known binding sites. See [3] for more details.

Paired reads repetitions: on github

Software package for computing the ratio of single and paired (as in paired NGS reads) exact repetitions within a genome. Useful for obtaining re-sequencing lower bounds inspired by [Whiteford 05]. See [2] and the corresponding talk for sample results and details.

de Bruijn graph construction: on github

Hash table-free implementation of the de Bruijn graph for a set of reads. Also includes a tool that computes the union of two de Bruijn graphs and the Cartesian product of abundances, useful for constructing a multi-dataset de Bruijn graph.
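One way to picture the union step is a merge of two sorted k-mer tables, which needs no hash table. The sketch below is an illustration under assumptions, not the code of the github package: the function names are hypothetical, only graph nodes (k-mers) are stored, and "product of abundances" is read here as pairing each k-mer with its abundance in each dataset, with 0 when absent.

```python
def sorted_kmer_table(reads, k):
    """Build a sorted list of (kmer, abundance) by sorting, without hashing."""
    kmers = sorted(r[i:i + k] for r in reads for i in range(len(r) - k + 1))
    table = []
    for kmer in kmers:
        if table and table[-1][0] == kmer:
            # Same k-mer as the previous entry: increment its abundance.
            table[-1] = (kmer, table[-1][1] + 1)
        else:
            table.append((kmer, 1))
    return table


def union_with_abundances(a, b):
    """Merge two sorted tables into (kmer, (abundance_a, abundance_b))."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        ka, kb = a[i][0], b[j][0]
        if ka == kb:
            out.append((ka, (a[i][1], b[j][1])))
            i += 1
            j += 1
        elif ka < kb:
            out.append((ka, (a[i][1], 0)))
            i += 1
        else:
            out.append((kb, (0, b[j][1])))
            j += 1
    # Append whatever remains in either table.
    out.extend((kmer, (n, 0)) for kmer, n in a[i:])
    out.extend((kmer, (0, n)) for kmer, n in b[j:])
    return out


# Toy datasets (assumption: k = 3).
first = sorted_kmer_table(["ACGT"], 3)
second = sorted_kmer_table(["CGTA", "CGT"], 3)
merged = union_with_abundances(first, second)
```

Edges are left implicit in this sketch: two k-mers are adjacent when the (k-1)-suffix of one equals the (k-1)-prefix of the other.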



Symbiose Project Team - INRIA/Irisa © 2007 - 2008