
Current position : postdoctoral fellow at the Max Planck Institute (Göttingen).

Email: [ with firstname.lastname = clovis.galiez]

PhD thesis :

Structural fragments : comparison, predictability from sequence and application to identification of viral proteins.

Supervisors : François Coste and Jacques Nicolas

PhD defended on 8 December 2015 at Inria Rennes.




I'm broadly interested in formal methods and their application to biology. So far, I have focused on the local relationship between the primary and tertiary structure of proteins and on its application to protein annotation using machine learning.

Thesis summary

My PhD thesis investigates the local characterization of protein families at both the structural and the sequence level. A formal framework is introduced to describe local relationships between the primary and tertiary structure of proteins. Building on this formalism, we introduce contact fragments (CF), portions of protein structure that reconcile spatial locality with sequential neighborhood. We show that CF are more predictable from sequence than contiguous fragments and than structurally distant pairs of fragments. To compare CF structurally, we introduce ASD, a novel alignment-free dissimilarity based on the Fourier transform of the matrix of inter-atomic distances. This measure satisfies the triangle inequality while tolerating sequence shifts and indels. We show that ASD can also be used to compare standard fragments and that it outperforms classical scores in practical experiments such as unsupervised classification and structural mining. Finally, by integrating the identification of CF from sequence into a statistical machine learning framework, we developed VIRALpro, a tool that detects sequences of viral structural proteins.

Papers in journals

  • VIRALpro: a new suite for identifying viral capsid and tail sequences, C. Galiez, C. Magnan, F. Coste, P. Baldi (Bioinformatics) pdf.

  • Amplitude Spectrum Distance: measuring the global shape divergence of protein fragments. C. Galiez, F. Coste. (BMC Bioinformatics, 2015) pdf.

Talks/Posters at international conferences

  • 2015 : Structural conservation of remote homologues: better and further in contact fragments, poster at ISMB/ECCB 2015 Satellite Meeting - 3DSIG: Structural Bioinformatics and Computational Biophysics pdf.

  • 2014 : Identifying distant homologous viral sequences in metagenomes using protein structure information. Workshop on Recent Computational Advances in Metagenomics ECCB'14 hal pdf.


Amplitude Spectrum Distance : ASD is a new dissimilarity measure between protein fragments. Based on the Fourier transform, ASD performs an alignment-free global shape comparison that tolerates small insertions/deletions and satisfies the triangle inequality. This unique combination of properties gives ASD meaningful dissimilarity scores not only for almost identical fragments (as with state-of-the-art scores) but also for more divergent fragments.
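
To give a feel for the idea, here is a minimal sketch of an amplitude-spectrum comparison of two fragments. It is not the published ASD definition: it assumes equal-length fragments given as lists of 3D coordinates (for instance C-alpha atoms), uses a plain Euclidean distance between the spectra, and omits any normalization. The function names are hypothetical and chosen for illustration only.

```python
import cmath
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between 3D points of one fragment."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def amplitude_spectrum(matrix):
    """Moduli of the 2D discrete Fourier transform of a square matrix."""
    n = len(matrix)
    # n-th roots of unity used by the DFT.
    w = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]
    # 1D DFT of every row, then 1D DFT along the columns of the result.
    rows = [[sum(row[j] * w[(j * v) % n] for j in range(n)) for v in range(n)]
            for row in matrix]
    return [[abs(sum(rows[i][v] * w[(i * u) % n] for i in range(n)))
             for v in range(n)]
            for u in range(n)]


def spectrum_dissimilarity(coords_a, coords_b):
    """Euclidean distance between the amplitude spectra of two fragments.

    Assumption for this sketch: both fragments have the same number of points.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("this sketch only handles equal-length fragments")
    spec_a = amplitude_spectrum(distance_matrix(coords_a))
    spec_b = amplitude_spectrum(distance_matrix(coords_b))
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(spec_a, spec_b)
                         for x, y in zip(row_a, row_b)))


# Toy fragments (made-up coordinates): a helix-like curve and a straight line.
helix = [(math.cos(t), math.sin(t), 0.5 * t) for t in range(6)]
line = [(1.5 * i, 0.0, 0.0) for i in range(6)]
# Same helix with the residue order circularly shifted by two positions.
shifted_helix = helix[2:] + helix[:2]

print(spectrum_dissimilarity(helix, shifted_helix))  # close to 0
print(spectrum_dissimilarity(helix, line))           # clearly positive
```

Two properties follow directly from this construction. A circular shift of the residue order only changes the phases of the Fourier coefficients, so the amplitudes, and hence the score, are unchanged. And because the score is a Euclidean distance between two feature vectors, it satisfies the triangle inequality.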

VIRALpro : a web service to detect capsid and tail proteins in peptide sequences. You can download VIRALpro for offline use here. The supplementary material can be found here.



  • 2014-2015 (Université de Rennes 1) :
    • TD & TP Compilation : Master 1 Informatique 48h
    • TP Bureautique : Licence 1 16h
  • 2013-2014 (Université de Rennes 1) : TD & TP Compilation : Master 1 Informatique 32h
  • 2009-2010 (École Polytechnique Fédérale de Lausanne) :
    • Tutorials for the geometry course (first-year EPFL students in microelectronics) 28h
    • Practicals for the C++ programming course (second-year EPFL students in biology) 28h

Visiting positions