
Research topics

Broadly, my research focuses on the analysis of multimedia documents, with a constant concern for proposing (stochastic) models that combine all available sources of knowledge. This general philosophy translates into two main areas:

  1. Spoken document analysis: detecting and tracking audio events in videos; speaker segmentation and tracking; speech recognition; topic segmentation; spoken document indexing. I am currently interested in the following topics:
  2. Multimedia stream modeling: joint models of multimedia streams for video analysis. The aim of this research is to devise models that can integrate audio, visual and possibly textual information and represent the relations between them (temporal synchronisation model, correlations, etc.) for the analysis and structuring of videos and for audiovisual ASR. Current activities include:

Recent participation in projects (contribution to the project)

I am currently involved in the following projects:

Over the last few years, I have participated in the following projects:
Participation in the activities of the MUSCLE European Network of Excellence.

Ph. D. students

Ongoing Ph. D. thesis I am supervising:

Past Ph. D. students:

Other Ph. D. theses in which I have been or am involved (without supervising in any way):

Software development

I am actively participating in the development of the following free software toolkits:

These toolkits form the base (with a little help from HTK) of the IRENE broadcast news indexing platform, originally developed for the French ESTER rich transcription evaluation campaign in collaboration with François Yvon. Also check out my free ESTER resources page.

In the framework of the ASR/NLP working group that I co-lead, we have developed several pieces of code related to spoken document analysis. Among others, the following are worth mentioning:

These toolkits are not freely distributed open-source software, but we are nevertheless willing to share them. Feel free to contact me should you be interested in any of them.

Selected recent publications

Check out my complete list of publications.

Short bio

I obtained a master's degree in Applied Mathematics at the Institut National des Sciences Appliquées (INSA Rouen) in 1995 and worked on speech synthesis at ELAN Informatique from 1996 to 1997. I received a Ph. D. in Signal and Image Processing (Toward speech modeling with Markov random fields) at the Ecole Nationale Supérieure des Télécommunications (ENST Paris) in 2000. After a one-year post-doctoral stay at Irisa, I worked in the Audio Visual Speech Technology group at the IBM T. J. Watson Research Center from 2001 to 2002. Since 2002, I have been a research fellow at the Centre National de la Recherche Scientifique (CNRS), working at the Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA). I received the Habilitation à Diriger des Recherches (HDR) in Computer Science from the Université de Rennes 1 in 2009.

Guillaume Gravier, Irisa, Campus de Beaulieu, 35042 Rennes Cedex, France.
Tel: +33 2 99 84 72 39 / Fax: +33 2 99 84 71 71