Jonathan Delhumeau

I am a software engineer in the Inria Texmex team, headed by Patrick Gros. My work focuses on large-scale multimedia indexing, retrieval and classification (for audio, video and still images). I have worked mostly with Hervé Jégou and Sébastien Campion, but also with several Texmex PhD students and post-docs. I set up and run experiments for research and for participation in international evaluations, and I have been the lead programmer in the development of high-level frameworks for multimedia indexing.

Short Bio

I graduated from Ifsic at the University of Rennes 1 in 2002 with a research Master's degree and an engineering diploma, with an academic focus on image and video processing and a Master's thesis on 3D reconstruction from video. After two years of research on video watermarking as a PhD student at Inria Rennes, I left for the UK, where I worked on graphics driver verification and validation at Imagination Technologies. I came back to France in 2007 and worked as a software developer for an IT consulting company. In that role, I worked for several companies, such as Orange Labs and Canon-CRF, until I joined the Inria Texmex team in 2011.


Peyote: a flexible framework for large scale multimedia indexation

Peyote is a framework for video and image description, indexing and nearest-neighbour search. It can be used as-is, with the implemented descriptors and search modules, by a video-search or image-search front-end. It can also be driven by scripts for large-scale experimentation. Finally, thanks to its modularity, it can be used for scientific experimentation on new descriptors or indexing methods. Peyote has already been used in research, in an internal annotation tool that helps find occurrences of commercials in large TV databases (six months long), and to compute descriptors for evaluation campaigns (Trecvid and Mediaeval).

Babaz: audio indexing system for video copy detection

Babaz is an audio search system for robust copy detection. It is available under the GNU GPL. It was used in the Trecvid 2011 Copy Detection task and is described in detail in our ICASSP 2012 paper.

International evaluation campaigns

Mediaeval 2012 Placing task

The aim of the placing task is to locate Flickr videos on a world map using the visual content and the metadata available on the page (tags, user details, etc.). Our complete system achieved the best results among all participants using no external data, although my contribution was mostly to the visual content sub-component (which ranked second). A brief overview of our complete system for this task is given in the workshop paper.

Trecvid 2012 Semantic Indexing task

The aim of the semantic indexing task is to return, for each query concept, a list of the test-set videos most likely to contain it. We participated in the IRIM group submission by computing descriptors based on dense SIFT features, aggregated into Bag-of-Features and VLAD representations.

Trecvid 2011 Content-based Copy Detection Task

The aim of the content-based copy detection task is to identify, in the query videos, “attacked” video segments extracted from a large database. Our combined audio-video system was ranked third. However, the audio subsystem, on which my work was focussed, was ranked first among the audio-only submissions.


Revisiting the VLAD image representation

Efficient Supervised Dimensionality Reduction for Image Categorization

Retrieving geo-location of videos with a divide & conquer hierarchical multimodal approach

How INRIA/IRISA identifies Geographic Location of a Video

BABAZ: a large scale audio search system for video copy detection

INRIA@TRECVID'2011: Copy Detection & Multimedia Event Detection

Older publications

Capacity of data-hiding system subject to desynchronisation

Construction de codes pour tatouage avec prise en compte de l'information adjacente

Improved Polynomial Detectors for Side-Informed Watermarking

Segmentation basée mouvement 3D pour la détection d'objets indépendants

INRIA, Campus Universitaire de Beaulieu
35042 Rennes Cedex