Jonathan Delhumeau

I am a software engineer in the Inria Texmex team headed by Patrick Gros. My work focuses on large-scale multimedia indexing, retrieval and classification (for audio, video and still images). I have worked mostly with Hervé Jégou and Sébastien Campion, but also with several Texmex PhD students and post-docs. I set up and run experiments for research and for participation in international evaluations, and I have been the lead programmer in the development of high-level frameworks for multimedia indexing.

Short Bio

I graduated from Ifsic at the University of Rennes 1 in 2002 with a research Master's degree and an engineering diploma, with an academic focus on image and video processing and a Master's thesis on 3D reconstruction from video. After two years of research on video watermarking as a PhD student at Inria Rennes, I left for the UK, where I worked on graphics driver verification and validation at Imagination Technologies. I came back to France in 2007 and worked as a software developer for an IT consulting company. In this role, I worked for several clients, including Orange Labs and Canon-CRF, until I joined the Inria Texmex team in 2011.

Software

Peyote: a flexible framework for large scale multimedia indexation

Peyote is a framework for video and image description, indexing and nearest-neighbour search. It can be used as-is by a video-search or image-search front-end, with its built-in descriptors and search modules. It can also be scripted for large-scale experiments. Finally, thanks to its modularity, it can be used for scientific experimentation on new descriptors or indexing methods. Peyote has already been used in research, in an internal annotation tool that helps find instances of commercials in large TV databases (six months long), and to compute descriptors for evaluation campaigns (Trecvid and Mediaeval).

Babaz: audio indexing system for video copy detection

Babaz is an audio search system for robust copy detection. It is available under the GNU GPL here. It was used in the Trecvid 2011 Copy Detection task and is described in detail in our ICASSP 2012 paper.

International evaluation campaigns

Mediaeval 2012 Placing task

The aim of the placing task is to locate Flickr videos on a world map using the visual content and the metadata available on the page (tags, user details, etc.). Our complete system achieved the best results among all participants using no external data, although my contribution was mostly to the visual content sub-component (which ranked second). A brief overview of our complete system for this task is given in the workshop paper.

Trecvid 2012 Semantic Indexing task

The aim of the semantic indexing task is to return, for each query concept, a list of the videos from the test set most likely to contain it. We participated in the IRIM group submission by computing descriptors based on dense SIFT, aggregated into Bag-of-Features and VLAD representations.
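As a rough illustration of the two aggregation schemes mentioned above, here is a minimal pure-Python sketch. It is a generic textbook formulation, not the code used for the submission: the function names are hypothetical, the centroids are assumed to come from a k-means codebook learned beforehand, and the toy 2-D vectors stand in for real 128-D SIFT descriptors.

```python
from math import sqrt


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(descriptor, centroids):
    """Index of the centroid (visual word) closest to the descriptor."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(descriptor, centroids[i]))


def bag_of_features(descriptors, centroids):
    """Histogram counting how many descriptors fall on each visual word."""
    counts = [0] * len(centroids)
    for d in descriptors:
        counts[nearest_centroid(d, centroids)] += 1
    return counts


def vlad(descriptors, centroids):
    """Concatenated per-centroid sums of residuals, L2-normalised."""
    dim = len(centroids[0])
    residuals = [[0.0] * dim for _ in centroids]
    for d in descriptors:
        k = nearest_centroid(d, centroids)
        # Accumulate the difference between the descriptor and its centroid.
        for j in range(dim):
            residuals[k][j] += d[j] - centroids[k][j]
    flat = [v for row in residuals for v in row]
    norm = sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Toy example: three 2-D "descriptors" and a two-word codebook (assumed values).
toy_centroids = [[0.0, 0.0], [10.0, 10.0]]
toy_descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]
print(bag_of_features(toy_descriptors, toy_centroids))
print(vlad(toy_descriptors, toy_centroids))
```

The Bag-of-Features vector has one bin per visual word, whereas the VLAD vector has one block of descriptor-sized residuals per visual word, so it keeps more information for the same codebook size.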

Trecvid 2011 Content-based Copy Detection Task

The aim of the content-based copy detection task is to identify, in the query videos, “attacked” video segments extracted from a large database. Our combined audio-video system was ranked third. However, the audio subsystem, on which my work was focused, was ranked first among the audio-only submissions.

Publications

Revisiting the VLAD image representation

Jonathan Delhumeau, Philippe-Henri Gosselin, Hervé Jégou and Patrick Pérez,
Proc. ACM Multimedia'13, October 2013, Barcelona, Spain.

Efficient Supervised Dimensionality Reduction for Image Categorization

Rachid Benmokhtar, Jonathan Delhumeau and Philippe-Henri Gosselin,
Proc. ICASSP'13, May 2013, Vancouver, Canada.

Retrieving geo-location of videos with a divide & conquer hierarchical multimodal approach

Michele Trevisiol, Hervé Jégou, Jonathan Delhumeau and Guillaume Gravier,
Proc. ICMR'13, April 2013, Dallas, USA.

How INRIA/IRISA identifies Geographic Location of a Video

Michele Trevisiol, Jonathan Delhumeau, Hervé Jégou, Guillaume Gravier,
Working Notes Proceedings of the MediaEval 2012 Workshop.

BABAZ: a large scale audio search system for video copy detection

Hervé Jégou, Jonathan Delhumeau, Jiangbo Yuan, Guillaume Gravier and Patrick Gros,
Proc. ICASSP'12, March 2012, Japan.

INRIA@TRECVID'2011: Copy Detection & Multimedia Event Detection

Ayari M., Delhumeau J., Douze M., Jégou H., Potapov D., Revaud J., Schmid C., Yuan J.,
TRECVID (2011).

Contact

Address:
INRIA, Campus Universitaire de Beaulieu
35042 Rennes Cedex
FRANCE