
Exploring and Learning from Visual Data

This talk is about the journey of computer vision and machine learning research over a period of more than 10 years, before and after the establishment of deep learning as the dominant paradigm in representing and understanding visual data. It is also the journey of the author's own research and contributions: designing representations and matching processes to explore visual data, and exploring visual data to learn better representations.

Part 1 addresses instance-level visual search and clustering, building on shallow visual representations and matching processes. The representation is obtained by a pipeline of local features, hand-crafted descriptors and visual vocabularies. Improvements in the pipeline are introduced, including the construction of large scale vocabularies, spatial matching for geometry verification, representations beyond vocabularies and nearest neighbor search. Applications to exploring photo collections are discussed, including location and landmark recognition.

Part 2 addresses instance-level visual search and object discovery, building on deep visual representations and matching processes, focusing on the manifold structure of the feature space. The representation is obtained by deep parametric models learned from visual data. Contributions are made to advancing manifold search over global or regional representations obtained by convolutional neural networks (CNN). Spatial matching is revisited with local features detected on CNN activations. Object discovery from CNN activations over an unlabeled image collection is introduced.

Part 3 addresses learning deep visual representations by exploring visual data, focusing on limited or no supervision. It progresses from instance-level to category-level tasks and studies the sensitivity of models to their input. It introduces methods using limited supervision, including unsupervised metric learning, semi-supervised learning and few-shot learning. The latter studies activation maps for the first time. An attack is introduced as an attempt to improve upon the visual quality of adversarial examples in terms of imperceptibility.

Yannis Avrithis - Linkmedia
Friday, 3 July 2020 - 10:00
Room: Salle Métivier
Defense type:
Jury members:

1. Patrick PÉREZ, Scientific Director, valeo.ai
2. Horst BISCHOF, Professor, TU Graz
3. Gabriela CSURKA KHEDARI, Senior Scientist, Naver Labs
4. Rémi GRIBONVAL, Research Director, Inria
5. Jiri MATAS, Professor, CTU Prague
6. Nikos PARAGIOS, Professor, CentraleSupélec
7. Cordelia SCHMID, Research Director, Inria
8. Eric MARCHAND, Professor, University of Rennes 1