ICRA 2016 Tutorial on

Vision for robotics

Stockholm, Sweden

Friday, May 20, 2016, 8h - 12h30, Room A1


François Chaumette, Inria
Peter Corke, QUT
Jana Kosecka, GMU
Eric Marchand, Université de Rennes 1

Tutorial Content

As for humans and most animals, vision is a crucial sense for a robot interacting with its environment. Vision for robotics has given rise to an enormous amount of research and successful applications since the creation of the fields of robotics and computer vision several decades ago.
The aim of this tutorial is to provide a comprehensive state of the art covering the basic concepts, methodologies and applications. It will be devoted to the modeling of visual sensors and the underlying geometry, object detection and recognition, visual tracking and 3D localization, and visual servoing, which closes the loop from perception to action.
Note that visual SLAM, an important component of vision for robotics, typically for exploration and navigation, will not be addressed in this tutorial but in the afternoon tutorial devoted to SLAM. Interested attendees are invited to follow both tutorials to get a global overview of robot vision.

The tutorial consists of four lectures:

08:00 - 08:05


08:05 - 09:05

Lecture 1 : Visual sensors and geometry

Peter Corke, QUT

09:05 - 10:05

Lecture 2 : Object detection and recognition

Jana Kosecka, GMU

10:05 - 10:20

Lecture 3 : Visual tracking

Eric Marchand, Université de Rennes 1


10:40 - 11:25

Lecture 3 : Visual tracking (cont.)

Eric Marchand, Université de Rennes 1

11:25 - 12:25

Lecture 4 : Visual servoing

François Chaumette, Inria

12:25 - 12:30


Lecture 1 : Visual sensors and geometry

For any vision-based robotic system, the first step is to acquire an image, and this talk will cover some important aspects of this process. We start by considering the nature of light, color and intensity, how they are transduced by a camera, and touch briefly on color spaces and color-based segmentation.
We then look more closely at camera characteristics, touching on exposure, motion blur, saturation and the constraints of rolling-shutter cameras.
Finally, we look at the geometry of image formation with the pin-hole camera model, the need for lenses, image distortion and, if time permits, panoramic or very wide-angle cameras.
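The pin-hole model mentioned above can be summarized in a few lines of code. The following is a minimal sketch (not part of the tutorial materials); the focal length and principal point are illustrative values for a hypothetical camera, and lens distortion is ignored:

```python
import numpy as np

# Intrinsic matrix for a hypothetical camera: focal length f in pixels,
# principal point (cx, cy). All values here are purely illustrative.
f, cx, cy = 800.0, 320.0, 240.0
K = np.array([[f, 0.0, cx],
              [0.0, f, cy],
              [0.0, 0.0, 1.0]])

def project(P):
    """Project a 3D point P = (X, Y, Z), expressed in the camera frame,
    to pixel coordinates using the pin-hole model."""
    p = K @ P                # homogeneous image coordinates
    return p[:2] / p[2]      # perspective division

u, v = project(np.array([0.1, -0.05, 2.0]))   # → (360.0, 220.0)
```

Real cameras deviate from this model through radial and tangential lens distortion, which calibration procedures estimate alongside the intrinsic parameters.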

Lecture 2 : Object detection and recognition

The capability to detect and recognize objects in cluttered, dynamically changing environments is a key component of robot perceptual systems. This capability is used for high-level service robotics tasks (e.g. fetch and delivery of objects, object manipulation, and object search). We will survey the basic formulations, including (1) local descriptor approaches capturing appearance or geometry statistics and shape; (2) sliding-window techniques, descriptors, and associated efficient search strategies; (3) object proposal methods that start with bottom-up segmentation followed by evaluation of classifiers. We will discuss the design choices made in each of these formulations, with a focus on efficiency and the ability to handle clutter and occlusion.
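The sliding-window formulation above can be sketched in a few lines. This is an illustrative skeleton only (the window size, stride and threshold are arbitrary, and the classifier is a stand-in for a learned model such as an SVM over HOG descriptors):

```python
import numpy as np

def sliding_window_detect(image, classifier, win=(64, 64), stride=16, thresh=0.5):
    """Exhaustively score windows over the image; return a list of
    (row, col, score) for windows whose score exceeds the threshold.
    `classifier` is any function mapping an image patch to a confidence."""
    H, W = image.shape[:2]
    wh, ww = win
    detections = []
    for r in range(0, H - wh + 1, stride):
        for c in range(0, W - ww + 1, stride):
            score = classifier(image[r:r + wh, c:c + ww])
            if score > thresh:
                detections.append((r, c, score))
    return detections

# Toy usage: a stand-in "classifier" that fires on bright patches.
img = np.zeros((128, 128))
img[32:96, 32:96] = 1.0
dets = sliding_window_detect(img, lambda patch: patch.mean())
```

In practice this exhaustive scan is run over an image pyramid to handle scale, and the efficient search strategies mentioned above (e.g. cascades) prune most windows early.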

Lecture 3 : Visual Tracking

Visual tracking is a key issue in the development of vision-based robotic tasks. Once detected and recognized, objects have to be tracked and localized in the image stream.
Beginning with the tracking of elementary geometric features (points, lines, ...), we will consider the case where the model of the tracked object is fully known (model-based tracking), along with the case where only image intensity and basic geometric constraints are available (template tracking or KLT-like methods).
Since tracking is a spatio-temporal process, prediction and filtering (e.g., Kalman or particle filters) are useful tools for improving the results and robustness of visual tracking.
The results of the tracking algorithms may then be fed into a visual servoing control scheme.
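To give the flavor of template tracking, here is a minimal translation-only tracker. Note that this sketch uses a brute-force SSD search rather than the gradient-based update of the actual KLT method; all sizes and the search radius are illustrative:

```python
import numpy as np

def track_translation(template, frame, prev_rc, search=8):
    """Translation-only template tracker: search a small window around the
    previous position (row, col) for the location minimizing the sum of
    squared differences (SSD) with the template."""
    th, tw = template.shape
    r0, c0 = prev_rc
    best, best_rc = np.inf, prev_rc
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            r, c = r0 + dr, c0 + dc
            if r < 0 or c < 0 or r + th > frame.shape[0] or c + tw > frame.shape[1]:
                continue
            ssd = np.sum((frame[r:r + th, c:c + tw] - template) ** 2)
            if ssd < best:
                best, best_rc = ssd, (r, c)
    return best_rc

# Toy usage: a bright square moves 3 pixels to the right between frames.
f0 = np.zeros((64, 64)); f0[20:30, 20:30] = 1.0
f1 = np.zeros((64, 64)); f1[20:30, 23:33] = 1.0
template = f0[20:30, 20:30]
pos = track_translation(template, f1, (20, 20))   # → (20, 23)
```

KLT-like methods replace this exhaustive search with an iterative Gauss-Newton minimization of the same SSD criterion, which extends naturally to richer warps (affine, homography).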

Lecture 4 : Visual Servoing

Visual servoing consists of controlling the motion of a dynamic system in closed loop with respect to visual data. The talk will describe the different modeling steps necessary to design a vision-based control scheme, together with a range of applications showing the large class of robotic tasks that can be accomplished with this approach.
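The core of a classical image-based visual servoing scheme is the control law v = -λ L⁺ (s - s*), where s is the vector of measured image features, s* the desired features, and L the interaction matrix. A minimal sketch for normalized point features (assuming known depths and an eye-in-hand camera; the gain and feature values are illustrative):

```python
import numpy as np

def interaction_matrix(x, y, Z):
    """Interaction matrix (image Jacobian) of a normalized image point
    (x, y) at depth Z, relating its image velocity to the camera twist
    (vx, vy, vz, wx, wy, wz)."""
    return np.array([
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x**2), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y**2, -x * y, -x]])

def ibvs_velocity(points, desired, depths, lam=0.5):
    """Classical IBVS control law: v = -lambda * L^+ (s - s*)."""
    L = np.vstack([interaction_matrix(x, y, Z)
                   for (x, y), Z in zip(points, depths)])
    e = (np.asarray(points) - np.asarray(desired)).ravel()
    return -lam * np.linalg.pinv(L) @ e

# Toy usage with four hypothetical point features, all at 1 m depth:
# the desired features are slightly farther from the image center,
# so the camera should translate toward the target.
s  = [(0.1, 0.1), (-0.1, 0.1), (-0.1, -0.1), (0.1, -0.1)]
sd = [(0.12, 0.12), (-0.12, 0.12), (-0.12, -0.12), (0.12, -0.12)]
v = ibvs_velocity(s, sd, [1.0] * 4)   # 6-dof camera velocity command
```

Applying v for a short time step drives the feature error toward zero exponentially; the main modeling choices (which features s, how to estimate Z, how to approximate L) are exactly the design steps the lecture covers.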


Reference documents
  • P. Corke. Robotics, Vision and Control: Fundamental Algorithms in Matlab, Springer, 2011.

  • Y. Ma, S. Soatto, J. Kosecka, S. Sastry. An invitation to 3-d vision: from images to geometric models, Springer, 2012.

  • R. Szeliski. Computer Vision: Algorithms and Applications, Springer, 2011.

  • J. Ponce, M. Hebert, C. Schmid, A. Zisserman (Eds). Toward Category-Level Object Recognition, Springer, 2007.

  • E. Marchand, H. Uchiyama, F. Spindler. Pose estimation for augmented reality: a hands-on survey. IEEE Trans. on Visualization and Computer Graphics, 2016.

  • F. Chaumette, S. Hutchinson. Visual servoing and visual tracking. In Handbook of Robotics, B. Siciliano, O. Khatib (eds.), Chap. 24, pp. 563-583, Springer, 2008.