Robust model-based tracking for visual servoing applications

Contact: Eric Marchand, François Chaumette, Andrew Comport

Creation Date: December 2003

Overview of the approach

This demo addresses the problem of real-time model-based tracking of 3D objects in monocular image sequences. This fundamental vision problem has applications in many domains ranging from Augmented Reality to Visual Servoing and even Medical Imaging or Industrial applications.

A markerless model-based algorithm is used to track 3D objects in monocular image sequences. The main advantage of a model-based method is that knowledge about the scene (the implicit 3D information) improves robustness and performance: it makes it possible to predict hidden movement of the object and reduces the effects of outlier data introduced in the tracking process.
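As an illustration of how outlier data can be down-weighted, here is a minimal sketch of robust weighting with a Tukey biweight function and a median-based scale estimate. The choice of estimator, the tuning constant `c` and the function name `tukey_weights` are assumptions made for this example; the page itself does not specify how outliers are handled.

```python
import numpy as np

def tukey_weights(residuals, c=4.6851):
    """Return weights in [0, 1]; residuals far from the median get weight 0.

    Assumption: Tukey biweight with a MAD-based scale, chosen for illustration.
    """
    r = np.asarray(residuals, dtype=float)
    med = np.median(r)
    # Median absolute deviation, scaled to estimate a Gaussian standard deviation.
    sigma = max(1.4826 * np.median(np.abs(r - med)), 1e-12)
    u = (r - med) / (c * sigma)
    w = (1.0 - u**2) ** 2
    w[np.abs(u) >= 1.0] = 0.0  # reject gross outliers entirely
    return w

# Five small residuals and one gross outlier (hypothetical values).
weights = tukey_weights([0.1, -0.2, 0.05, 0.0, -0.1, 8.0])
```

In this toy case the last residual receives a zero weight, so it would no longer influence a weighted least-squares pose update.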

Description of the approach

In this work, pose computation is formulated as a full-scale non-linear optimization: Virtual Visual Servoing (VVS) [Marchand02c]. The pose computation problem is thus treated as similar to 2D visual servoing. 2D visual servoing, or image-based camera control, allows control of an eye-in-hand camera with respect to its environment. More precisely, it consists in specifying a task (mainly positioning or target tracking) as the regulation in the image of a set of visual features. A set of constraints is defined in the image space. A closed-loop control law that minimizes the error between the current and desired positions of these visual features can then be built; it automatically determines the motion the camera has to perform. This work takes this framework and builds an image-feature-based system capable of handling complex scenes in real time without the need for markers.
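The control-law view of pose computation can be sketched for the simplest case of point features. In this sketch a virtual camera is moved with velocity v = -gain * pinv(L) * (s - s*), where s are the model points projected with the current pose estimate and s* are the points measured in the image. The point features, the classical point interaction matrix L, the gain value and all function names are assumptions made for illustration; the tracker described here relies on distance features instead.

```python
import numpy as np

def skew(w):
    """Cross-product matrix of a 3-vector."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_se3(v):
    """Rigid transform reached by applying the twist v = (t, w) for unit time."""
    t, w = v[:3], v[3:]
    theta = np.linalg.norm(w)
    K = skew(w)
    if theta < 1e-9:  # first-order expansion near zero rotation
        R, V = np.eye(3) + K, np.eye(3) + 0.5 * K
    else:
        a = np.sin(theta) / theta
        b = (1.0 - np.cos(theta)) / theta**2
        c = (theta - np.sin(theta)) / theta**3
        R = np.eye(3) + a * K + b * (K @ K)
        V = np.eye(3) + b * K + c * (K @ K)
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, V @ t
    return T

def project(cMo, Xo):
    """Normalized image coordinates (x, y) and depth Z of object points."""
    Xc = Xo @ cMo[:3, :3].T + cMo[:3, 3]
    return Xc[:, 0] / Xc[:, 2], Xc[:, 1] / Xc[:, 2], Xc[:, 2]

def interaction_matrix(x, y, Z):
    """Stack the 2x6 point interaction matrices (rows ordered x0, y0, x1, ...)."""
    zero = np.zeros_like(x)
    L = np.zeros((x.size, 2, 6))
    L[:, 0] = np.column_stack([-1 / Z, zero, x / Z, x * y, -(1 + x**2), y])
    L[:, 1] = np.column_stack([zero, -1 / Z, y / Z, 1 + y**2, -x * y, -x])
    return L.reshape(-1, 6)

def vvs_pose(Xo, s_star, cMo, gain=0.5, iterations=200):
    """Move a virtual camera until the projected model matches s_star."""
    cMo = cMo.copy()
    for _ in range(iterations):
        x, y, Z = project(cMo, Xo)
        error = (np.column_stack([x, y]) - s_star).ravel()
        v = -gain * np.linalg.pinv(interaction_matrix(x, y, Z)) @ error
        # The camera moves by exp(v), so the object pose is updated by its inverse.
        cMo = np.linalg.inv(exp_se3(v)) @ cMo
    return cMo

# Synthetic check: six hypothetical model points and a known camera pose.
Xo = np.array([[-0.1, -0.1, 0.0], [0.1, -0.1, 0.0], [0.1, 0.1, 0.0],
               [-0.1, 0.1, 0.0], [0.0, 0.0, 0.1], [0.05, -0.05, -0.05]])
cMo_true = exp_se3(np.array([0.0, 0.0, 0.0, 0.2, -0.1, 0.15]))
cMo_true[:3, 3] = [0.05, -0.02, 0.6]
x_obs, y_obs, _ = project(cMo_true, Xo)
s_star = np.column_stack([x_obs, y_obs])

cMo_init = np.eye(4)
cMo_init[:3, 3] = [0.0, 0.0, 0.5]
cMo_est = vvs_pose(Xo, s_star, cMo_init)
```

With noise-free synthetic measurements the estimate should converge to the pose used to generate them; with real data the error would be weighted to limit the influence of outliers.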

Tracking results in visual servoing experiments

Any visual servoing control law (image-based, position-based or hybrid) can use the output of our tracker. In the presented experiments we have considered a now well-known approach, 2D 1/2 visual servoing, already described in [Malis99a]. It consists in combining visual features obtained directly from the image with features expressed in Euclidean space, that is, image features and 3D data. The 3D information can be retrieved either by a pose estimation algorithm or by a projective reconstruction obtained from the current and desired images. In our context, since the pose is an output of our tracker, we consider the former solution.
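To make the combination of image features and 3D data concrete, the sketch below builds a hybrid error vector from the current pose (as a tracker would output it) and the desired pose. It assumes the common 2D 1/2 formulation in which the error stacks the image coordinates of a reference point, the logarithm of a depth ratio and the axis-angle rotation between the two camera frames. Taking the object origin as reference point, and the names `theta_u` and `hybrid_error`, are choices made for this example only.

```python
import numpy as np

def theta_u(R):
    """Axis-angle vector of a rotation matrix (angles close to pi not handled)."""
    cos_t = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_t)
    if theta < 1e-9:
        return np.zeros(3)
    axis = np.array([R[2, 1] - R[1, 2],
                     R[0, 2] - R[2, 0],
                     R[1, 0] - R[0, 1]]) / (2.0 * np.sin(theta))
    return theta * axis

def hybrid_error(cMo, cdMo):
    """Error (x - x*, y - y*, log(Z / Z*), theta*u) between two camera poses.

    cMo:  4x4 pose of the object in the current camera frame.
    cdMo: 4x4 pose of the object in the desired camera frame.
    Assumption: the object origin is used as the reference point.
    """
    X, Y, Z = cMo[:3, 3]
    Xd, Yd, Zd = cdMo[:3, 3]
    e_image = np.array([X / Z - Xd / Zd, Y / Z - Yd / Zd])  # 2D part
    e_depth = np.log(Z / Zd)                                # depth ratio
    R_rel = cdMo[:3, :3] @ cMo[:3, :3].T  # maps current-camera axes to desired-camera axes
    return np.concatenate([e_image, [e_depth], theta_u(R_rel)])

# Hypothetical poses: the object is twice as far as desired and shifted sideways.
cdMo = np.eye(4)
cdMo[:3, 3] = [0.0, 0.0, 0.5]
cMo = np.eye(4)
cMo[:3, 3] = [0.1, 0.0, 1.0]
e = hybrid_error(cMo, cdMo)
```

The error vanishes when the current and desired poses coincide, which is the condition a positioning task regulates towards.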

The complete implementation of the robust visual servoing task, including tracking and control, was carried out on an experimental test-bed involving a CCD camera mounted on the end effector of a six-d.o.f. robot. Images were acquired and processed at video rate (50 Hz).

In such experiments, the image processing is potentially very complex. Indeed, extracting and tracking reliable points in a real environment is a nontrivial issue. The use of more complex features, such as the distance to the projection of 3D circles, lines, and cylinders, has been demonstrated in [Comport03c] in an augmented reality context. In all experiments, the distances are computed using the Moving Edges algorithm previously described. Tracking always completes within the frame period (usually in less than 10 ms).
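As a simple illustration of a distance feature, the sketch below computes the signed distance between edge points found in the image and the projection of a model line. The polar parameterization of the line (x cos(theta) + y sin(theta) = rho), the sample values and the function name are assumptions made for this example; circles and cylinders would need their own distance expressions.

```python
import numpy as np

def point_to_line_distance(points, rho, theta):
    """Signed distance of 2D points to the line x*cos(theta) + y*sin(theta) = rho."""
    pts = np.asarray(points, dtype=float)
    return pts[:, 0] * np.cos(theta) + pts[:, 1] * np.sin(theta) - rho

# Hypothetical projected line x = 1 and three edge points found near it.
distances = point_to_line_distance([[1.0, 5.0], [3.0, 0.0], [0.0, 0.0]],
                                   rho=1.0, theta=0.0)
```

A tracker of this kind would drive such distances towards zero for all sampled edge points simultaneously.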


Figure 1: Tracking in a complex environment within a classical visual servoing experiment. Images are acquired and processed at video rate (25 Hz). Blue: desired position defined by the user. Green: position measured after pose calculation. (a) first image, initialized by hand; (b) partial occlusion with a hand; (c) lighting variation; (d) final image with various occlusions.

In all the figures, the current position of the tracked object appears in green while its desired position appears in blue. Three objects were considered: a micro-controller (Figure 1), an industrial emergency switch (Figure 2) and a video multiplexer (Figure 3).

To validate the robustness of the algorithm, the objects were placed in a highly textured environment, as shown in Figures 1, 2 and 3. Tracking and positioning tasks were correctly achieved. Multiple temporary and partial occlusions by a hand and various work-tools, as well as modifications of the lighting conditions, were imposed during the positioning task. In the third experiment (Figure 3), after a complex positioning task (note that some object faces appeared while others disappeared), the object is picked up by hand and moved around. Since the visual servoing task has not been stopped, the robot keeps moving in order to maintain the rigid link between the camera and the object.

For the second experiment, plots are also shown to help analyse the pose parameters, the camera velocity and the error vector. In this experiment the robot velocity reached 23 cm/s in translation and 85 deg/s in rotation. Fewer than 35 frames were acquired during the entire positioning task up until convergence (see Figure 2e); at the 50 Hz video rate of the test-bed, the task was therefore accomplished in less than 1 second! In all these experiments, neither a Kalman filter (or other prediction process) nor the camera displacement was used to help the tracking.


Figure 2: 2D 1/2 visual servoing experiment: in these five snapshots the tracked object appears in green and its desired position in the image in blue. Plots correspond to (a) pose (translation), (b) pose (rotation), (c-d) camera velocity in rotation and translation, (e) error vector s-s*.

Figure 3: 2D 1/2 visual servoing experiment: in these snapshots the tracked object appears in green and its desired position in the image in blue. The first six images were acquired during the initial visual servoing step. In the remaining images the object is moved and the robot follows it.


[Comport03c] A. Comport, E. Marchand, F. Chaumette. A real-time tracker for markerless augmented reality. In ACM/IEEE Int. Symp. on Mixed and Augmented Reality, ISMAR'03, pages 36-45, Tokyo, Japan, October 2003.

[Marchand02c] E. Marchand, F. Chaumette. Virtual Visual Servoing: a framework for real-time augmented reality. In EUROGRAPHICS 2002 Conference Proceedings, G. Drettakis, H.-P. Seidel (eds.), Computer Graphics Forum, Volume 21(3), Saarbrücken, Germany, September 2002.

Other Related Papers

V. Sundareswaran, R. Behringer. Visual servoing-based augmented reality. In IEEE Int. Workshop on Augmented Reality, San Francisco, November 1998.

B. Espiau, F. Chaumette, P. Rives. A new approach to visual servoing in robotics. IEEE Trans. on Robotics and Automation, 8(3):313-326, June 1992.

S. Hutchinson, G. Hager, P. Corke. A tutorial on visual servo control. IEEE Trans. on Robotics and Automation, 12(5):651-670, October 1996.

Irisa - Inria - Copyright 2009 © Lagadic Project