Model-free augmented reality by virtual visual servoing


Contact: Eric Marchand, Muriel Pressigout

Creation date: April 2004


Note about this page

This page was created in 2004.

Overview of the approach

This demo addresses the problem of real-time camera motion estimation from monocular image sequences using virtual visual servoing techniques. We have applied this intuitive method to achieve realistic, real-time augmentation of video sequences with very little 3D knowledge.

We exploit the underlying geometrical constraints existing between two successive images to estimate the camera motion. This allows us to formulate the non-linear minimization needed in the process as a virtual visual servoing problem. To achieve better performance, we consider a robust estimation of the camera motion. This work can be used for augmented reality applications by combining the motion estimated along the video sequence with the pose computed in the first image.
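For illustration, the chaining described above, composing the pose computed in the first image with the inter-frame motions estimated along the sequence, can be sketched as follows. This is a minimal Python sketch with hypothetical names, not the authors' code:

    import numpy as np

    def chain_poses(first_pose, interframe_motions):
        """first_pose: 4x4 homogeneous matrix, pose of the world frame in
        the first camera frame. interframe_motions: list of 4x4 matrices,
        each mapping camera frame t-1 to camera frame t.
        Returns the camera pose at every frame, as used to render the
        virtual object."""
        poses = [first_pose]
        for motion in interframe_motions:
            # pose at frame t = (motion from t-1 to t) composed with pose at t-1
            poses.append(motion @ poses[-1])
        return poses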

Description of the approach

Motion computation is formulated as a full-scale non-linear optimization within the virtual visual servoing (VVS) framework [Marchand02c]. In this way, the motion estimation problem is cast as a 2D virtual visual servoing problem. 2D virtual visual servoing, or image-based camera control, allows a virtual eye-in-hand camera to be controlled with respect to its environment. More precisely, it consists in specifying a task as the regulation in the image of a set of visual features. A closed-loop control law that minimizes the error between the current and desired positions of these visual features can then be built; it automatically determines the motion the camera has to realize.
For our problem, the geometrical constraints existing between two successive images give us the visual features to be regulated by the virtual camera in order to estimate the real camera motion accurately.
In this demonstration, we apply this framework and build an image-feature-based system capable of estimating camera motion in real time without the need for a 3D model. To apply this result to augmented reality, the camera pose in the first image must also be computed, which requires very little 3D knowledge.
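A minimal sketch of one iteration of such a control law is given below, in Python. It assumes 2D point features with a known depth for the classical interaction matrix, only to keep the sketch short; the actual demo regulates features built from the two-view geometric constraints and needs no 3D model. Names and parameter values are our own, not the original implementation's:

    import numpy as np
    from scipy.linalg import expm

    def interaction_matrix(x, y, Z):
        """Interaction matrix of one normalized image point (x, y) at depth Z."""
        return np.array([
            [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
            [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
        ])

    def vvs_step(cMo, s, s_star, L, lam=0.5):
        """One iteration: move the virtual camera so that the projected
        features s converge towards the observed features s_star."""
        e = s - s_star                      # feature error to regulate to zero
        v = -lam * np.linalg.pinv(L) @ e    # camera velocity screw (vx..wz)
        # Integrate the velocity over a unit time step with the SE(3)
        # exponential map (the sign convention depends on whether v is
        # expressed in the current or in the new camera frame).
        twist = np.zeros((4, 4))
        twist[:3, :3] = np.array([[0.0, -v[5], v[4]],
                                  [v[5], 0.0, -v[3]],
                                  [-v[4], v[3], 0.0]])
        twist[:3, 3] = v[:3]
        return expm(twist) @ cMo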

Augmented reality results on various video sequences

The complete implementation of the image augmentation, including tracking and motion estimation, was carried out on video sequences acquired by different CCD cameras whose calibration was sometimes inaccurate. In such experiments, image processing may be very complex: extracting and tracking reliable points in a real environment is a non-trivial issue. For the realistic sequences, we used the Tomasi-Kanade tracker. Tracking, motion estimation and image augmentation are performed at near frame rate.
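The Tomasi-Kanade tracker has a widely available counterpart in OpenCV's pyramidal KLT tracker. The sketch below (parameter values are our own choices, not those of the demo) shows how such point tracking can be set up:

    import cv2

    def track_points(prev_gray, next_gray, prev_pts):
        """Track prev_pts (Nx1x2 float32) from prev_gray into next_gray,
        keeping only the points that were tracked successfully."""
        next_pts, status, _err = cv2.calcOpticalFlowPyrLK(
            prev_gray, next_gray, prev_pts, None,
            winSize=(21, 21), maxLevel=3)
        ok = status.ravel() == 1
        return prev_pts[ok], next_pts[ok]

    # Typical use: detect corners once, then track them along the sequence.
    # prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=300,
    #                                    qualityLevel=0.01, minDistance=10)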

To validate the approach, we use different cameras in different environments, applying our method to realistic indoor sequences as well as realistic outdoor sequences. The challenge in these sequences is to estimate the motion accurately enough for augmentation, since the data are very noisy: point extraction noise, matching errors, wind in the trees, illumination changes, occlusions, and so on. Yet, in all the video sequences, the estimated motion is accurate enough to produce convincing augmented videos: the position of the augmented object remains stable at the same relative location along each sequence. For each figure, in addition to the first, intermediate and last images shown, the complete augmented video sequence is available.

The first experiment (Figure 1) is a video sequence acquired outdoors. To estimate the motion, we track points on the wall. The camera motion is complex, but we assume that the tracked points lie on a plane.
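Under this planarity assumption, the geometric constraint linking two images is a homography. As an illustration only (the demo itself estimates the motion through virtual visual servoing rather than with this library call), a robust homography fit over the tracked points might look like:

    import cv2

    def planar_constraint(pts_prev, pts_next):
        """Estimate the homography mapping pts_prev to pts_next (Nx2 float
        arrays); RANSAC discards mismatched point tracks."""
        H, mask = cv2.findHomography(pts_prev, pts_next,
                                     method=cv2.RANSAC,
                                     ransacReprojThreshold=2.0)
        return H, mask.ravel().astype(bool)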

See the entire video sequence (1.08 MB).

Figure 1: case of a planar structure (camera 1). (a) The first image, (b) and (c) two intermediate images, and (d) the last image.

The second and third experiments (Figures 2 and 3) also deal with video sequences acquired outdoors, this time with the camera undergoing a pure rotational motion. In this case, the tracked points can be located anywhere in the images.
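A camera undergoing a pure rotation R induces the homography H = K R K^-1 between its images, whatever the depth of the observed points; this is what frees the tracked points from the planarity assumption of the first experiment. A small illustrative sketch, assuming the intrinsic matrix K is known:

    import numpy as np

    def rotation_from_homography(H, K):
        """Recover R from H ~ K R K^-1 by projecting onto the nearest
        rotation matrix with an SVD (H is only defined up to scale)."""
        R = np.linalg.inv(K) @ H @ K
        U, _s, Vt = np.linalg.svd(R)
        R = U @ Vt
        if np.linalg.det(R) < 0.0:   # enforce a proper rotation (det = +1)
            R = U @ np.diag([1.0, 1.0, -1.0]) @ Vt
        return R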

See the entire video sequence (4.10 MB).

Figure 2: case of pure rotation, outdoors. (a) The first image, (b), (c) and (d) three intermediate images, and (e) the last image.


See the entire video sequence (4.39 MB).

Figure 3: case of pure rotation, outdoors. (a) The first image, (b), (c) and (d) three intermediate images, and (e) the last image.


The fourth experiment is a video sequence acquired indoors, again with the camera undergoing a pure rotational motion (Figure 4). The tracked points can be located anywhere in the images. Here, the challenge is to estimate the motion correctly while the camera and a person in the scene are moving simultaneously: the robust estimation allows the points tracked on the moving person to be rejected as outliers, as sketched below.
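Such a robust estimation is commonly realized by iteratively reweighted least squares with an M-estimator. The sketch below uses Tukey's biweight function to down-weight residuals that do not follow the dominant camera motion; it is an illustration under our own assumptions, not the authors' exact estimator:

    import numpy as np

    def tukey_weights(residuals, c=4.6851):
        """Weights in [0, 1] for each residual (numpy array); points whose
        residual exceeds c times the robust scale get zero weight."""
        med = np.median(residuals)
        scale = 1.4826 * np.median(np.abs(residuals - med))  # MAD scale
        u = residuals / (c * max(scale, 1e-12))
        w = (1.0 - u ** 2) ** 2
        w[np.abs(u) >= 1.0] = 0.0    # outliers are rejected outright
        return w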

See the entire video sequence (2.44 MB).

Figure 4: case of pure rotation, indoors. (a) The first image, (b)-(f) intermediate images, and (g) the last image.

Publications

M. Pressigout, E. Marchand. Model-free augmented reality by virtual visual servoing. In IAPR Int. Conf. on Pattern Recognition, ICPR'04, Cambridge, United Kingdom, August 2004.

Related Papers

A. Comport, E. Marchand, F. Chaumette. A real-time tracker for markerless augmented reality. In ACM/IEEE Int. Symp. on Mixed and Augmented Reality, ISMAR'03, pp. 36-45, Tokyo, Japan, October 2003.

E. Marchand, F. Chaumette. Virtual visual servoing: a framework for real-time augmented reality. In EUROGRAPHICS 2002 Conference Proceedings, G. Drettakis, H.-P. Seidel (eds.), Computer Graphics Forum, Volume 21(3), Saarbrücken, Germany, September 2002.

V. Sundareswaran, R. Behringer. Visual servoing-based augmented reality. In IEEE Int. Workshop on Augmented Reality, San Francisco, November 1998.
