3D RECONSTRUCTION OF URBAN AREAS - DATA REGISTRATION

Registration of ground videos with rough GIS-based 3D city models

G. Sourimant

contact: G. Sourimant, L. Morin, T. Colleu

Context and Goal

3D reconstruction of urban environments has been widely studied for several years, since it can lead to many useful applications: virtual navigation, augmented reality, architectural planning, etc. One of the most difficult problems in this context today is the acquisition and processing of very large-scale data when a precise reconstruction is the aim.
We present here results of our system for computing geo-referenced positions and orientations of images of buildings from non-calibrated videos. Providing such information is a mandatory step towards well-conditioned, large-scale and precise 3D reconstruction of urban areas. Our method is based on the registration of multimodal datasets, namely GPS measurements, video sequences and rough 3D models of buildings extracted from a GIS database.

Approach

In order to refine the 3D building models extracted from a GIS database using image sequences, one has to be able to determine which parts of the models correspond to which parts of the images in the video. One solution to this problem consists in registering the projection of the models with the corresponding images of the buildings. But to compute such a projection, one has to estimate both the intrinsic parameters (focal length, principal point, etc.) and the extrinsic parameters (position and orientation) of the camera, for each image of the video.
We suppose here that the camera is not calibrated, and restrict ourselves to the intrinsic parameters provided by the camera manufacturer. We thus have to estimate the camera extrinsic parameters (the camera pose), expressed in the same coordinate system as the GIS models, namely the UTM coordinate system.
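To make the roles of the two parameter sets concrete, here is a minimal sketch of a pinhole projection of one model point into an image. It assumes a simple pinhole model without distortion, a camera looking along its +Z axis, and a focal length expressed in pixels; the function name `project_point` and all numeric values are hypothetical and are not taken from the system described here.

```python
def project_point(point_world, rotation, camera_center, focal_px, principal_point):
    """Project a 3D world point (e.g. a model vertex) to pixel coordinates.

    rotation is a 3x3 world-to-camera rotation given as a list of rows and
    camera_center is the camera position in world coordinates: together they
    form the extrinsic parameters (the pose). focal_px and principal_point
    are the intrinsic parameters. Returns None for points behind the camera.
    """
    # Express the point relative to the camera centre (extrinsic translation).
    d = [p - c for p, c in zip(point_world, camera_center)]
    # Rotate the offset into the camera frame (extrinsic rotation).
    xc, yc, zc = (sum(row[i] * d[i] for i in range(3)) for row in rotation)
    if zc <= 0:
        return None  # the point is not in front of the camera
    cx, cy = principal_point
    # Perspective division followed by the intrinsic scaling and offset.
    return (focal_px * xc / zc + cx, focal_px * yc / zc + cy)


# Hypothetical example: camera at the origin, axes aligned with the world.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project_point((1.0, 2.0, 10.0), IDENTITY, (0.0, 0.0, 0.0),
                      1000.0, (320.0, 240.0))
print(pixel)
```

With the intrinsic parameters held fixed, registering the model with an image amounts to finding the `rotation` and `camera_center` for which the projected model features line up with the observed ones.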

Pose computation for each image of the video can be decomposed into three main steps (see Fig. 1):

  1. Position initialization - The positions of the cameras are initialized using the acquired GPS measurements. Since the GPS acquisition frequency is often far lower than the video acquisition frequency, these measurements are temporally interpolated.
  2. Pose computation for the first image - The second step consists in estimating the relative camera pose between the first image and a later key image. The camera translation is related to the measured GPS translation to estimate a rough pose for the first camera. Then, lines are extracted from the image and robustly matched to lines extracted from the model to compute the pose more accurately using a virtual visual servoing algorithm.
  3. Automatic pose tracking - Once the pose of the camera has been initialized for the first image, it is tracked throughout the remaining images using 2D-3D point correspondences, thanks to a robust adaptation of the virtual visual servoing algorithm.
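The temporal interpolation of step 1 can be sketched as follows. The text only states that the GPS measurements are temporally interpolated; the linear scheme, the clamping outside the GPS time range, the function name `interpolate_gps` and the sample values below are all assumptions made for illustration.

```python
from bisect import bisect_right


def interpolate_gps(gps_times, gps_positions, frame_times):
    """Linearly interpolate GPS positions at the video frame timestamps.

    gps_times must be strictly increasing. Positions are tuples of
    coordinates (for instance UTM easting and northing). Frames outside
    the GPS time range receive the nearest available measurement.
    """
    result = []
    for t in frame_times:
        if t <= gps_times[0]:
            result.append(tuple(gps_positions[0]))
            continue
        if t >= gps_times[-1]:
            result.append(tuple(gps_positions[-1]))
            continue
        # Index j such that gps_times[j - 1] <= t < gps_times[j].
        j = bisect_right(gps_times, t)
        t0, t1 = gps_times[j - 1], gps_times[j]
        a = (t - t0) / (t1 - t0)
        p0, p1 = gps_positions[j - 1], gps_positions[j]
        result.append(tuple((1 - a) * u + a * v for u, v in zip(p0, p1)))
    return result


# Hypothetical data: one GPS fix per second, four video frames.
gps_times = [0.0, 1.0]
gps_positions = [(0.0, 0.0), (10.0, 20.0)]
frame_times = [0.0, 0.25, 0.5, 1.0]
positions = interpolate_gps(gps_times, gps_positions, frame_times)
print(positions)
```

These interpolated positions only initialize the camera translation; orientation and the refined position come from steps 2 and 3.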

Fig. 1 - Registration method overview.

Experimental Results

Ifsic Sequence

This sequence was shot with a handheld video camera, with a generic motion. It is generic in the sense that no specific building façade was targeted for later reconstruction. One can also notice that, due to the lack of any stabilization device, the motion is very noisy, and image-model registration is therefore generally hard to perform on such video. We present here two registration results. In the first one, tagged as non robust, a generic virtual visual servoing algorithm was used to perform the image-model registration. In the second one, tagged as robust, the same algorithm was used, together with our specific improvements for this particular context of urban modeling.

Beaulieu Sequence

In this sequence - also shot with a handheld video camera - we can clearly see the robustness of our algorithm in the presence of occluding objects in front of the building, or of specularities. The non robust version simply fails to track the pose: it drifts so much that it cannot compute the necessary 2D-3D point correspondences.
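To illustrate why a robust formulation can survive occlusions and specularities, the sketch below down-weights reprojection residuals that behave like outliers. This page does not say which robust estimator is used; the Tukey biweight with a median-absolute-deviation scale shown here is one common choice and is purely an assumption, as are the function name `tukey_weights` and the sample residuals.

```python
import statistics


def tukey_weights(residuals, c=4.6851):
    """Return one weight in [0, 1] per residual using the Tukey biweight.

    The scale is estimated from the median absolute deviation (MAD), so a
    few gross outliers do not inflate it. Residuals further than c scaled
    units from the median receive a weight of exactly 0.
    """
    med = statistics.median(residuals)
    mad = statistics.median(abs(r - med) for r in residuals)
    sigma = 1.4826 * mad or 1e-12  # avoid division by zero on perfect data
    weights = []
    for r in residuals:
        u = (r - med) / (c * sigma)
        weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
    return weights


# Hypothetical residuals in pixels: five consistent points and one point
# that was matched on an occluding object.
residuals = [0.1, -0.2, 0.0, 0.15, -0.1, 25.0]
weights = tukey_weights(residuals)
print(weights)
```

In an iteratively reweighted pose estimation, such weights would multiply each correspondence's contribution, so the occluded point stops pulling the pose away from the building.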

Fig. 2 - Ifsic non robust tracking results (left) and robust tracking results (right).

Fig. 3 - Beaulieu non robust tracking results (left) and robust tracking results (right).

Last time modified: 2008-05-14