3D RECONSTRUCTION OF URBAN AREAS - DATA REGISTRATION
Registration of ground videos with rough GIS-based 3D city models
G. Sourimant, L. Morin, T. Colleu
contact: G. Sourimant
Context and Goal
3D reconstruction of urban environments has been widely studied for several years, since it can lead to many
useful applications: virtual navigation, augmented reality, architectural planning, etc. One of the most difficult
problems in this context today is the acquisition and processing of very large-scale data when a precise reconstruction
is the goal.
We present here the results of our system for computing geo-referenced positions and orientations of images of buildings
from uncalibrated videos. Providing such information is a mandatory step toward well-conditioned, large-scale and precise
3D reconstruction of urban areas. Our method is based on the registration of multimodal datasets, namely GPS measurements,
video sequences and rough 3D models of buildings extracted from a GIS database.
Approach
To refine the 3D building models extracted from a GIS database using image sequences, one has to be able to determine
which parts of the models correspond to which parts of the images in the video. One solution to this problem consists in
registering the projection of the models with the corresponding images of the buildings. But to compute such a
projection, one has to estimate both the intrinsic parameters (focal length, principal point, etc.) and the extrinsic
parameters (position and orientation) of the camera, for each image of the video.
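As an illustration of how the two parameter sets enter the projection, here is a minimal sketch of a standard pinhole camera model. It is not taken from our system: the function name `project_point`, the matrix layout and the numeric values are assumptions chosen for the example, and lens distortion is ignored.

```python
def project_point(X, K, R, t):
    """Project a 3D world point X to pixel coordinates (u, v).

    K: 3x3 intrinsic matrix (focal length, principal point).
    R, t: extrinsic parameters mapping world to camera coordinates,
          i.e. Xc = R * X + t (assumed convention for this sketch).
    """
    # Express the point in the camera frame
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    if Xc[2] <= 0:
        raise ValueError("point is behind the camera")
    # Apply the intrinsics to get homogeneous image coordinates
    x = [sum(K[i][j] * Xc[j] for j in range(3)) for i in range(3)]
    # Perspective division
    return (x[0] / x[2], x[1] / x[2])


# Hypothetical values: 800-pixel focal length, principal point at (320, 240)
K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # camera aligned with world axes
t = [0.0, 0.0, 0.0]                                      # camera at the world origin
u, v = project_point([1.0, 0.5, 10.0], K, R, t)
```

With these values, a model vertex 10 units in front of the camera lands at pixel (400, 280). Registering the model with an image amounts to finding the `R` and `t` for which such projections coincide with the observed building features.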
We suppose here that the camera is not calibrated, and restrict ourselves to the intrinsic parameters
provided by the camera manufacturer. We thus have to estimate the camera extrinsic parameters (the camera pose),
expressed in the same coordinate system as the GIS models,
namely the UTM coordinate system.
Pose computation for each image of the video can be decomposed into three main steps (see Fig. 1):
-
Position initialization - The positions of the cameras are initialized using the acquired GPS measurements. Since
the GPS acquisition frequency is often far lower than the video acquisition frequency, these measurements are
temporally interpolated.
-
Pose computation for the first image - The second step consists in estimating the relative camera
pose between the first image and a later key image. The camera translation is related to the measured
GPS translation to estimate a rough pose for the first camera.
Then, lines are extracted from the image and robustly matched to lines extracted from the model to
compute the pose more accurately using a virtual visual servoing algorithm.
-
Automatic pose tracking - Once the pose of the camera has been initialized for the first image, it is
tracked throughout the remaining images using 2D-3D point correspondences, thanks to a robust adaptation of
the virtual visual servoing algorithm.
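The temporal interpolation of the first step can be sketched as follows. This is only an illustration under stated assumptions: the interpolation scheme is not specified above, so plain linear interpolation is assumed, GPS positions are taken to be already expressed in UTM (easting, northing), and the name `interpolate_gps` is hypothetical.

```python
from bisect import bisect_right


def interpolate_gps(gps_times, gps_positions, frame_times):
    """Linearly interpolate GPS positions at the video frame timestamps.

    gps_times: sorted timestamps of the GPS fixes (at least two).
    gps_positions: one coordinate tuple per fix, e.g. (easting, northing).
    frame_times: timestamps of the video frames.
    """
    result = []
    for t in frame_times:
        # Find the GPS interval [i, i + 1] containing t, clamped to a valid index
        i = bisect_right(gps_times, t) - 1
        i = max(0, min(i, len(gps_times) - 2))
        t0, t1 = gps_times[i], gps_times[i + 1]
        a = (t - t0) / (t1 - t0)
        a = max(0.0, min(1.0, a))  # hold the end values outside the GPS time range
        p0, p1 = gps_positions[i], gps_positions[i + 1]
        result.append(tuple(c0 + a * (c1 - c0) for c0, c1 in zip(p0, p1)))
    return result


# Hypothetical data: one GPS fix per second, frames at arbitrary instants
gps_times = [0.0, 1.0, 2.0]
gps_positions = [(0.0, 0.0), (10.0, 0.0), (10.0, 20.0)]
frames = interpolate_gps(gps_times, gps_positions, [0.0, 0.5, 1.25, 2.0])
```

Each frame then receives an initial camera position, which the later steps refine.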

Fig. 1 - Registration method overview.
Experimental Results
Ifsic Sequence
This sequence was shot with a handheld video camera, with a generic motion. It is generic in the sense
that no specific building façade was targeted for later reconstruction. Note also that,
due to the lack of any stabilization device, the motion is very noisy, and image-model registration is
therefore generally hard to perform on such video.
We present here two registration results. In the first one, tagged as non robust, a generic virtual visual
servoing algorithm was used to perform the image-model registration. In the second one, tagged as robust, the
same algorithm was used, together with our specific improvements for this particular context of urban modeling.
Beaulieu Sequence
In this sequence - also shot with a handheld video camera - we can clearly see the robustness of our algorithm in
the presence of occluding objects in front of the building, or of specularities. The non robust version simply
fails to track the pose: it drifts so much that it cannot compute the necessary 2D-3D point correspondences.
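To give an idea of how outliers such as occluded or specular points can be kept from corrupting a pose estimate, here is a minimal sketch of a common robust-estimation tool: Tukey biweight weighting of reprojection residuals. This page does not detail the improvements of the robust version, so this is an assumed, generic technique and not a description of our algorithm; the name `tukey_weights` and the sample residuals are hypothetical.

```python
from statistics import median


def tukey_weights(residuals, c=4.6851):
    """Return a weight in [0, 1] per residual using Tukey's biweight function.

    Residuals far from the median, relative to a robust scale estimate,
    receive weight 0 and are effectively ignored by a weighted estimator.
    """
    med = median(residuals)
    # Robust scale from the median absolute deviation (MAD)
    mad = median(abs(r - med) for r in residuals)
    sigma = 1.4826 * mad
    if sigma == 0:
        return [1.0] * len(residuals)
    weights = []
    for r in residuals:
        u = (r - med) / (c * sigma)
        weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
    return weights


# Hypothetical residuals in pixels: four consistent points and one occluded point
weights = tukey_weights([0.1, -0.2, 0.05, 0.0, 15.0])
```

In an iteratively reweighted scheme, such weights would be recomputed at each iteration so that points whose residuals stay large stop influencing the pose update.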

Fig. 2 - Ifsic non robust tracking results (left) and robust tracking results (right).

Fig. 3 - Beaulieu non robust tracking results (left) and robust tracking results (right).
References
-
Thomas Colleu, Gaël Sourimant and Luce Morin,
Automatic Initialization for the Registration of GIS and Video Data,
3DTV 2008, Istanbul, Turkey, May 2008.
-
Gaël Sourimant, Luce Morin and Kadi Bouatouch,
GPS, GIS and Video Registration for Building Reconstruction,
ICIP 2007, 14th IEEE International Conference on Image Processing, San Antonio, Texas, USA, September 2007.