According to Global Market Insights, the orthopedic medical device (MD) market is growing rapidly and is expected to exceed $22.4 billion by 2025. Joint replacement (hip, knee, extremities) accounts for nearly 37% of this market. The associated devices include conventional ancillary instruments, custom-made guides, navigation systems, and robotic systems. More recently, augmented reality (AR) navigation systems have been developed. They are recognized for their accuracy, low cost, ease of use, and added clinical value. It is in this context that the ANR MARSurg project (2021-2025) aims to implement an innovative surgical navigation solution with high scientific, technological and clinical potential.
This thesis will be carried out at Inria and IRISA in Rennes, in collaboration with ISIR (Sorbonne Université), within the scope of the ANR MARSurg project. The goal is to advance the state of the art in accurate and robust markerless localization, pose estimation and visual tracking of 3D objects using RGB-D images. These topics are relevant to many applications, including industry (e.g., object handling and grasping) and automotive vehicles (e.g., localization, navigation, …), as well as computer-assisted medical interventions (CAMI).
Tracking and pose estimation are key research topics for real-time augmented reality. The main requirements for trackers suitable for AR systems are high accuracy, robustness and low latency. Tracking objects in the scene amounts to computing the 3D pose between the camera and each object. Virtual objects can then be projected into the scene using this pose. The objective of this thesis is to develop robust methods for the detection, localization and tracking of objects (without markers) in RGB-D image sequences.
Using deep neural network-based approaches, we aim to detect and classify the surgical instruments present in the images and to initialize a pose computation process for them (e.g., [Rad, 2017]). Model-based tracking and localization approaches that use both the contours and the depth maps provided by the RGB-D camera will then be proposed [Marchand, 2016; Trinh, 2018]. The complexity of the surgical instruments under consideration requires GPU (Graphics Processing Unit) based approaches to ensure a fast and complete projection of the model into the images [Petit, 2014]. As the camera is itself mobile, expressing the position of the objects in a fixed reference frame (in which the anatomical landmarks will also be expressed) requires localizing the camera w.r.t. the environment. This will be done using visual-inertial SLAM methods, which combine the images with data from an IMU (Inertial Measurement Unit). Moreover, to deal with fast movements, the object position will be predicted, integrating inertial data, by particle filters on SE(3), the group of rigid 3D transformations. To validate the system, the measurement error will be estimated against ground truth provided by an external system (either by mounting the camera on a robot or by using a Vicon 3D measurement system).
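To make the role of the pose concrete, here is a minimal sketch of the two geometric operations mentioned above: projecting a point of the object model into the image using the camera-to-object pose, and chaining poses to express the object in a fixed reference frame. It assumes an ideal pinhole camera without distortion and uses only the C++ standard library. All names (Mat4, Intrinsics, compose, project, ...) are illustrative assumptions and do not correspond to the ViSP API or to the project's actual implementation.

```cpp
#include <array>
#include <cassert>

// 4x4 homogeneous transform (row-major). By convention here, aTb maps
// coordinates expressed in frame b into frame a. Names are hypothetical.
using Mat4 = std::array<std::array<double, 4>, 4>;
using Vec3 = std::array<double, 3>;

// Pure translation, used here only to build simple example poses.
Mat4 translation(double x, double y, double z) {
  Mat4 T{};  // zero-initialized
  for (int i = 0; i < 4; ++i) T[i][i] = 1.0;
  T[0][3] = x; T[1][3] = y; T[2][3] = z;
  return T;
}

// Pose composition: aTc = aTb * bTc (e.g., wTo = wTc * cTo).
Mat4 compose(const Mat4 &aTb, const Mat4 &bTc) {
  Mat4 aTc{};
  for (int i = 0; i < 4; ++i)
    for (int j = 0; j < 4; ++j)
      for (int k = 0; k < 4; ++k)
        aTc[i][j] += aTb[i][k] * bTc[k][j];
  return aTc;
}

// Express a 3D point given in frame b in frame a.
Vec3 transform(const Mat4 &aTb, const Vec3 &bP) {
  Vec3 aP{};
  for (int i = 0; i < 3; ++i)
    aP[i] = aTb[i][0] * bP[0] + aTb[i][1] * bP[1] + aTb[i][2] * bP[2] + aTb[i][3];
  return aP;
}

// Assumed pinhole intrinsics: focal lengths (px, py) and principal point (u0, v0), in pixels.
struct Intrinsics { double px, py, u0, v0; };
struct Pixel { double u, v; };

// Project a model point oP (object frame) into the image using the pose cTo.
// Returns false when the point lies behind the camera (non-positive depth).
bool project(const Intrinsics &cam, const Mat4 &cTo, const Vec3 &oP, Pixel &out) {
  assert(cam.px > 0.0 && cam.py > 0.0);
  const Vec3 cP = transform(cTo, oP);
  if (cP[2] <= 0.0) return false;
  out.u = cam.u0 + cam.px * cP[0] / cP[2];
  out.v = cam.v0 + cam.py * cP[1] / cP[2];
  return true;
}
```

In this sketch, a tracker would estimate cTo at each frame, a SLAM module would provide wTc, and compose(wTc, cTo) would give the pose of the instrument in the fixed frame in which the anatomical landmarks are expressed.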
Methodological developments will be carried out using the C++ ViSP software library (https://visp.inria.fr) and will be validated through experiments on real image sequences and mock-ups.
- Knowledge of computer vision and image processing, vSLAM, machine learning
- Mathematics, optimization, linear algebra
- Excellent programming skills in C++
The thesis will be conducted in the Rainbow team at Inria and IRISA. The laboratory is located on the Beaulieu university campus in Rennes. Occasional trips to Paris are to be expected.
[Marchand, 2016] E. Marchand, H. Uchiyama, F. Spindler: Pose estimation for augmented reality: a hands-on survey. IEEE TVCG, 2016.
[Petit, 2014] A. Petit, E. Marchand, A. Kanani: Combining complementary edge, keypoints and color features in model-based tracking for highly dynamic scenes. IEEE ICRA, p. 4115-4120, 2014.
[Trinh, 2018] S. Trinh, F. Spindler, E. Marchand, F. Chaumette: A modular framework for model-based visual tracking using edge, texture and depth features. IEEE/RSJ IROS'18, p. 89-96, 2018.
[Rad, 2017] M. Rad et al.: BB8: A scalable, accurate, robust to partial occlusion method for predicting the 3D poses of challenging objects without using depth. IEEE ICCV, p. 3828-383, 2017.