dc.description.abstract | 3D model reconstruction using RGB-D information has attracted considerable attention from researchers around the world in recent decades. An RGB-D sensor consists of a color camera, an infrared (IR) emitter, and an IR receiver, so it can capture both a color image and a depth image of a scene. The depth image provides the distance to each point in the scene. Because they provide both color and depth information, RGB-D sensors are widely used in many research fields, such as computer vision, computer graphics, and human-computer interaction.
This dissertation presents research findings on calibrating information captured from a network of RGB-D sensors in order to reconstruct a 3D model of an object. A network of interconnected sensors was used to capture live human gestures, because a single sensor cannot capture gestures from all views around a person. The high-bit-rate streams captured by each sensor are first collected at a centralized PC for processing. This setup can also be extended to a remote PC on the Internet.
Point clouds are then generated from the RGB-D information. A point cloud is a set of scattered 3D points that represents the surface structure and color of the captured object. The point clouds generated from the multiple sensors are then aligned with each other to create a 3D object. A modified version of the Iterative Closest Point (ICP) algorithm is introduced for this purpose.
Captured point clouds may contain noise for several reasons, such as inherent camera distortion, interference from the infrared fields of other sensors, and inaccurate infrared reflection caused by object surface properties. Two algorithms are introduced to remove such noise from the point clouds: adaptive distance-based noise removal and adaptive density-based noise removal. Adaptive distance-based noise removal is performed before the point clouds are aligned, and adaptive density-based noise removal is performed after the alignment.
In most RGB-D sensors, the resolution of the color image is much higher than that of the depth image. Point clouds are generated from the depth information, so the number of points in a point cloud depends on the resolution of the depth image. A new algorithm for 3D super-resolution of point clouds is introduced to increase the number of points by taking advantage of the higher-resolution color image.
Point clouds may contain small and large holes in the surface. Small holes are first located, and three small-hole filling mechanisms are introduced. Large holes arise because the camera cannot capture surfaces that do not face it or surfaces hidden behind another object, which is referred to as occlusion. A 3D inpainting algorithm based on 2D inpainting is proposed to fill the large holes in the point clouds. Finally, a surface is reconstructed from the point clouds, clearly representing the captured 3D object.
Two experiments were performed. In the first experiment, eight Microsoft Kinect version 2 sensors were used as the RGB-D sensors, and human busts were captured and reconstructed. In the second experiment, one Intel RealSense SR300 sensor was used as the RGB-D sensor to capture and reconstruct the surface of Budaixi, a type of puppet from Taiwan. Experimental results demonstrate that the proposed methods generate a better 3D model of the captured object.
| en_US |