SENSOR FUSION FOR 3-D MODELING OF INDOOR ENVIRONMENTS
L. Abril Torres-Mendez, PhD Student
School of Computer Science and Centre for Intelligent Machines
McGill University, Montreal, Quebec, H3A 2A7.

Most existing approaches for modeling an environment are based on a single sensor. However, depending on the application and the complexity of the environment, achieving geometric correctness and realism may require a large number of images from several sensor devices of the same or different types. This in turn poses the problem of sensor fusion: how to interpret, combine and integrate sensory information in order to construct a proper representation of the environment. In our work, we address one of the major goals of mobile robotics research: the creation of a 3-D model from local sensor data collected as the robot moves around an unknown indoor environment. With knowledge about its environment, an autonomous mobile robot can reliably achieve exploration and navigation tasks. In particular, we want to investigate algorithms for fusing sparse and unregistered data from visual and range sensors into a 3-D occupancy grid model and for incrementally reconstructing a 3-D model that contains geometric and photometric details as well as knowledge about empty spaces. In this paper, we describe a novel sensor-fusion-based approach for 3-D modeling of indoor environments, and present some experimental results on feature extraction and matching of intensity and range images using Principal Component Analysis (PCA).
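To make the occupancy grid representation concrete, the sketch below shows one common way such a grid can be maintained, as a voxel array of log-odds occupancy values updated per observation. The cell size, prior, and sensor-model constants are illustrative assumptions, not the values used in our system.

```python
# Hypothetical sketch of a probabilistic 3-D occupancy grid (log-odds update).
# Grid shape, cell size and sensor-model constants are assumptions for illustration.
import numpy as np

class OccupancyGrid3D:
    def __init__(self, shape=(100, 100, 50), cell_size=0.05):
        self.cell_size = cell_size        # metres per voxel (assumed)
        self.log_odds = np.zeros(shape)   # 0.0 corresponds to P(occupied) = 0.5

    def update(self, ijk, hit, l_occ=0.85, l_free=-0.4):
        """Fuse one range observation into voxel ijk."""
        i, j, k = ijk
        self.log_odds[i, j, k] += l_occ if hit else l_free

    def probability(self, ijk):
        """Recover P(occupied) from the stored log-odds value."""
        i, j, k = ijk
        return 1.0 - 1.0 / (1.0 + np.exp(self.log_odds[i, j, k]))

grid = OccupancyGrid3D()
grid.update((10, 20, 5), hit=True)    # voxel at a measured surface point
grid.update((10, 19, 5), hit=False)   # voxel along the free-space ray
```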

Our data acquisition process is very flexible: the sensors gather data from different locations and vantage points, constrained only by the sensors and the robot's motion. However, the price of this flexibility is that images of the same object or scene are at different resolutions, and the intensity and range images are highly unregistered, making their fusion a difficult task. Thus, the sensor fusion algorithm must be robust enough to account for unregistered data, translation, scaling, and the sparse and non-uniform data anticipated from the sensors.

The first step in the fusion process is to register or align the images into a common reference frame. To do this, similar features must be found and matched. We present experimental results using synthetic range and intensity images of an office-like environment. Feature extraction and matching are achieved by applying PCA to features commonly encountered in man-made indoor scenes. For intensity images, these features are horizontal and vertical edge lines, corners, and regions of uniform color. For range images, they are depth discontinuities, which yield boundary edges, and uniform planar regions. The eigenfeatures obtained from the training set using PCA are invariant to scaling, a property required because the images are taken at different resolutions. Regions where matching features may exist are located using information about the viewpoints and locations at which the images were acquired. For the moment, we assume that the viewpoints and locations are known; ultimately, our approach estimates them within a probabilistic framework using the robot pose, the relative positions of the sensors, and their projection models. We do not require that images from the same sensor overlap; however, we do require that images from different sensors partially overlap. For example, a range image may overlap with one, two or more intensity images. Under this scheme, regions of incomplete intensity and/or range data will exist, and we want to show that man-made indoor environments are predictable enough that these regions of incomplete data can be estimated from nearby observations.
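As a minimal sketch of the kind of PCA-based eigenfeature matching described above, the code below builds an eigenspace from vectorised training patches, projects normalised candidate patches onto it, and matches by nearest neighbour in coefficient space. The patch normalisation, number of eigenvectors, and matching criterion are assumptions for illustration, not the exact choices made in our experiments.

```python
# Sketch of PCA eigenfeature extraction and nearest-neighbour matching.
# Patch size, normalisation and n_components are illustrative assumptions.
import numpy as np

def normalise(patch):
    """Zero-mean, unit-variance normalisation (helps across resolutions)."""
    p = np.asarray(patch, dtype=float).ravel()
    return (p - p.mean()) / (p.std() + 1e-8)

def build_eigenspace(training_patches, n_components=8):
    """training_patches: list of equally sized feature patches (edges, corners, ...)."""
    data = np.array([normalise(p) for p in training_patches])
    mean = data.mean(axis=0)
    # Rows of vt are the principal directions (eigenfeatures).
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(patch, mean, basis):
    """Coefficients of a patch in the eigenfeature space."""
    return basis @ (normalise(patch) - mean)

def best_match(query, candidates, mean, basis):
    """Index of the candidate patch closest to the query in eigenspace."""
    q = project(query, mean, basis)
    dists = [np.linalg.norm(q - project(c, mean, basis)) for c in candidates]
    return int(np.argmin(dists))
```

In this sketch, scale differences between images are handled by resampling every patch to a common size before normalisation, so that patches from different resolutions project into the same eigenspace.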