Title: RGBD Pipeline for Indoor Scene Reconstruction and Understanding
Authors: Halber, Maciej Stanislaw
Advisors: Funkhouser, Thomas A
Contributors: Computer Science Department
Keywords: Indoor
Subjects: Computer science; Computer engineering
Issue Date: 2019
Publisher: Princeton, NJ : Princeton University
Abstract: In this work, we consider the problem of reconstructing a 3D model from a sequence of color and depth frames. Generating such a model has many important applications, ranging from the entertainment industry to real estate. However, transforming RGBD frames into high-quality 3D models is a challenging problem, especially if additional semantic information is required. In this document, we introduce three projects, each implementing a stage of a robust RGBD processing pipeline. First, we consider the challenges arising during the RGBD data capture process. While depth cameras provide dense, per-pixel depth measurements, the resulting data carries a non-trivial error. We discuss the depth generation problem and propose an error reduction technique based on estimating an image-space undistortion field. We describe how to capture the data required to generate such an undistortion field, and we show how correcting the depth measurements improves reconstruction quality. Second, we address the problem of registering RGBD frames over a long video sequence into a globally consistent 3D model. We propose a “fine-to-coarse” global registration algorithm that leverages robust registrations at finer scales to seed detection and enforcement of geometrical constraints, modeled as planar structures, at coarser scales. To test global registration algorithms, we provide a benchmark with 10,401 manually clicked point correspondences in 25 scenes from the SUN3D dataset. We find that our fine-to-coarse algorithm registers long RGBD sequences better than previous methods. Last, we show how repeated scans of the same space can be used to establish associations between the different observations. Specifically, we consider a situation where 3D scans are acquired repeatedly at sparse time intervals. We develop an algorithm that analyzes these “rescans” and builds a temporal model of a scene with semantic instance information. The proposed algorithm operates inductively, using a temporal model resulting from past observations to infer the instance segmentation of a new scan. The temporal model is continuously updated to reflect the changes that occur in the scene over time, providing object associations across time. The algorithm outperforms alternative approaches based on state-of-the-art networks for semantic instance segmentation.
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog:
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Computer Science

Files in This Item:
File: Halber_princeton_0181D_13092.pdf
Size: 70.09 MB
Format: Adobe PDF

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.