Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01b2773z77c
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Funkhouser, Thomas | |
dc.contributor.author | Genova, Kyle Adam | |
dc.contributor.other | Computer Science Department | |
dc.date.accessioned | 2021-06-10T17:38:38Z | - |
dc.date.available | 2021-06-10T17:38:38Z | - |
dc.date.issued | 2021 | |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp01b2773z77c | - |
dc.description.abstract | The focus of this dissertation is the novel use of shape representations to empower 3D reasoning for reconstruction and segmentation. It is organized into three sections based on application: domain-specific shape reconstruction (Chapter 2), general shape reconstruction (Chapter 3), and semantic segmentation (Chapter 4). In each chapter, we outline the setting and related work, and then introduce one or two approaches with a novel use of shape representation. Our key contribution is to use shape representation to enable new types of supervision and improve generalization when learning 3D priors. Because current reconstruction and segmentation methods share the use of learned 3D encoder and decoder architectures, these contributions apply to both tasks. In Chapters 2-4, we demonstrate experimentally that reconstruction and segmentation algorithms benefit from our choices of shape representation. A primary benefit of our approaches is enabling new types of supervision that require some property of the representation to be effective. A domain-specific representation enables supervising 3D face reconstruction with a face recognition network for the first time, resulting in provably more recognizable reconstructions (Chapter 2). Our SIF representation learns shape correspondence from only reconstruction supervision (Chapter 3). Large, diverse image collections are already semantically labeled, making it possible to train 3D semantic segmentation models for datasets without point cloud annotations (Chapter 4). A secondary benefit is improved generalization, achieved by deriving better priors from existing supervision. We propose a new shape representation, LDIF, which is trained on existing 3D reconstruction data. LDIF learns robust local priors, improving generalization to unseen classes and shapes (Chapter 3). The addition of image-based supervision in segmentation algorithms improves generalization to cities with no 3D supervision (Chapter 4). We conclude that our choices of representation enable new supervision, better generalization, and learning useful 3D priors from readily available labels (e.g., labeled and unlabeled images, or unlabeled shape collections). We hypothesize that effective future representations will build on this trend by deriving higher-level semantic priors from unannotated datasets and other inexpensive sources of supervision (Chapter 5). | |
dc.language.iso | en | |
dc.publisher | Princeton, NJ : Princeton University | |
dc.relation.isformatof | The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: [catalog.princeton.edu](http://catalog.princeton.edu) | |
dc.subject | 3D Reconstruction | |
dc.subject | Computer Graphics | |
dc.subject | Computer Vision | |
dc.subject | Differentiable Rendering | |
dc.subject | Semantic Segmentation | |
dc.subject | Shape Representation | |
dc.subject.classification | Computer science | |
dc.title | 3D Representations for Learning to Reconstruct and Segment Shapes | |
dc.type | Academic dissertations (Ph.D.) | |
Appears in Collections: | Computer Science |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Genova_princeton_0181D_13648.pdf | | 33.35 MB | Adobe PDF |