Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp013x816q374
Title: Semantic Segmentation of 3D Point Cloud Lidar Data for Autonomous Vehicles
Authors: Thuremella, Divya
Advisors: Rusinkiewicz, Szymon
Department: Electrical Engineering
Class Year: 2018
Abstract: This paper examines a method of semantically segmenting 3D point cloud Lidar data by transforming the data into 2D and applying a dilated convolutional neural network to the resulting 2D images. Lidar data is distinctive because it is 3D data captured from a single viewpoint, so it can be projected onto a sphere around that viewpoint and ‘unrolled’ into a 2D image. This idea is used to turn the set of 3D point cloud Lidar frames into a set of 2D images with very little information lost in the transformation. The dilated convolutional architecture specified in Yu 2015 [1] is then used to semantically segment this data. Variations on this architecture are tested to obtain a more efficient neural network with better accuracy. The performance of the network was examined on 5 main classes: road, car, pedestrian, cyclist, and vegetation. Binary segmentation was performed on each class individually, and then multiclass segmentation was performed, in which multiple class labels were predicted together. The data for these networks was taken from 252 labeled 3D point cloud frames of the KITTI dataset. The results show that transforming the 3D data into 2D is a very effective way of performing semantic segmentation, and that this method is much more efficient than approaches that process the data in 3D. Certain trends were also found in how changes to the architecture parameters generally lead to better results. In both binary and multiclass segmentation, the network performed relatively well on the road and car classes and fairly poorly on the pedestrian and cyclist classes. The binary segmentation network for the vegetation class, however, could barely detect anything at all, so the vegetation class was not included in the multiclass networks.
Future work should develop a better neural network for classifying this type of data and train it on a larger and more varied dataset.
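The sphere-and-unroll projection described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `spherical_projection`, the 64 x 512 image size, and the vertical field of view (+2.0 to -24.8 degrees, typical of the 64-beam sensor used for KITTI) are all assumptions chosen for illustration. Each point is converted to azimuth and elevation angles, the angles are mapped to pixel columns and rows, and each pixel stores the range of the nearest point that falls into it.

```python
import numpy as np

def spherical_projection(points, height=64, width=512,
                         fov_up_deg=2.0, fov_down_deg=-24.8):
    """Project an (N, 3) point cloud onto a (height, width) range image.

    Hypothetical helper; the image size and field of view are assumed
    values, not parameters taken from the thesis.
    Pixels that receive no point are set to 0.
    """
    points = np.asarray(points, dtype=np.float64)
    rng = np.linalg.norm(points, axis=1)       # distance from the sensor
    keep = rng > 0                             # drop points at the origin
    points, rng = points[keep], rng[keep]
    x, y, z = points[:, 0], points[:, 1], points[:, 2]

    azimuth = np.arctan2(y, x)                 # horizontal angle in [-pi, pi]
    elevation = np.arcsin(z / rng)             # vertical angle

    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(fov_down_deg)

    # "Unroll" the sphere: azimuth -> column, elevation -> row (top = fov_up).
    u = 0.5 * (1.0 - azimuth / np.pi) * width
    v = (fov_up - elevation) / (fov_up - fov_down) * height
    cols = np.clip(np.floor(u), 0, width - 1).astype(np.int64)
    rows = np.clip(np.floor(v), 0, height - 1).astype(np.int64)

    # Keep the nearest point when several points land on the same pixel.
    image = np.full((height, width), np.inf)
    np.minimum.at(image, (rows, cols), rng)
    image[np.isinf(image)] = 0.0
    return image
```

The resulting 2D array can be fed to an ordinary image segmentation network. Per-point labels can be projected with the same row and column indices to build the 2D training targets.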
URI: http://arks.princeton.edu/ark:/88435/dsp013x816q374
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Electrical and Computer Engineering, 1932-2023

Files in This Item:
File: THUREMELLA-DIVYA-THESIS.pdf
Size: 2.28 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.