Title: Automated Trainable Data Clustering With Applications in Astronomy
Authors: Minns, Charlie
Advisors: Melchior, Peter
Department: Physics
Certificate Program: Applications of Computing Program
Class Year: 2020
Abstract: One of the most commonly used techniques in data science is clustering: dividing a dataset into a certain number of groups so that the points in each group have similar properties. Different methods can be used to cluster data more efficiently and accurately, and one such method is Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN). In this thesis, I discuss my research on the applications of this method in astronomy and remote sensing. Using this algorithm, I produced results from hyperspectral images of the Nili Fossae region on Mars which are consistent with existing literature. This demonstrates HDBSCAN's ability to produce reliable clustering results. I also investigated different metrics used to cluster data in extragalactic surveys and measured the clustering success of HDBSCAN using training datasets. By comparing the success of different results, I found that it is possible to tune the input parameters to improve upon the clustering result. This thesis demonstrates the current capabilities of HDBSCAN and explores ways in which this algorithm can be improved to make the clustering process more autonomous.
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Physics, 1936-2020

Files in This Item:
File: MINNS-CHARLIE-THESIS.pdf
Size: 1.12 MB
Format: Adobe PDF

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.