Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp010z709073j
Title: Is it Cake? A Recipe for Data-Efficient Fossil Identification
Authors: Panigrahi, Indu
Advisors: Fong, Ruth; Maloof, Adam
Department: Computer Science
Class Year: 2023
Abstract: Most computer vision research focuses on datasets containing thousands of images of commonplace objects. However, many high-impact datasets, such as those in medicine and the geosciences, contain fine-grained objects that require domain-expert knowledge to recognize and are time-consuming to collect and annotate. As a result, these datasets contain few labeled images, and current computer vision models cannot train intensively on them. In this thesis, we present a data-efficient learning paradigm to identify ancient reef fossils in one such dataset. This dataset has profound implications for the impact of dwindling coral reefs on Earth's future climate and biosphere. Specifically, we explore the idea of curriculum learning to facilitate how a model processes image data. Using a limited amount of labeled data, we conduct an extensive exploratory data analysis on our images to define and implement a curriculum to structure the training of our model. When evaluating on our dataset, we find that our curriculum-based model outperforms the standard trained model while requiring less annotated data and leveraging unlabeled data. Furthermore, we find that the only annotations that our model needs are those that are easiest for the annotator to recognize. These findings have important implications for how researchers can save time when collecting and labeling image data for computer vision models. In exploring data-efficient avenues within deep learning for an impactful geosciences dataset, we hope that our work advances the accessibility of computer vision technology for a diverse range of applications.
URI: http://arks.princeton.edu/ark:/88435/dsp010z709073j
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Computer Science, 1987-2023

Files in This Item:
File: PANIGRAHI-INDU-THESIS.pdf
Size: 4.6 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.