Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01sb397b99n
Title: TableReader: A Digital Humanities PDF Extraction Tool
Authors: Ji, Jessica
Advisors: Kernighan, Brian
Department: Computer Science
Class Year: 2018
Abstract: The emerging field of digital humanities addresses how computational tools and techniques can best be employed in the study of the humanities. The "humanities," as commonly understood, encompass a broad range of fields including language studies, literature, history, the classics, and the arts, some of which overlap with the social sciences. Although there is little agreement on the exact nature and scope of the digital humanities, one fundamental challenge of the field is the complexity of data collection and analysis. This thesis centers on two guiding questions: why is digital humanities data analysis so challenging, and how can effective tools be designed to meet these challenges? To explore the second question more thoroughly, this thesis introduces TableReader, a web application designed to simplify the extraction of tabular data from PDF documents, with the aim of developing a set of guiding principles for digital humanities tool design and implementation.
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Computer Science, 1987-2023

Files in This Item:
File: JI-JESSICA-THESIS.pdf
Size: 1.94 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.