Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01br86b589m
Title: Cigarette Helmets & Horse Wars: Towards a Better Understanding of Noun Compound Interpretability
Authors: Marsh, Charles
Advisors: Fellbaum, Christiane
Department: Computer Science
Class Year: 2015
Abstract: The computational linguistics community has shown resurgent interest in the research of noun compounds, or sequences of nouns used to describe a single entity. Human judges frequently encounter noun compounds, be they familiar, like coffee cup and park bench, or unfamiliar, like cigarette helmet and horse war. Astoundingly, even these unfamiliar compounds are often interpretable with very little effort and in a manner that is widely agreeable to judges. This ease of interpretation is a testament to the productivity, generativity, and diversity of language in general and noun compounds in particular. However, it is clear that certain combinations of nouns would produce compounds that are not interpretable, or at least, incapable of being interpreted in a sensible manner. For example, devising a reasonable interpretation for the compound pork plum would be a daunting, if not impossible, task. In this thesis, we test the limits of both human creativity and noun compound productivity, asking the question: “What makes a noun compound interpretable?” Though simple in formulation, this question has received little attention in prior research on compounds. Our analysis revolves around a series of experiments run on Amazon’s Mechanical Turk platform in which human judges were asked to interpret and paraphrase binary and ternary noun compounds that had been generated ‘at random’ using an algorithmic process. Throughout this thesis, we analyze the results of these experiments to construct a more complete theory of noun compound interpretability, demonstrating the usefulness of semantic and lexical similarity-based comparisons to familiar compounds in determining the degree to which a new, unfamiliar compound is itself interpretable, as well as the deep and even intrinsic link between the acts of paraphrasing and interpretation.
Extent: 115 pages
URI: http://arks.princeton.edu/ark:/88435/dsp01br86b589m
Type of Material: Princeton University Senior Theses
Language: en_US
Appears in Collections: Computer Science, 1987-2023

Files in This Item:
File: PUTheses2015-Marsh_Charles.pdf
Size: 1.91 MB
Format: Adobe PDF


Items in DataSpace are protected by copyright, with all rights reserved, unless otherwise indicated.