Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01xk81jp57r
Title: Incorporating Entities into Open-Domain Question Answering
Authors: Deitelzweig, Jackson
Advisors: Chen, Danqi
Department: Computer Science
Class Year: 2022
Abstract: Open-domain question answering requires accurate passage retrieval to supply candidate contexts for document reading. While dense methods such as Dense Passage Retrieval (DPR) have surpassed sparse methods like TF-IDF and BM25 in retrieval accuracy, recent work has shown that dense retrieval falls short on entity-centric questions. In this work, we aim to close the gap between dense and sparse retrieval on these questions by incorporating additional entity information into our models. We first establish baselines that use entity linking as a reranker for dense methods, showing that this additional information helps on entity-centric questions. We then experiment with incorporating entity information into the DPR encoder by using LUKE as the encoder. We find that for questions whose entities are in our encoder's vocabulary, we outperform DPR on entity-centric questions at only a slight cost to overall performance. Future work could try a larger entity vocabulary, which would allow better retrieval for rare entities and could help address this aspect of retriever generalization.
URI: http://arks.princeton.edu/ark:/88435/dsp01xk81jp57r
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Computer Science, 1987-2024

Files in This Item:
File: DEITELZWEIG-JACKSON-THESIS.pdf
Size: 2.8 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.