Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp01xk81jp57r
Title: | Incorporating Entities into Open-Domain Question Answering |
Authors: | Deitelzweig, Jackson |
Advisors: | Chen, Danqi |
Department: | Computer Science |
Class Year: | 2022 |
Abstract: | Open-domain question answering requires accurate passage retrieval in order to facilitate document reading of candidate contexts. While dense methods such as Dense Passage Retrieval (DPR) have surpassed sparse methods like TF-IDF and BM25 in terms of retrieval accuracy, recent work has displayed a deficiency in retrieval for entity-centric questions. In this work, we aim to close the gap between dense and sparse retrievals on these questions by incorporating additional entity information into our models. We first establish baselines that use entity linking as a reranker for dense methods, which shows that this additional information is helpful for entity-centric questions. We then experiment with incorporating entity information into the encoder for DPR by using LUKE as the encoder. We find that for questions where the entities are in the vocabulary for our encoder, we are able to outperform DPR on entity-centric questions at only a slight hit to overall performance. Future work could try using a larger entity vocabulary, allowing better retrieval for rare entities and hopefully addressing this aspect of retriever generalization. |
URI: | http://arks.princeton.edu/ark:/88435/dsp01xk81jp57r |
Type of Material: | Princeton University Senior Theses |
Language: | en |
Appears in Collections: | Computer Science, 1987-2024 |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
DEITELZWEIG-JACKSON-THESIS.pdf | | 2.8 MB | Adobe PDF | Request a copy |
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.