Title: Keyword-assisted LDA: Exploring New Methods for Supervised Topic Modeling
Authors: Abdurehman, Rahji
Advisors: Imai, Kosuke
Department: Computer Science
Class Year: 2015
Abstract: This paper introduces an alternative to Latent Dirichlet Allocation (LDA), a popular machine learning algorithm. We derive the theory behind the alternative, which we call "keyword-assisted LDA", and demonstrate a specific use case for it with sample results. The algorithm takes a set of constraints, chosen from prior knowledge of the underlying topic structure of a corpus, and ensures that they are maintained. How the constraints are enforced depends on one’s underlying implementation of LDA; this paper details the approach for implementations using Gibbs sampling or Expectation-Maximization.
Extent: 36 pages
Access Restrictions: Walk-in Access. This thesis can only be viewed on computer terminals at the Mudd Manuscript Library.
Type of Material: Princeton University Senior Theses
Language: en_US
Appears in Collections: Computer Science, 1988-2016

Files in This Item:
File                               Size       Format
PUTheses2015-Abdurehman_Rahji.pdf  609.88 kB  Adobe PDF

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.