Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01wd375z72c
Title: The Performance of Elastic Net in Genome-based Disease Classification
Authors: Kwok, Jonathan
Advisors: Engelhardt, Barbara
Department: Computer Science
Class Year: 2016
Abstract: Genome-based data is becoming more accessible as the time and cost to sequence genomes decrease. A majority of studies using genome-based data focus on association tests to find relationships between mutations and traits but fewer studies look at using the data to produce disease prediction models. We look towards linear logistic regression, specifically a technique called elastic net, to build stable, sparse, and interpretable prediction models and compare the performance of the model to common forms of linear logistic regression, support vector machine, and principal component analysis. We find that elastic net produces sparse models but does not perform as well in practice as LASSO, another linear regression technique which also produces sparse models. We conclude from the experiment that LASSO is a better model to use, but suggest that we can use elastic net to verify the findings of the LASSO model.
Extent: 35 pages
URI: http://arks.princeton.edu/ark:/88435/dsp01wd375z72c
Type of Material: Princeton University Senior Theses
Language: en_US
Appears in Collections: Computer Science, 1988-2016

Files in This Item:
File                        Size        Format
Kwok_Jonathan_thesis.pdf    420.66 kB   Adobe PDF

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.