Title: Improved Tuned Iterative ReliefF: A Fast Filtering Method for Human Genetics
Department: Computer Science
Class Year: 2014
Abstract: Identifying genotypes that are associated with disease phenotypes is a crucial problem in modern human genetics. Genetic data sets contain hundreds of thousands of genes, but only a small number are statistically associated with disease phenotypes. In addition, the phenomenon of epistasis means that there are groups of genes that are not statistically associated with a disease phenotype on their own, but are associated with it together. Relief algorithms are effective heuristics for detecting these groups of epistatic genes. They return a score for each gene such that genes associated with a disease phenotype are likely to be scored higher than genes that are not. In this paper we present runtime reductions of 1.42x and 3.15x for the two most recent Relief algorithms, MultiSURF* and TURF, respectively. These reductions translate to many hours of saved time per algorithm run. A memoization technique allows our new version of TURF to achieve the same success rate in less than a third of the time. We also analyze the parameters for MultiSURF*, as these are currently not formally established. We show that it is difficult to select a parameter that will outperform the current one.
Extent: 63 pages
Type of Material: Princeton University Senior Theses
Language: en_US
Appears in Collections: Computer Science, 1988-2017

Files in This Item:
File: Granizo-Mackenzie_Delaney_Thesis.pdf
Size: 587.88 kB
Format: Adobe PDF

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.