|Title:||Improved Tuned Iterative ReliefF: A Fast Filtering Method for Human Genetics|
|Abstract:||Identifying genotypes that are associated with disease phenotypes is a crucial problem in modern human genetics. Genetic data sets contain hundreds of thousands of genes, but only a small number are statistically associated with disease phenotypes. In addition, the phenomenon of epistasis means that there are groups of genes that are not statistically associated with a disease phenotype on their own, but are associated with it together. Relief algorithms are effective heuristics for detecting these groups of epistatic genes. They return a score for each gene such that genes associated with a disease phenotype are likely to be scored higher than genes that are not. In this paper we present runtime reductions of 1.42x and 3.15x to the two most recent Relief algorithms, MultiSURF* and TURF. These runtime reductions translate to many hours of saved time per algorithm run. A memoization technique allows our new version of TURF to achieve the same success rate in less than a third of the time. We also analyze the parameters for MultiSURF*, as these are currently not formally established. We show that it is difficult to select a parameter that will outperform the current one.|
|Type of Material:||Princeton University Senior Theses|
|Appears in Collections:||Computer Science, 1988-2017|
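To illustrate the scoring idea the abstract describes, here is a minimal sketch of the basic Relief heuristic in Python. It is not the thesis's MultiSURF* or TURF implementation, and it does not include the memoization technique. The function names, the Hamming distance, and the toy data set (four binary "genes" where the phenotype is the XOR of the first two) are assumptions made for illustration only.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two genotype vectors differ."""
    return sum(u != v for u, v in zip(a, b))


def relief_scores(X, y):
    """Basic Relief sketch (illustrative, not MultiSURF* or TURF).

    For every instance, find its nearest neighbor of the same class (hit)
    and of the other class (miss). A feature gains score when it differs
    on the miss and loses score when it differs on the hit.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for i in range(n):
        hit = miss = None  # each holds (distance, index) of the best neighbor so far
        for j in range(n):
            if j == i:
                continue
            d = hamming(X[i], X[j])
            if y[j] == y[i]:
                if hit is None or d < hit[0]:
                    hit = (d, j)
            elif miss is None or d < miss[0]:
                miss = (d, j)
        if hit is None or miss is None:
            continue  # instance has no neighbor in one of the classes
        for f in range(p):
            w[f] -= (X[i][f] != X[hit[1]][f]) / n
            w[f] += (X[i][f] != X[miss[1]][f]) / n
    return w


# Hypothetical toy data: every combination of four binary features.
# The phenotype depends only on features 0 and 1 jointly (XOR), so neither
# feature is associated with the phenotype on its own.
X = [list(g) for g in product([0, 1], repeat=4)]
y = [g[0] ^ g[1] for g in X]
scores = relief_scores(X, y)
print(scores)
```

On this toy data the two interacting features receive higher scores than the two irrelevant ones, even though neither is individually associated with the phenotype. The nested loop over all pairs of instances is the quadratic cost that makes runtime a concern for Relief-style methods on large data sets.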
Files in This Item:
|Granizo-Mackenzie_Delaney_Thesis.pdf||587.88 kB||Adobe PDF||Request a copy|
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.