Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp015425kd88c
Title: Identifying noncoding genetic mutations that may confer breast cancer risk using SiFoN, a novel framework for analyzing noncoding variation in human disease
Authors: Macedo, Briana
Advisors: Troyanskaya, Olga
Department: Computer Science
Class Year: 2022
Abstract: Novel DNA sequencing technologies have illuminated the relationship between genetic mutations and disease. While an abundance of research has focused on the impact of mutations in the coding genome, there is still a gap in the field’s understanding of how mutations in the noncoding genome influence disease. Although the majority of causal mutations discovered in genome-wide association studies (GWAS) are found in the noncoding genome, these studies fail to identify the functional importance of these mutations. Without an understanding of the underlying physiological mechanisms of disease, it is challenging to propose effective treatment strategies. In addition, statistics-based studies such as GWAS fail to identify rare, disease-causing variants. To fill this gap, we present SiFoN (Seeker of intersecting Functions of Noncoding mutations), a novel framework that can be used to prioritize groups of regulatory mutations in the noncoding genome for follow-up analysis. In particular, SiFoN can be used to understand the combined impact of interacting noncoding mutations, to visualize the impact of mutations in their genomic context, and to combine clinical information with functional predictions. SiFoN is built on top of Sei, a deep learning model that can predict the functional impact of noncoding genetic mutations from sequence alone. First, because most pathologies are the result of an accumulation of interacting mutations, SiFoN extends Sei to predict the joint impact of two or more mutations simultaneously using a smoothing algorithm. Second, SiFoN includes an analytical visualization pipeline that combines predictions at different scales to prioritize potentially causal mutations. Lastly, SiFoN makes it simple to combine clinical data with variant effect predictions. As a proof of concept, we apply SiFoN to the promoter region of the tumor suppressor gene PTEN and compare results to clinical data.
Using SiFoN, we identified three groups of mutations that are predicted to confer breast cancer risk. These mutations lie within predicted transcription factor motifs that control immune cell regulation. This proof-of-concept study exemplifies how SiFoN can be used to identify novel regulatory mechanisms of disease.
URI: http://arks.princeton.edu/ark:/88435/dsp015425kd88c
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Computer Science, 1987-2023

Files in This Item:
File: MACEDO-BRIANA-THESIS.pdf
Size: 2.43 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.