Please use this identifier to cite or link to this item:
Authors: Park, Christopher Young
Advisors: Troyanskaya, Olga G
Contributors: Computer Science Department
Subjects: Bioinformatics
Issue Date: 2014
Publisher: Princeton, NJ : Princeton University
Abstract: Biological systems have been studied extensively for over a century, yet we still have only a partial functional and mechanistic understanding of the interplay between genes and pathways. Recently, the number of experimental datasets generated has increased exponentially. However, the complexity of data types and the ambiguity of dataset relevance to biological processes and pathways have limited the integrated use of this vast knowledge base for directing biological discoveries in human and model organisms. In this thesis, I develop several approaches to utilizing this public effort to address the challenges of inferring gene function and diverse biomolecular interaction networks, and of improving the transfer of functional knowledge between organisms to facilitate the investigation of understudied biological processes. Specifically, in the first part of the thesis, I show that computational functional genomics can be used to improve the transfer of gene annotations between organisms. Furthermore, I demonstrate that functional knowledge transfer, when coupled with machine learning algorithms, can improve the coverage and accuracy of gene function prediction in a diverse set of organisms. In the second part of the thesis, I provide a general method for simultaneous genome-wide prediction of many interaction types and present the results of applying this methodology in S. cerevisiae. By incrementally overlaying different interaction types as suggested by our results, investigators can formulate specific, testable hypotheses about new pathways, new pathway components, or new interconnections between existing pathways. Finally, I extend our interaction inference work in S. cerevisiae to mammalian organisms by methodologically addressing the largest source of biological variation in the metazoan data compendium: tissue and cell-lineage heterogeneity.
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog.
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Computer Science

Files in This Item:
File: Park_princeton_0181D_10889.pdf
Size: 2.98 MB
Format: Adobe PDF

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.