Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01hh63t027s
Title: Algorithms for understanding the spatial and network organization of biological systems
Authors: Chitra, Uthsav Rajaram
Advisors: Raphael, Benjamin J
Contributors: Computer Science Department
Keywords: computational biology; epistasis; hypergraphs; protein-protein interaction networks; spatial transcriptomics
Subjects: Computer science; Bioinformatics
Issue Date: 2024
Publisher: Princeton, NJ : Princeton University
Abstract: Biological systems are characterized by their spatial organization and network interactions at a hierarchy of scales. For example, the spatial arrangement of different cells in a tissue underlies fundamental multicellular processes such as tissue differentiation and disease response, while interactions between genes/proteins comprise the biological pathways that regulate cellular state and function. Recent developments in high-throughput sequencing have enabled the systematic analysis of spatial and network processes in many complex biological systems including the brain and tumor microenvironment. However, such analyses are challenged by high levels of sparsity and/or noise in high-throughput sequencing datasets—underscoring the need for principled and rigorous computational methods for biological data analysis. In this dissertation, we present a collection of mathematical frameworks and machine learning algorithms for modeling the spatial and network organization of biological systems. First, we derive a model of discrete and continuous spatial variation in gene expression. We present two algorithms, Belayer and GASTON, which learn the parameters of this model using complex analysis and interpretable deep learning, respectively. Second, we present a mathematical framework for the identification of altered subnetworks, or subnetworks of a biological interaction network containing genes/proteins that are differentially expressed, highly mutated, or otherwise aberrant compared to other genes/proteins. We prove that many existing algorithms are statistically biased, resolving the open question of why these algorithms often identify very large subnetworks that are difficult to interpret. We derive two altered subnetwork identification algorithms, NetMix and NetMix2, which we show are asymptotically unbiased and outperform existing approaches in practice. Finally, we present two frameworks for learning and modeling higher-order interactions. 
We first derive a statistical framework for learning higher-order genetic interactions from experimental fitness data, unifying decades of existing work in the genetics literature. Then, we derive a theoretical framework for modeling random walks on hypergraphs that provably utilizes higher-order interactions in data, in contrast to many existing hypergraph methods which only utilize pairwise interactions. Taken together, the approaches in this dissertation provide a theoretical and practical foundation for overcoming the computational challenges of modeling complex biological systems.
URI: http://arks.princeton.edu/ark:/88435/dsp01hh63t027s
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Computer Science

Files in This Item:
This content is embargoed until 2025-06-06. For questions about theses and dissertations, please contact the Mudd Manuscript Library. For questions about research datasets, as well as other inquiries, please contact the DataSpace curators.


Items in DataSpace are protected by copyright, with all rights reserved, unless otherwise indicated.