Title: Probabilistic models for structured biomedical data
Authors: Jones, Andrew
Advisors: Engelhardt, Barbara E
Contributors: Computer Science Department
Subjects: Bioinformatics
Issue Date: 2023
Publisher: Princeton, NJ : Princeton University
Abstract: Modern biomedical datasets, from molecular measurements of gene expression to pathology images, hold promise for discovering new therapeutics and probing basic questions about the behavior of cells. Thoughtful statistical modeling of these complex, high-dimensional data is crucial to elucidate robust scientific findings. A common assumption in data analysis is that the data samples are independent and identically distributed. However, this assumption is nearly always violated in practice. This is especially true in the setting of biomedical data, which often exhibit some amount of structure, such as subgroups of patients, cells, or tissue types, or other correlation structure among the samples. In this body of work, I propose data analysis and experimental design frameworks to account for several types of highly structured biomedical data. These approaches, which take the form of Bayesian models and associated inference algorithms, are specifically tailored for datasets with group structure, multiple data modalities, and spatial organization of samples. In the first line of work, I propose a model for contrastive dimension reduction that decomposes the sources of variation in samples that belong to case and control conditions. Second, I propose a computational framework for aligning spatially resolved genomics data into a common coordinate system that accounts for spatial correlation among the samples and models multiple data modalities. Finally, I propose a family of methods for optimally designing spatially resolved genomics experiments that is tailored to the highly structured data collection process of these studies. Together, this body of work advances the field of biomedical data analysis by developing models that directly exploit common types of structure within these data and demonstrating the advantage of these modeling approaches across an array of data types.
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Computer Science

Files in This Item:
File: Jones_princeton_0181D_14396.pdf
Size: 23.64 MB
Format: Adobe PDF

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.