Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01bn9999982
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Engelhardt, Barbara E |
dc.contributor.author | Jones, Andrew |
dc.contributor.other | Computer Science Department |
dc.date.accessioned | 2023-03-06T22:54:52Z | -
dc.date.available | 2023-03-06T22:54:52Z | -
dc.date.created | 2022-01-01 |
dc.date.issued | 2023 |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp01bn9999982 | -
dc.description.abstract | Modern biomedical datasets, from molecular measurements of gene expression to pathology images, hold promise for discovering new therapeutics and probing basic questions about the behavior of cells. Thoughtful statistical modeling of these complex, high-dimensional data is crucial for obtaining robust scientific findings. A common assumption in data analysis is that the data samples are independent and identically distributed. However, this assumption is nearly always violated in practice. This is especially true of biomedical data, which often exhibit structure such as subgroups of patients, cells, or tissue types, or other correlations among the samples. In this body of work, I propose data analysis and experimental design frameworks to account for several types of highly structured biomedical data. These approaches, which take the form of Bayesian models and associated inference algorithms, are specifically tailored for datasets with group structure, multiple data modalities, and spatial organization of samples. In the first line of work, I propose a model for contrastive dimension reduction that decomposes the sources of variation in samples that belong to case and control conditions. Second, I propose a computational framework for aligning spatially resolved genomics data into a common coordinate system that accounts for spatial correlation among the samples and models multiple data modalities. Finally, I propose a family of methods for optimally designing spatially resolved genomics experiments that is tailored to the highly structured data collection process of these studies. Together, this body of work advances the field of biomedical data analysis by developing models that directly exploit common types of structure within these data and demonstrating the advantage of these modeling approaches across an array of data types. |
dc.format.mimetype | application/pdf |
dc.language.iso | en |
dc.publisher | Princeton, NJ : Princeton University |
dc.subject.classification | Bioinformatics |
dc.subject.classification | Biostatistics |
dc.subject.classification | Statistics |
dc.title | Probabilistic models for structured biomedical data |
dc.type | Academic dissertations (Ph.D.) |
pu.date.classyear | 2023 |
pu.department | Computer Science |
Appears in Collections: Computer Science

Files in This Item:
File | Description | Size | Format
Jones_princeton_0181D_14396.pdf | | 23.64 MB | Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.