Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01ft848t75h
Full metadata record
DC Field: Value
dc.contributor.advisor: Engelhardt, Barbara E
dc.contributor.author: Jo, Brian
dc.contributor.other: Quantitative Computational Biology Department
dc.date.accessioned: 2021-10-04T13:47:59Z
dc.date.available: 2021-10-04T13:47:59Z
dc.date.created: 2021-01-01
dc.date.issued: 2021
dc.identifier.uri: http://arks.princeton.edu/ark:/88435/dsp01ft848t75h
dc.description.abstract: As available human sequencing data grows exponentially, we are gaining a detailed understanding of the genetic basis of variation in gene expression. One route to this understanding is genome-wide association mapping of distal expression quantitative trait loci (distal eQTLs). Unlike the mapping of cis-eQTLs, the mapping of distal eQTLs faces numerous challenges, including weak signal and a large number of hypotheses (both contributing to low power), as well as numerous and often cryptic sources of confounding. Despite these challenges, an extensive repository of human distal eQTLs can prove highly valuable: when used in conjunction with cis-eQTLs, GWAS of complex traits, gene networks and, more recently, single-cell datasets, it can yield a highly detailed picture of gene regulation. This work first presents the most extensive mapping of human distal eQTLs to date and explores approaches to improve their quality. Using the Genotype-Tissue Expression (GTEx) dataset, with 838 donors and samples across 49 human tissues, I present a repository of over 5000 distal eQTLs across multiple tissues. This work was one of the major contributions to the GTEx consortium publications, which contribute to the field of human genomics an extensive public repository of genotype, expression, and other phenotype data, along with summary statistics. This work also explores the relationship among the cis-eGene, the trans-eGene, and sample heterogeneity, or cell type information when available. I show that in many reported distal eQTLs, sample heterogeneity can play a regulatory role, while in other cases a relationship between cis-eGene and trans-eGene can be narrowed down to specific cell types. Finally, I take a statistical approach and analyze the landscape of association statistics, especially in the context of covariance structures among variants, genes and tissues.
I find that if these covariance structures are properly taken into account, the significance testing procedure can yield results that depart substantially from those obtained with common methods, which treat all tissues, genes and variants identically. Together, these results contribute to human genetics research by identifying potential issues with the ever-increasing collection of human distal eQTLs and ways to address them.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.publisher: Princeton, NJ : Princeton University
dc.relation.isformatof: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: http://catalog.princeton.edu
dc.subject: eQTL
dc.subject: False Discovery Rate
dc.subject: GTEx
dc.subject: GWAS
dc.subject: Human Genetics
dc.subject: Statistical Genetics
dc.subject.classification: Genetics
dc.subject.classification: Bioinformatics
dc.title: Analysis of Distal eQTLs across Multiple Human Tissues and Methods to Improve Their Quality
dc.type: Academic dissertations (Ph.D.)
pu.date.classyear: 2021
pu.department: Quantitative Computational Biology
Appears in Collections: Quantitative Computational Biology

Files in This Item:
File: Jo_princeton_0181D_13795.pdf (Size: 10.5 MB, Format: Adobe PDF)
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.