Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp015999n6013
Title: Comparing Normalization Methods for Differential Expression Analysis on RNA-Sequence Data from Autism Samples
Authors: Yu, Susanna
Advisors: Fan, Jianqing
Department: Operations Research and Financial Engineering
Class Year: 2017
Abstract: Identifying genetic variation and differential gene expression in Autism patients has been one of the main efforts to understand the underlying causes of Autism spectrum disorder. RNA-sequencing of brain tissue samples from Autism and Control patients yields gene-level read counts, which must be normalized across the Autism and Control samples to remove within-lane and between-lane biases in the data. The subsequent identification of differentially expressed genes, which requires an analysis of variance approach, is highly sensitive to normalization. This study compares three normalization methods: EDASeq, DESeq2 and TMM. The methods are compared on the reproducibility of the differential gene expression results obtained from multiple bootstrap samples normalized by each method. The ultimate goal is to identify a more reliable normalization technique for the RNA-sequence data and to identify genes that are differentially expressed between the Autism and Control samples. By comparing gene frequencies and ranks, we found that DESeq2 normalization results in the highest level of reproducibility and EDASeq in the lowest, with TMM performing similarly to DESeq2. Biological interpretations of the differentially expressed genes are discussed.
URI: http://arks.princeton.edu/ark:/88435/dsp015999n6013
Type of Material: Princeton University Senior Theses
Language: en_US
Appears in Collections: Operations Research and Financial Engineering, 2000-2024

Files in This Item:
File: yu_susanna.pdf
Size: 889.25 kB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.