Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01w3763950z
Title: Profile Hidden Markov Models for the Detection of Non-Ribosomal Peptide Synthetases within Metagenomic Sequence Data
Authors: Chang, Allison
Advisors: Abou Donia, Mohamed
Department: Computer Science
Certificate Program: Engineering Biology Program
Class Year: 2018
Abstract: Current algorithms use profile Hidden Markov Models (pHMMs) for the automated detection of biosynthetic gene clusters (BGCs) in microbial genomes. However, these algorithms rely on well-assembled genomic input and cannot tolerate unassembled or mixed genomes. Here, we evaluated the performance of available pHMMs in identifying the Condensation domain of non-ribosomal peptide synthetases (NRPSs) in metagenomic sequence data from healthy American patients. We found that sensitivity increased from 56.22% using the pHMMs to >70% using several segments of the pHMM (spHMMs). The spHMMs outperformed the original pHMM from which they were built, showing that several conserved regions of residues within a model are more essential than others for detecting Condensation domains. Our results provide a rapid, less computationally expensive detection method for profiling the biosynthetic capacity of a large-scale cohort.
URI: http://arks.princeton.edu/ark:/88435/dsp01w3763950z
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Computer Science, 1987-2023

Files in This Item:
File: CHANG-ALLISON-THESIS.pdf
Size: 1.16 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.