Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01h989r5537
Title: Improving Noise Robustness in Automatic Speaker Recognition Systems
Authors: Smith, Jamie
Advisors: Moretti, Christopher
Department: Computer Science
Class Year: 2015
Abstract: Most automatic speaker recognition systems perform poorly on data with low signal-to-noise ratios (SNRs). In this paper, we analyze the performance of Spear, an open-source, comprehensive speaker recognition toolkit, on speech utterances from the TIMIT corpus injected with background noise. We suggest and implement changes for the voice activity detection (VAD) and feature extraction steps of the Spear toolchain to improve its overall noise robustness. Specifically, we propose replacing Spear's simple VAD, which classifies frame-level energy into two groups, with a new VAD, which uses a posteriori signal-to-noise weighted energy distance. For feature extraction, we consider the effectiveness of using gammatone frequency cepstral coefficients (GFCCs) instead of traditional mel-scale frequency cepstral coefficients (MFCCs). We prove the superiority of GFCCs for data with low SNRs by incorporating GFCC feature extraction in the Spear toolchain and then testing it on the noisy TIMIT data. Then, we further propose a new, modified version of MFCCs that is even more noise-robust than GFCCs.
Extent: 60 pages
URI: http://arks.princeton.edu/ark:/88435/dsp01h989r5537
Type of Material: Princeton University Senior Theses
Language: en_US
Appears in Collections: Computer Science, 1988-2016
