Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01h989r5537
Title: Improving Noise Robustness in Automatic Speaker Recognition Systems
Authors: Smith, Jamie
Advisors: Moretti, Christopher
Department: Computer Science
Class Year: 2015
Abstract: Most automatic speaker recognition systems perform poorly on data with low signal-to-noise ratios (SNRs). In this paper, we analyze the performance of Spear, an open-source, comprehensive speaker recognition toolkit, on speech utterances from the TIMIT corpus injected with background noise. We suggest and implement changes for the voice activity detection (VAD) and feature extraction steps of the Spear toolchain to improve its overall noise robustness. Specifically, we propose replacing Spear's simple VAD, which classifies frame-level energy into two groups, with a new VAD, which uses a posteriori signal-to-noise weighted energy distance. For feature extraction, we consider the effectiveness of using gammatone frequency cepstral coefficients (GFCCs) instead of traditional mel-scale frequency cepstral coefficients (MFCCs). We prove the superiority of GFCCs for data with low SNRs by incorporating GFCC feature extraction in the Spear toolchain and then testing it on the noisy TIMIT data. Then, we further propose a new, modified version of MFCCs that is even more noise-robust than GFCCs.
Extent: 60 pages
URI: http://arks.princeton.edu/ark:/88435/dsp01h989r5537
Type of Material: Princeton University Senior Theses
Language: en_US
Appears in Collections: Computer Science, 1987-2023

Files in This Item:
File: PUTheses2015-Smith_Jamie.pdf
Size: 413.62 kB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.