Title: Personal Sound Zone Rendering with Listener Individualization and Head Tracking
Authors: Qiao, Yue
Advisors: Choueiri, Edgar Y.
Contributors: Mechanical and Aerospace Engineering Department
Keywords: array signal processing; head tracking; listener individualization; personal sound zones; sound field control; spatial audio
Subjects: Acoustics; Electrical engineering
Issue Date: 2024
Publisher: Princeton, NJ : Princeton University
Abstract:

The impact of listener variability factors on the performance of personal sound zone (PSZ) rendering is investigated, and methods for incorporating listener individualization and head tracking into PSZ systems are proposed. PSZ rendering delivers different audio programs to listeners in the same space with minimal interference, using loudspeakers equipped with digital filters. A significant challenge in PSZ rendering is the mismatch between the acoustic transfer functions (ATFs) used in filter design and those in the actual rendering environment, particularly due to listener variability factors, which had not been previously studied. Two categories of such factors are considered: variations in listeners' anthropometric features (e.g., head and torso shapes) and listener head movements.

For static PSZs, the impact of individualizing binaural room transfer functions (BRTFs) on PSZ performance is experimentally investigated using metrics such as Inter-Zone Isolation and Inter-Program Isolation. Results show that individualizing BRTFs can significantly improve isolation, but at the cost of reduced robustness against head misalignments. Additionally, an inter-listener BRTF coupling effect is observed: when a single listener's BRTFs are mismatched, performance can degrade for both listeners.

For head-tracked PSZ rendering, requirements for the spatial sampling of BRTFs with respect to listener head translations and rotations are explored. The required sampling resolution is found to depend on factors such as the moving listener's position, the frequency band of the rendered audio, and perturbations caused by the other listener. Furthermore, a deep learning-based framework using a spatially adaptive neural network (SANN) is proposed and evaluated. The SANN model takes head positions as input and outputs the corresponding PSZ filter coefficients. It can be trained either with simulated ATFs and data augmentation, for robustness in uncertain environments, or with a mix of simulated and measured ATFs, for customization under known conditions. Compared to traditional filter design methods, SANN is shown to provide similar or better isolation performance in unknown rendering environments while being more efficient in computation and storage, making it suitable for real-time rendering of PSZs that adapt to listeners' concurrent head movements. (Illustrative code sketches of the filter design, the head-tracked filter lookup, and the SANN mapping follow at the end of this record.)
URI: http://arks.princeton.edu/ark:/88435/dsp01cj82kb685
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Mechanical and Aerospace Engineering
Files in This Item:
File | Description | Size | Format
---|---|---|---
Qiao_princeton_0181D_15317.pdf | | 44.89 MB | Adobe PDF
Items in DataSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
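---

The abstract describes PSZ filters designed from ATFs and evaluated with isolation metrics. As a minimal sketch of the general idea, and not the dissertation's actual method, the following assumes a single frequency bin, random placeholder ATF matrices, and Tikhonov-regularized pressure matching (a standard PSZ filter design approach); Inter-Zone Isolation is approximated here simply as the bright-to-dark ratio of mean squared pressure.

```python
import numpy as np

# Hypothetical setup: 8 loudspeakers, 16 control points per zone, one frequency bin.
n_src, n_ctl = 8, 16
rng = np.random.default_rng(0)

# Placeholder complex ATF matrices (in practice, measured or simulated BRTFs/ATFs):
# rows = control points in a zone, columns = loudspeakers.
G_bright = rng.standard_normal((n_ctl, n_src)) + 1j * rng.standard_normal((n_ctl, n_src))
G_dark = rng.standard_normal((n_ctl, n_src)) + 1j * rng.standard_normal((n_ctl, n_src))

# Pressure-matching targets: unit pressure in the bright zone, silence in the dark zone.
p_target = np.concatenate([np.ones(n_ctl), np.zeros(n_ctl)]).astype(complex)

# Tikhonov-regularized least squares: w = argmin ||G w - p||^2 + beta ||w||^2.
G = np.vstack([G_bright, G_dark])
beta = 1e-2
w = np.linalg.solve(G.conj().T @ G + beta * np.eye(n_src), G.conj().T @ p_target)

# A simple isolation proxy: bright-zone vs. dark-zone mean squared pressure, in dB.
izi_db = 10 * np.log10(np.mean(np.abs(G_bright @ w) ** 2) / np.mean(np.abs(G_dark @ w) ** 2))
print(f"Inter-zone isolation (proxy): {izi_db:.1f} dB")
```

The mismatch problem the abstract highlights corresponds to evaluating `w` against ATF matrices that differ from `G_bright` and `G_dark`, e.g., those of an individualized or displaced listener.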
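The spatial-sampling findings concern how densely filters (or BRTFs) must be sampled over head positions. Below is a toy sketch of the lookup step in a conventional head-tracked renderer, with an assumed one-dimensional translation grid and placeholder filters; the dissertation's actual grids span both translations and rotations.

```python
import numpy as np

# Assumed 1-D grid of head translations: +/-10 cm sampled every 5 cm.
grid_x = np.linspace(-0.10, 0.10, 5)

# Placeholder filter bank: one set of 256-tap FIR filters per grid point and per
# loudspeaker (8 here); real banks come from per-position filter design.
rng = np.random.default_rng(1)
filter_bank = rng.standard_normal((len(grid_x), 8, 256))

def filters_for_head(x_head: float) -> np.ndarray:
    """Nearest-neighbor lookup of pre-computed PSZ filters for a tracked head."""
    idx = int(np.argmin(np.abs(grid_x - x_head)))
    return filter_bank[idx]

# A head tracked 3.3 cm off-center snaps to the nearest 5 cm grid point; whether
# that coarseness is acceptable depends on the rendered frequency band, the
# listener's position, and the other listener, as the abstract notes.
print(filters_for_head(0.033).shape)  # -> (8, 256)
```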
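The SANN framework instead maps head positions directly to filter coefficients. Here is a minimal PyTorch sketch of that input-output mapping, with assumed pose encoding, layer sizes, and filter dimensions; the dissertation specifies the actual architecture, training data, and loss.

```python
import torch
import torch.nn as nn

N_SRC, N_TAPS = 8, 256  # assumed: 8 loudspeakers, 256-tap FIR filters per source

class SANN(nn.Module):
    """Toy spatially adaptive network: head pose in, PSZ filter taps out."""

    def __init__(self, pose_dim: int = 6, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(pose_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, N_SRC * N_TAPS),  # one FIR filter per loudspeaker
        )

    def forward(self, pose: torch.Tensor) -> torch.Tensor:
        # pose: (batch, 6) = head translation (x, y, z) + rotation (yaw, pitch, roll)
        return self.net(pose).view(-1, N_SRC, N_TAPS)

model = SANN()
pose = torch.zeros(1, 6)      # nominal (centered) head position
filters = model(pose)         # (1, 8, 256): taps to convolve with each program
print(filters.shape)
```

Amortizing filter design into a single forward pass, rather than storing a dense pre-computed filter grid, is one way to read the abstract's claim of improved computation and storage efficiency for real-time, movement-adaptive rendering.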