Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01p5547v57j
Title: Controllable Perturbation Generation via Interpolations with Variational Autoencoders
Authors: Sheng, Jenny
Advisors: Chen, Danqi
Department: Computer Science
Class Year: 2022
Abstract: Although state-of-the-art pretrained language models already achieve very high performance on benchmark datasets, many lack robustness. Towards the goal of improving model generalization, controllable perturbations play an integral role in expanding the original set of inputs and diagnosing models with a wider spectrum of data. However, existing work on controllable perturbations is limited to a fixed set of control codes. In this thesis, we propose a VAE-based controllable perturbation generation framework that creates fluent, varied, and interesting transformations that are not restricted by any control codes. Our few-shot generation method uses pairs of example transformations to manipulate inputs directly in the latent space so that they move towards the desired transformation. Because movement in the latent space is unrestricted, this approach offers more flexibility: manipulations are defined with pairs of examples instead of being specified in advance during pretraining. We demonstrate the effectiveness of our generation framework through style transfer evaluation and two applications: contrast set creation and data augmentation. Through improved binary classification accuracy, we show that our VAE controllable perturbation generation framework is an effective tool for understanding model weaknesses and improving model performance on out-of-domain datasets.
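Illustration: the abstract describes moving an input through the latent space using pairs of example transformations. The sketch below shows one common way to do this kind of example-driven manipulation, which is to average the latent offsets of the example pairs and add a scaled copy to the input's latent code. It is an assumption for illustration, not necessarily the procedure used in the thesis. The function names, the `alpha` step size, and the toy 2-D vectors (standing in for VAE encoder outputs) are all hypothetical.

```python
def transformation_direction(source_latents, target_latents):
    """Average latent offset from each source example to its paired target."""
    n = len(source_latents)
    dim = len(source_latents[0])
    return [
        sum(t[i] - s[i] for s, t in zip(source_latents, target_latents)) / n
        for i in range(dim)
    ]


def perturb(z, direction, alpha=1.0):
    """Move latent code z along direction; alpha=0 leaves z unchanged."""
    return [zi + alpha * di for zi, di in zip(z, direction)]


# Toy 2-D latent codes standing in for encoder outputs (hypothetical values).
src = [[0.0, 0.0], [1.0, 1.0]]   # latents of the original examples
tgt = [[1.0, 0.0], [2.0, 1.0]]   # latents of their transformed counterparts
direction = transformation_direction(src, tgt)
z_new = perturb([0.5, 0.5], direction, alpha=0.5)
```

In a full pipeline, `z_new` would be passed to the VAE decoder to produce the perturbed text. Varying `alpha` between 0 and 1 would interpolate between the original input and the fully shifted one.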
URI: http://arks.princeton.edu/ark:/88435/dsp01p5547v57j
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Computer Science, 1987-2023

Files in This Item:
File: SHENG-JENNY-THESIS.pdf
Size: 1.74 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.