Title: Promises and Pitfalls of Generative AI: An AI-Safety Centric Approach
Authors: Sehwag, Vikash
Advisors: Mittal, Prateek; Chiang, Mung
Contributors: Electrical and Computer Engineering Department
Keywords: Adversarial Robustness
Artificial Intelligence Safety
Deep Neural Networks
Generative Artificial Intelligence
Privacy Risks
Trustworthy Machine Learning
Subjects: Computer engineering
Electrical engineering
Computer science
Issue Date: 2023
Publisher: Princeton, NJ : Princeton University
Abstract: Artificial intelligence (AI) has advanced rapidly, leading to remarkable progress across numerous real-world applications. However, the prevalence of AI-enabled decisions also raises concerns about potential safety risks, as AI systems are known to exhibit failure cases across multiple domains, such as autonomous driving, medical diagnostics, and content moderation. In this thesis, we investigate AI safety challenges through the lens of generative models, a class of machine learning models capable of approximating the underlying distribution of training datasets and synthesizing novel samples. By bridging the gap between generative models and AI safety, we reveal the immense potential of generative models in addressing safety challenges, while also identifying safety risks posed by contemporary generative models. First, we focus on improving generalization in adversarially robust learning by incorporating generative models into existing machine learning pipelines and distilling their knowledge through synthesized images. We assess various generative models and propose a new metric (ARC), based on the indistinguishability of adversarially perturbed synthetic and real data, to accurately determine the generalization benefit of different generative models. Next, we investigate task-aware knowledge distillation from generative models, where we first demonstrate that individual synthetic images contribute unequally to improving generalization. We then propose an adaptive sampling technique that guides the sampling process in diffusion models to maximize the generalization benefit of the generated images. We also address the shortcomings of long-tailed data distributions, which underlie numerous challenges in AI safety, by using generative models to generate high-fidelity samples from low-density regions. We propose a novel low-density sampling process for diffusion models, guiding the process towards low-density regions while maintaining fidelity, and rigorously demonstrate that our process successfully generates novel high-fidelity samples from low-density regions. Finally, we demonstrate some of the key limitations of existing generative models. We first consider the outlier detection task and demonstrate the shortcomings of modern generative models in solving it. In light of these findings, we propose SSD, an unsupervised framework for outlier detection based on unlabeled in-distribution data. We further uncover that modern diffusion models, which are used by millions of users, compromise the privacy of their training data: we extract a nontrivial number of training images from pre-trained diffusion models. In summary, this thesis addresses multiple AI safety challenges and provides a comprehensive framework for the safety and reliability of AI systems under the new generative AI paradigm.
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Electrical Engineering

Files in This Item:
File: Sehwag_princeton_0181D_14608.pdf
Size: 38.16 MB
Format: Adobe PDF

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.