Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01zk51vm00t
Title: Building Efficient Deep Neural Networks
Authors: Liu, Yuchen
Advisors: Kung, Sun-Yuan
Wentzlaff, David
Contributors: Electrical and Computer Engineering Department
Keywords: Computer Micro-Architecture
Computer Vision
Deep Neural Networks
Efficient Deep Learning
Subjects: Computer engineering
Issue Date: 2022
Publisher: Princeton, NJ : Princeton University
Abstract: Deep neural networks (DNNs) have powered a wide range of artificial intelligence (AI) applications. The widespread adoption of DNNs can be attributed to their high customizability for different tasks. Indeed, researchers have designed variants of DNNs for different applications, e.g., convolutional neural networks (CNNs) for visual recognition, generative adversarial networks (GANs) for image synthesis, and recurrent neural networks (RNNs) for time-series processing. These variants bear highly different network topologies and training objectives. Despite the success of DNNs, there is growing concern about their efficiency. Current DNNs are resource-hungry, which poses a hard barrier to their deployment on resource-limited edge devices. Moreover, the breadth of applications to which DNNs are applied greatly increases the difficulty of discovering efficient designs for the different variants. Owing to this diversity, it is hard to devise a generic approach that attains efficient DNNs with satisfactory performance across applications. In this dissertation, we address the challenge of designing efficient DNNs in different domains with a simple, intuitive, and effective notion: DNNs themselves are customized for different learning objectives, and so should be the approaches that enhance their efficiency. With this notion, we present methodologies to design efficient CNNs, GANs, and RNNs. We first introduce a CNN compression algorithm, class-discriminative compression (CDC), which fits seamlessly with CNNs' class-discriminative training objective and provides a 1.8× acceleration for ResNet-50 on ImageNet without accuracy loss. We then perform an in-depth study of channel pruning for CNN compression. Driven by the objective of classification accuracy, we propose an evolutionary framework that automatically discovers transferable pruning functions which outperform manual designs.
We further investigate a different application, image synthesis with GANs. Observing that a GAN is trained to synthesize realistic content, we pioneer a content-aware GAN compression method, which accelerates state-of-the-art models by 11× with negligible image quality loss. We finally expand our study to the domain of system design, where we aim to mitigate the memory wall by building efficient RNN data prefetchers. We develop an ML-architecture co-design strategy that speeds up state-of-the-art neural prefetchers by 15× with even better performance.
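To make the channel-pruning setting concrete, the following minimal sketch shows the conventional magnitude-based criterion that hand-designed pruning functions typically use (and that the dissertation's evolutionary framework aims to outperform). The function name `prune_channels` and the L1-norm scoring are illustrative assumptions, not the author's method.

```python
import numpy as np

def prune_channels(weight, keep_ratio=0.5):
    """Rank output channels of a conv weight tensor (out, in, kH, kW)
    by the L1 norm of their filters and keep the top fraction.
    Illustrative baseline criterion, not the dissertation's method."""
    # Per-channel importance score: sum of absolute filter weights
    scores = np.abs(weight).sum(axis=(1, 2, 3))
    k = max(1, int(round(keep_ratio * weight.shape[0])))
    # Indices of the k highest-scoring channels, in original order
    keep = np.sort(np.argsort(scores)[::-1][:k])
    return weight[keep], keep

# Toy conv layer: 8 output channels, 3 input channels, 3x3 kernels
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3))
pruned, kept = prune_channels(w, keep_ratio=0.5)
print(pruned.shape)  # (4, 3, 3, 3)
```

Removing whole channels (rather than individual weights) shrinks the dense tensor itself, which is why channel pruning yields real speedups on commodity hardware without sparse kernels.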
URI: http://arks.princeton.edu/ark:/88435/dsp01zk51vm00t
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Electrical Engineering

Files in This Item:
File: Liu_princeton_0181D_14348.pdf
Size: 23.99 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.