Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01p8418r442
Full metadata record
dc.contributor.advisor: Kung, Sun-Yuan SY
dc.contributor.author: Hou, Zejiang
dc.contributor.other: Electrical and Computer Engineering Department
dc.date.accessioned: 2022-12-02T20:55:18Z
dc.date.available: 2022-12-02T20:55:18Z
dc.date.created: 2022-01-01
dc.date.issued: 2022
dc.identifier.uri: http://arks.princeton.edu/ark:/88435/dsp01p8418r442
dc.description.abstract: Deep learning has achieved broad breakthroughs in numerous real-world applications, driven by the advancement of larger models and the explosive growth and availability of data. However, deep learning models usually incur excessive computational and memory costs that hinder practical deployment on mobile or edge devices. Moreover, deep learning models face challenges in learning and adapting rapidly from only a few examples to solve new tasks. Hence, this thesis proposes techniques to learn computationally efficient model architectures and methods to improve few-shot learning ability. We start with subspace analysis methods applied to the feature selection problem. We then extend these methods to deep neural network structural learning (SL), with the objective of removing redundant parameters to obtain an optimal down-sized model that retains or even improves accuracy. A more efficient SL method based on a hybrid pruning-regrowing technique and a more general SL method that reduces the model along many more dimensions are also introduced. Going beyond static model designs, we also present dynamic neural network approaches that adapt model weights and architectures to different inputs on the fly during inference, controlling computational efficiency and improving representation ability. Apart from model efficiency, we also present techniques to train models that rapidly generalize from a few examples. We propose a few-shot architecture adaptation method that customizes task-specific model structures for diverse few-shot tasks by meta-learning a task-aware architecture controller. Unlike traditional neural architecture search (NAS) methods, which require a separate search cost on each new task, our method directly generates the task-specific model structure from the dataset in GPU minutes after a one-time meta-training cost. Finally, we propose a cross-modality self-supervised learning framework based on masked image pretraining on language-assisted representations. The resulting models produce high-quality transferable representations that advance accuracy on numerous computer vision tasks and demonstrate strong robustness to adversarial and out-of-distribution samples. Moreover, the resulting models are amenable to structural learning for greater computational efficiency gains and to low-resource task adaptation for better data efficiency.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.publisher: Princeton, NJ : Princeton University
dc.subject.classification: Electrical engineering
dc.title: Model and Data Efficiency in Deep Learning
dc.type: Academic dissertations (Ph.D.)
pu.date.classyear: 2022
pu.department: Electrical and Computer Engineering
Appears in Collections: Electrical Engineering

Files in This Item:
File: Hou_princeton_0181D_14321.pdf (31.45 MB, Adobe PDF)


Items in DataSpace are protected by copyright, with all rights reserved, unless otherwise indicated.