Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01pz50h030q
Full metadata record
DC Field | Value
dc.contributor.advisor | Fan, Jianqing
dc.contributor.advisor | Liu, Han
dc.contributor.author | Yang, Zhuoran
dc.contributor.other | Operations Research and Financial Engineering Department
dc.date.accessioned | 2022-10-10T19:49:53Z
dc.date.available | 2022-10-10T19:49:53Z
dc.date.created | 2022-01-01
dc.date.issued | 2022
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp01pz50h030q
dc.description.abstract | Modern machine learning methods combine complex statistical models and algorithms, scalable computing architectures, large amounts of training data, and substantial computational resources. Such requirements can be prohibitive in applications where data acquisition is costly or the computation budget is limited. Thus, to democratize machine learning technology, it is pivotal to gain deeper insight into questions about data and computation. In this thesis, we aim to (a) delineate the fundamental limits involving statistical accuracy and computational efficiency in heterogeneous statistical models, and (b) characterize the statistical and computational performance of reinforcement learning algorithms with neural network models. In Chapter 2, we study the fundamental tradeoffs between statistical accuracy and computational tractability in the analysis of high-dimensional heterogeneous models, including the sparse Gaussian mixture model and the mixture of sparse linear regressions. We exploit an oracle-based computational model to establish conjecture-free, computationally feasible minimax lower bounds, which show that there exist significant gaps between computationally feasible minimax risks and the classical ones. These gaps quantify the statistical price we must pay to achieve computational tractability in the presence of data heterogeneity. Furthermore, in Chapter 3, we make the first attempt to understand the deep Q-network (DQN) algorithm (Mnih et al., 2015) theoretically, from both algorithmic and statistical perspectives. Specifically, we focus on a slight simplification of DQN that fully captures its key features. Under mild assumptions, we establish the algorithmic and statistical rates of convergence for the action-value functions obtained by DQN. In particular, the statistical error characterizes the bias and variance that arise from approximating the action-value function with a deep neural network, while the algorithmic error converges to zero at a geometric rate. As a byproduct, our analysis justifies the techniques of experience replay and the target network, which are crucial to the empirical success of DQN. (An illustrative sketch of this simplified DQN iteration is given after the metadata record below.)
dc.format.mimetype | application/pdf
dc.language.iso | en
dc.publisher | Princeton, NJ : Princeton University
dc.relation.isformatof | The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: http://catalog.princeton.edu
dc.subject | Deep Q-Learning
dc.subject | Gaussian Mixture Models
dc.subject | Markov Game
dc.subject | Reinforcement Learning
dc.subject | Statistical-Computational Tradeoff
dc.subject.classification | Statistics
dc.title | Topics in the Statistical and Computational Complexities of Modern Machine Learning
dc.type | Academic dissertations (Ph.D.)
pu.date.classyear | 2022
pu.department | Operations Research and Financial Engineering
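
Note on Chapter 3 (illustrative sketch): the abstract describes a slight simplification of DQN in which a frozen target network defines the regression targets and experience replay supplies the transitions used to refit the action-value function. The sketch below is only an assumed illustration of that fitted Q-iteration structure, written against a hypothetical toy MDP with a linear (one-hot) function class instead of the deep ReLU networks the thesis analyzes; the environment, names, and parameter choices here are illustrative stand-ins, not the thesis's construction.

```python
# Minimal sketch of fitted Q-iteration with experience replay and a frozen
# target network, i.e. the kind of "slight simplification of DQN" the abstract
# alludes to. Everything here (toy MDP, one-hot features, linear fit) is an
# illustrative assumption, NOT the deep-network setting studied in the thesis.
import numpy as np

rng = np.random.default_rng(0)

# --- toy MDP: random transition kernel and rewards (assumed for illustration) ---
n_states, n_actions, gamma = 20, 4, 0.9
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a] is a distribution over next states
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

# --- replay buffer: transitions (s, a, r, s') collected in advance ---
buffer_size = 5000
S  = rng.integers(n_states, size=buffer_size)
A  = rng.integers(n_actions, size=buffer_size)
Rw = R[S, A]
S2 = np.array([rng.choice(n_states, p=P[s, a]) for s, a in zip(S, A)])

# --- linear-in-features Q-function (stand-in for a deep ReLU network) ---
def features(s, a):
    phi = np.zeros(n_states * n_actions)
    phi[s * n_actions + a] = 1.0          # one-hot (tabular) features
    return phi

def q_values(theta, s):
    return np.array([features(s, a) @ theta for a in range(n_actions)])

theta = np.zeros(n_states * n_actions)     # current Q-function parameters
K, batch = 50, 512                         # outer iterations, minibatch size

for k in range(K):
    theta_target = theta.copy()            # target network: frozen copy of current Q
    idx = rng.integers(buffer_size, size=batch)   # sample a minibatch from the replay buffer
    # bootstrapped regression targets built from the *target* parameters
    y = Rw[idx] + gamma * np.array([q_values(theta_target, s2).max() for s2 in S2[idx]])
    X = np.stack([features(s, a) for s, a in zip(S[idx], A[idx])])
    # least-squares refit of Q(s, a) to the bootstrapped targets
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)

greedy_policy = np.array([q_values(theta, s).argmax() for s in range(n_states)])
print("greedy actions:", greedy_policy)
```

In this picture, the per-iteration fitting error plays the role of the statistical error described in the abstract, while repeatedly applying the discounted Bellman backup against a frozen target is what drives the algorithmic error toward zero at a geometric rate.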
Appears in Collections: Operations Research and Financial Engineering

Files in This Item:
File | Size | Format
Yang_princeton_0181D_14037.pdf | 1.86 MB | Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.