Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp013197xq26c
Full metadata record
dc.contributor.advisor: Arora, Sanjeev
dc.contributor.author: Luo, Yuping
dc.contributor.other: Computer Science Department
dc.date.accessioned: 2022-10-10T19:50:38Z
dc.date.available: 2022-10-10T19:50:38Z
dc.date.created: 2022-01-01
dc.date.issued: 2022
dc.identifier.uri: http://arks.princeton.edu/ark:/88435/dsp013197xq26c
dc.description.abstract: Recent advances in deep reinforcement learning have demonstrated its great potential for real-world problems. However, two concerns prevent reinforcement learning from being widely applied: efficiency and efficacy. This dissertation studies how to improve the efficiency and efficacy of reinforcement learning by designing deep model-based algorithms. Access to a dynamics model empowers an algorithm to plan, which is key to sequential decision making. This dissertation covers four topics: online reinforcement learning, the expressivity of neural networks in deep reinforcement learning, offline reinforcement learning, and safe reinforcement learning. For online reinforcement learning, we present an algorithmic framework with theoretical guarantees that utilizes a lower bound on the real-environment performance of a policy trained in the learned environment, and we empirically verify the efficiency of the proposed method. For the expressivity of neural networks in deep reinforcement learning, we prove that in some scenarios model-based approaches can require much less representation power than model-free approaches to approximate a near-optimal policy, and we empirically show that this can be an issue in simulated robotics environments and that a model-based planner can help. For offline reinforcement learning, we devise an algorithm that keeps the policy close to the provided expert demonstration set to reduce distribution shift, and we conduct experiments demonstrating that our method improves the success rate of robotic arm manipulation tasks in simulated environments. For safe reinforcement learning, we propose a method that uses the learned dynamics model to certify safe states; our experiments show that it can learn a decent policy without a single safety violation during training on a set of simple but challenging tasks, while baseline algorithms incur hundreds of safety violations.
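The online-RL framework described in the abstract rests on a conservative estimate: the return a policy achieves in the learned model, minus a penalty that grows with model error, lower-bounds its real-environment return. The toy below is a minimal sketch of that idea only, not the dissertation's actual algorithm; the 1-D dynamics, the policy, and the penalty coefficient `lam` are all hypothetical choices for illustration.

```python
import numpy as np

def rollout_return(step_fn, policy, s0=0.0, horizon=20, gamma=0.99):
    """Discounted return of `policy` under dynamics `step_fn`."""
    s, ret, disc = s0, 0.0, 1.0
    for _ in range(horizon):
        a = policy(s)
        s = step_fn(s, a)
        ret += disc * -(s - 1.0) ** 2   # reward: drive the state toward 1
        disc *= gamma
    return ret

true_step    = lambda s, a: s + 0.10 * a   # real environment (unknown to the agent)
learned_step = lambda s, a: s + 0.09 * a   # slightly wrong learned model
policy       = lambda s: float(np.clip(1.0 - s, -1.0, 1.0))

# One-step model error over a grid of states, standing in for a
# discrepancy measure over the visited state distribution.
states = np.linspace(-1.0, 2.0, 31)
eps = max(abs(true_step(s, policy(s)) - learned_step(s, policy(s))) for s in states)

lam = 50.0                                 # penalty coefficient (hypothetical)
model_ret   = rollout_return(learned_step, policy)
lower_bound = model_ret - lam * eps        # conservative performance estimate
real_ret    = rollout_return(true_step, policy)

print(f"model return {model_ret:.3f}, lower bound {lower_bound:.3f}, "
      f"real return {real_ret:.3f}")
```

Optimizing the policy against such a penalized objective (rather than the raw model return) is what makes improvement in the learned environment translate into guaranteed improvement in the real one.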
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.publisher: Princeton, NJ : Princeton University
dc.relation.isformatof: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: <a href=http://catalog.princeton.edu>catalog.princeton.edu</a>
dc.subject: Deep Learning
dc.subject: Machine Learning
dc.subject: Reinforcement Learning
dc.subject.classification: Computer science
dc.subject.classification: Artificial intelligence
dc.title: Towards Efficient and Effective Deep Model-based Reinforcement Learning
dc.type: Academic dissertations (Ph.D.)
pu.date.classyear: 2022
pu.department: Computer Science
Appears in Collections: Computer Science

Files in This Item:
Luo_princeton_0181D_14201.pdf (4.11 MB, Adobe PDF)


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.