Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp015h73q0433
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Narasimhan, Karthik |
dc.contributor.author | Yao, Shunyu |
dc.contributor.other | Computer Science Department |
dc.date.accessioned | 2024-07-24T16:32:21Z | -
dc.date.available | 2024-07-24T16:32:21Z | -
dc.date.created | 2024-01-01 |
dc.date.issued | 2024 |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp015h73q0433 | -
dc.description.abstract | Building autonomous agents to interact with the world lies at the core of artificial intelligence (AI). This thesis introduces "language agents", a new category of agents that utilize large language models (LLMs) to reason to act, marking a departure from traditional agents via extensive rule design or learning. It is developed in three parts: Part I motivates the necessity for language agents by introducing a new set of AI problems and benchmarks based on interaction with large-scale, real-world computer environments, such as the Internet or code interfaces. These "digital automation" tasks present tremendous value for alleviating tedious labor and improving our lives, yet pose significant challenges for prior agent or LLM methods in decision-making over open-ended natural language and long horizon, calling for new methodologies. Part II lays the methodological foundation for language agents, where the key idea is to apply LLM reasoning for versatile and generalizable agent acting and planning, which also augments LLM reasoning to be more grounded and deliberate via external feedback and internal control. We show language agents can solve a diversity of language and agent tasks (especially digital automation tasks proposed in Part I), with notable improvements over prior LLM-based methods and traditional agents. Part III consolidates insights from Parts I and II and outlines a principled framework for language agents. The framework provides modular abstractions to organize various LLM-based methods as agents, to understand their gaps from human cognition, and to inspire and develop new methods towards general-purpose autonomous agents. From foundational empirical tasks and methods to a unifying conceptual framework, this thesis establishes the study of language agents as a distinct and rigorously defined field at the frontier of AI research. |
dc.format.mimetype | application/pdf |
dc.language.iso | en |
dc.publisher | Princeton, NJ : Princeton University |
dc.subject | Cognitive Science |
dc.subject | Digital Automation |
dc.subject | Language Agents |
dc.subject | Language Models |
dc.subject | Natural Language Processing |
dc.subject | Reinforcement Learning |
dc.subject.classification | Artificial intelligence |
dc.title | Language Agents: From Next-Token Prediction to Digital Automation |
dc.type | Academic dissertations (Ph.D.) |
pu.date.classyear | 2024 |
pu.department | Computer Science |
Appears in Collections: Computer Science

Files in This Item:
File | Description | Size | Format
Yao_princeton_0181D_15086.pdf | | 14.61 MB | Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.