Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp015h73q0433
Title: Language Agents: From Next-Token Prediction to Digital Automation
Authors: Yao, Shunyu
Advisors: Narasimhan, Karthik
Contributors: Computer Science Department
Keywords: Cognitive Science
Digital Automation
Language Agents
Language Models
Natural Language Processing
Reinforcement Learning
Subjects: Artificial intelligence
Issue Date: 2024
Publisher: Princeton, NJ : Princeton University
Abstract: Building autonomous agents to interact with the world lies at the core of artificial intelligence (AI). This thesis introduces "language agents", a new category of agents that utilize large language models (LLMs) to reason to act, marking a departure from traditional agents via extensive rule design or learning. It is developed in three parts: Part I motivates the necessity for language agents by introducing a new set of AI problems and benchmarks based on interaction with large-scale, real-world computer environments, such as the Internet or code interfaces. These "digital automation" tasks present tremendous value for alleviating tedious labor and improving our lives, yet pose significant challenges for prior agent or LLM methods in decision-making over open-ended natural language and long horizon, calling for new methodologies. Part II lays the methodological foundation for language agents, where the key idea is to apply LLM reasoning for versatile and generalizable agent acting and planning, which also augments LLM reasoning to be more grounded and deliberate via external feedback and internal control. We show language agents can solve a diversity of language and agent tasks (especially digital automation tasks proposed in Part I), with notable improvements over prior LLM-based methods and traditional agents. Part III consolidates insights from Parts I and II and outlines a principled framework for language agents. The framework provides modular abstractions to organize various LLM-based methods as agents, to understand their gaps from human cognition, and to inspire and develop new methods towards general-purpose autonomous agents. From foundational empirical tasks and methods to a unifying conceptual framework, this thesis establishes the study of language agents as a distinct and rigorously defined field at the frontier of AI research.
URI: http://arks.princeton.edu/ark:/88435/dsp015h73q0433
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Computer Science

Files in This Item:
File: Yao_princeton_0181D_15086.pdf
Size: 14.61 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.