Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp01k0698b87r
Title: | Incorporating human plausibility in single- and multi-agent AI systems |
Authors: | Barnett, Samuel Alex |
Advisors: | Adams, Ryan P; Griffiths, Tom |
Contributors: | Computer Science Department |
Subjects: | Computer science |
Issue Date: | 2024 |
Publisher: | Princeton, NJ : Princeton University |
Abstract: | As AI systems play a progressively larger role in human affairs, it becomes more important that these systems are built with insights from human behavior. In particular, models that are developed on the principle of human plausibility are more likely to yield results that are accountable and interpretable, in a way that better ensures alignment between the behavior of the system and what its stakeholders want from it. In this dissertation, I present three projects that build on the principle of human plausibility for three distinct applications: (i) Plausible representations: I present the Priority-Adjusted Replay for Successor Representations (PARSR) algorithm, a single-agent reinforcement learning algorithm that brings together the ideas of prioritization-based replay and successor representation learning. Both of these ideas lead to a more biologically plausible algorithm that captures human-like capabilities of transferring and generalizing knowledge from previous tasks to novel, unseen ones. (ii) Plausible inference: I present a pragmatic account of the weak evidence effect, a counterintuitive phenomenon of social cognition that occurs when humans must account for persuasive goals when incorporating evidence from other speakers. This leads to a recursive, Bayesian model that encapsulates how AI systems and their human stakeholders communicate with and understand one another in a way that accounts for the vested interests that each will have. (iii) Plausible evaluation: I introduce a tractable and generalizable measure for cooperative behavior in multi-agent systems that is counterfactually contrastive, contextual, and customizable with respect to different environmental parameters. This measure can be of practical use in distinguishing cases in which collective welfare is achieved through genuine cooperation from cases in which it arises from each agent acting solely in its own self-interest, even though both result in the same outcome. |
URI: | http://arks.princeton.edu/ark:/88435/dsp01k0698b87r |
Type of Material: | Academic dissertations (Ph.D.) |
Language: | en |
Appears in Collections: | Computer Science |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Barnett_princeton_0181D_15004.pdf | | 2.81 MB | Adobe PDF | View/Download |
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.