Title: Extending Classical Deep Reinforcement Learning Techniques for use in Multi-Agent Systems

Abstract: Multi-agent reinforcement learning is an attractive and challenging prospect. Multi-agent settings are non-stationary by nature, and multi-agent contributions to reward introduce an ambiguity in credit assignment not present in single-agent learning. In this thesis, we start by examining policy-gradient reinforcement learning techniques in a single-agent setting. We then show why these techniques fail in a multi-agent setting and suggest improvements to the algorithms to address these problems. Experiments are run on simulations of cyclists, with the aim of agents learning to work together to cycle efficiently as a pack. The improvements suggested include: adding a varying normaliser to deal with state-distribution drift; using a value function that takes as input the actions of other agents to deal with non-stationarity; and an automatic tuning algorithm for learning hyperparameters. We find that our techniques improve on the traditional algorithms, and we suggest a more complex environment and further refinements to our algorithms to better demonstrate the advantages of the approach.
Type of Material: Princeton University Senior Theses

Appears in Collections: Electrical Engineering, 1932-2020
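The second improvement listed in the abstract — a value function that also takes the actions of the other agents as input — can be illustrated with a minimal sketch. The tabular setup, class name, and toy cycling rewards below are illustrative assumptions, not the thesis's implementation: the point is only that conditioning the value estimate on the joint action makes the environment look stationary from the critic's perspective, even while the other agents' policies change.

```python
from collections import defaultdict

class CentralizedCritic:
    """Tabular value estimate V(s, a_1, ..., a_n) over joint actions,
    updated by a running average toward observed rewards."""

    def __init__(self, lr=0.1):
        self.lr = lr
        self.values = defaultdict(float)  # (state, joint_action) -> value

    def update(self, state, joint_action, reward):
        key = (state, tuple(joint_action))
        # Move the estimate a fraction lr of the way toward the new reward.
        self.values[key] += self.lr * (reward - self.values[key])
        return self.values[key]

    def value(self, state, joint_action):
        return self.values[(state, tuple(joint_action))]

# Hypothetical pack-cycling example: coordinating (one cyclist leads, the
# other drafts) yields more reward than both cyclists leading at once.
critic = CentralizedCritic(lr=0.5)
for _ in range(20):
    critic.update("straight", ("lead", "draft"), reward=1.0)
    critic.update("straight", ("lead", "lead"), reward=0.2)
```

Because the table is keyed on the joint action, the same state can hold distinct, stable value estimates for each combination of the agents' choices, which is what breaks the non-stationarity a single-agent critic would see.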
Files in This Item:
MATTHEWS-OLIVER-THESIS.pdf | 9.17 MB | Adobe PDF | Request a copy
Items in DataSpace are protected by copyright, with all rights reserved, unless otherwise indicated.