Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01sb397c614
Title: Adventures in Synthesis: An Empirical Survey in Using Large Language Models to Train Themselves
Authors: McKenzie, Archie
Advisors: Kernighan, Brian
Department: Computer Science
Class Year: 2024
Abstract: Self-improving artificial intelligence (AI) systems are the stuff of start-up dreams and sci-fi nightmares. This paper surveys the use of synthetic data in large language models (LLMs), currently the most generally capable artificial intelligence systems. It discusses practical and theoretical problems with fine-tuning a model on synthetic data, particularly for skills like translation. Finally, it reviews the state of the AI industry in the early 2020s and considers trade-offs between general and specialized AI systems.
URI: http://arks.princeton.edu/ark:/88435/dsp01sb397c614
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Computer Science, 1987-2024

Files in This Item:
File: MCKENZIE-ARCHIE-THESIS.pdf
Size: 933.7 kB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.