Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp01sb397c614
Title: | Adventures in Synthesis: An Empirical Survey in Using Large Language Models to Train Themselves |
Authors: | McKenzie, Archie |
Advisors: | Kernighan, Brian |
Department: | Computer Science |
Class Year: | 2024 |
Abstract: | Self-improving artificial intelligence (AI) systems are the stuff of start-up dreams and sci-fi nightmares. This paper surveys the use of synthetic data in large language models (LLMs), currently the most generally capable artificial intelligence systems. It discusses practical and theoretical problems with fine-tuning a model on synthetic data, particularly for skills like translation. Finally, it reviews the state of the AI industry in the early 2020s and considers trade-offs between general and specialized AI systems. |
URI: | http://arks.princeton.edu/ark:/88435/dsp01sb397c614 |
Type of Material: | Princeton University Senior Theses |
Language: | en |
Appears in Collections: | Computer Science, 1987-2024 |
Files in This Item:
File | Size | Format |
---|---|---|
MCKENZIE-ARCHIE-THESIS.pdf | 933.7 kB | Adobe PDF |
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.