Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp01w3763b12c
Title: | Enhancing Automated Theorem Proving in Coq with Specialized Large Language Models and Tree of Thoughts |
Authors: | Maloney, Pierce |
Advisors: | Kincaid, Zachary |
Department: | Computer Science |
Certificate Program: | Center for Statistics and Machine Learning |
Class Year: | 2024 |
Abstract: | This thesis explores the enhancement of formal theorem proving through the integration of Large Language Models (LLMs) within the Coq proof assistant environment. This work developed and evaluated a fine-tuned LLM capable of generating Coq code, trained to learn the relationship between tactics and proof states. It applied a Tree of Thoughts (ToT) reasoning framework to facilitate the exploration of possible proof paths, combined with the feedback from interaction with the Coq proof assistant at each step. The evaluation was run on three models – Llemma-7b, Llemma-7b fine-tuned, and GPT-4 – across 572 proofs from the CoqGym validation dataset, using GPT-4 as the state evaluator. A* and greedy search strategies over the ToT were compared to see the relative strengths of each. The results demonstrate that the A* search strategy outperforms the greedy approach as the proof tree grows, suggesting that the strategy's exploratory nature leads to more effective proof completion over time. Additionally, the fine-tuned model performed the best on a large number of proofs, yet the results are not substantial enough to be conclusive. Moreover, though the fine-tuned model had the highest overall proof success rate, the base Llemma} model generated legal tactics more frequently than both the fine-tuned model and GPT-4. This project highlights the efficacy of combining specialized LLMs and search strategies in enhancing the performance of automated theorem-proving systems. Code available on \href{https://github.com/piercemaloney/senior-thesis}{GitHub} and dataset and models at \href{https://huggingface.co/piercemaloney}{Hugging Face}. |
URI: | http://arks.princeton.edu/ark:/88435/dsp01w3763b12c |
Type of Material: | Princeton University Senior Theses |
Language: | en |
Appears in Collections: | Computer Science, 1987-2024 |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
MALONEY-PIERCE-THESIS.pdf | 955.65 kB | Adobe PDF | Request a copy |
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.