Title: | Exploring LLM Insight Discovery: InsightBench and MCMC Prompting |
Authors: | Chhabria, Sabhya |
Advisors: | Chen, Danqi |
Department: | Computer Science |
Class Year: | 2024 |
Publisher: | Princeton, NJ : Princeton University |
Abstract: | Large Language Models (LLMs) have become increasingly adept at existing benchmarks that evaluate their abilities on traditional tasks such as language understanding, reasoning, programming, and math. In this work, we introduce InsightBench, a new benchmark that evaluates LLMs on insight problems. Insight problems are problems where the path to a solution is not immediately obvious; they require some form of creative thinking, and the answer usually strikes in an "aha" moment. We evaluate several popular LLMs (both open- and closed-source) of varying sizes on InsightBench. Our results establish that insight problems, and creative problem solving more broadly, remain a critical failure point for language models. We further attempt to bolster LLM performance on certain InsightBench tasks using Markov Chain Monte Carlo (MCMC) prompting schemes, which allow LLMs to explore large search spaces when solving insight problems (a minimal sketch of such a scheme appears after this record). |
URI: | http://arks.princeton.edu/ark:/88435/dsp01mg74qq49s |
Type of Material: | Academic dissertations (M.S.E.) |
Language: | en |
Appears in Collections: | Computer Science, 2023 |
Files in This Item:
File | Description | Size | Format
---|---|---|---
Chhabria_princeton_0181G_15028.pdf | | 2.27 MB | Adobe PDF
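The record above only names the technique, so the following is a minimal, hypothetical sketch of the kind of Metropolis-Hastings search loop that MCMC prompting schemes build on; it is not the thesis's implementation. The `propose` and `score` functions are placeholders: in the thesis's setting, `propose` would be an LLM call that mutates the current candidate solution and `score` a verifier for the insight problem, while here a random character mutation and a toy string-matching target stand in so the sketch runs standalone.

```python
import math
import random
import string

TARGET = "insight"  # toy stand-in for a verifiable solution

def propose(candidate: str) -> str:
    # Placeholder for an LLM proposal step: mutate one character.
    i = random.randrange(len(candidate))
    c = random.choice(string.ascii_lowercase)
    return candidate[:i] + c + candidate[i + 1:]

def score(candidate: str) -> float:
    # Placeholder verifier: number of characters matching the target.
    return sum(a == b for a, b in zip(candidate, TARGET))

def mcmc_search(initial: str, steps: int = 5000, temperature: float = 0.5) -> str:
    """Metropolis-Hastings style search over candidate solutions."""
    current, cur_s = initial, score(initial)
    best, best_s = current, cur_s
    for _ in range(steps):
        cand = propose(current)
        cand_s = score(cand)
        # Metropolis acceptance: always accept improvements; accept
        # worse candidates with probability exp(delta / T), which lets
        # the search escape local optima in a large search space.
        if random.random() < math.exp(min(0.0, (cand_s - cur_s) / temperature)):
            current, cur_s = cand, cand_s
        if cur_s > best_s:
            best, best_s = current, cur_s
    return best

print(mcmc_search("a" * len(TARGET)))
```

Because the proposal distribution here is symmetric, the plain Metropolis acceptance rule suffices; an LLM-driven proposal would generally be asymmetric and would need the full Metropolis-Hastings correction or a heuristic approximation of it.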