Title: Exploring LLM Insight Discovery: InsightBench and MCMC Prompting
Authors: Chhabria, Sabhya
Advisors: Chen, Danqi
Department: Computer Science
Class Year: 2024
Publisher: Princeton, NJ : Princeton University
Abstract: Large Language Models (LLMs) have become increasingly adept at existing benchmarks that evaluate their abilities on traditional tasks such as language understanding, reasoning, programming, and math. In this work, we introduce InsightBench, a new benchmark that evaluates LLMs on insight problems: problems where the path to a solution is not immediately obvious, which require some form of creative thinking, and whose answer usually strikes in an "aha" moment. We evaluate several popular LLMs (both open- and closed-source) of varying sizes on InsightBench. Our results establish that insight problems (creative problem solving) remain a critical failure point of language models. We further attempt to bolster LLM performance on certain InsightBench tasks by using Markov Chain Monte Carlo (MCMC) prompting schemes, which allow LLMs to explore large search spaces to solve insight problems.
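Note: the thesis itself is not reproduced on this page, so the following is only a rough, self-contained sketch of what a Metropolis-Hastings-style search over candidate solutions might look like; it is not the thesis's actual prompting scheme. The `llm_propose` and `score` functions are hypothetical stand-ins (in a real pipeline an LLM would generate proposals and a verifier or reward model would score them), and the toy target-phrase task and temperature value are assumptions made for illustration.

```python
import math
import random

# Toy "insight problem": recover a hidden phrase. Both functions below are
# hypothetical stand-ins; a real MCMC prompting scheme would have an LLM
# propose edits and a verifier score candidate solutions.
TARGET = "aha moment"
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def llm_propose(candidate: str) -> str:
    """Stand-in for an LLM proposal: mutate one character of the candidate."""
    i = random.randrange(len(candidate))
    return candidate[:i] + random.choice(ALPHABET) + candidate[i + 1:]

def score(candidate: str) -> float:
    """Stand-in for a solution-quality score (higher is better)."""
    return float(sum(a == b for a, b in zip(candidate, TARGET)))

def mcmc_search(steps: int = 20000, temperature: float = 0.5) -> str:
    """Metropolis-Hastings over candidate solutions.

    A proposal is accepted with probability min(1, exp((s' - s) / T)),
    so the chain can occasionally accept worse candidates and escape
    local optima instead of greedily hill-climbing.
    """
    current = "".join(random.choice(ALPHABET) for _ in range(len(TARGET)))
    current_score = score(current)
    for _ in range(steps):
        proposal = llm_propose(current)
        proposal_score = score(proposal)
        delta = proposal_score - current_score
        if delta >= 0 or random.random() < math.exp(delta / temperature):
            current, current_score = proposal, proposal_score
    return current

if __name__ == "__main__":
    print(mcmc_search())
```

The acceptance rule is the essential ingredient: unlike greedy search, the chain sometimes moves to lower-scoring candidates, which is what lets an MCMC-style scheme explore a large search space rather than stalling at the first local optimum.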
URI: http://arks.princeton.edu/ark:/88435/dsp01mg74qq49s
Type of Material: Academic dissertations (M.S.E.)
Language: en
Appears in Collections: Computer Science, 2023

Files in This Item:
File: Chhabria_princeton_0181G_15028.pdf
Size: 2.27 MB
Format: Adobe PDF


Items in DataSpace are protected by copyright, with all rights reserved, unless otherwise indicated.