Title: Applications of Statistical Learning to Identify Risk Factors in Student Loan Default and Repayment
Authors: Haile, Elizabeth
Advisors: Wang, Mengdi
Department: Operations Research and Financial Engineering
Certificate Program: Applications of Computing Program
Class Year: 2019
Abstract: The United States is facing a growing student loan crisis. A recent report by the Federal Reserve Bank of New York’s Center for Microeconomic Data put total student loan debt at over $1.4 trillion, making it the second largest source of consumer debt in the United States (Federal Reserve Bank of New York, 2019). Building on the work of Luo et al. (2018) and others, we use statistical learning techniques to determine key institutional and student features associated with higher levels of student loan default and nonrepayment. Our data source is the College Scorecard, first released by the Department of Education in 2015 and most recently updated in October 2018; it provides information on academic programming, admissions, financial aid, student completion, debt, and post-graduate earnings for colleges and universities across the United States from 1996 through 2017. Specifically, we use graphical modeling, spectral clustering, and penalized regression to produce sparse models that provide insight into relationships between attributes and identify a subset of features predictive of default and nonrepayment.
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Operations Research and Financial Engineering, 2000-2020

Files in This Item:
File: HAILE-ELIZABETH-THESIS.pdf
Size: 2.49 MB
Format: Adobe PDF

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.