Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp019306t259x
Title: A Data Mining and Machine Learning Approach to Private Equity Replication in Public Markets
Authors: Isichei, Ifeanyi
Advisors: Scheinerman, Daniel
Department: Operations Research and Financial Engineering
Certificate Program: Applications of Computing Program
Class Year: 2023
Abstract: The private equity asset class has generated attractive returns in recent decades and has consistently outperformed the public markets, thus becoming a key element in the portfolios of large institutional investors. However, the illiquidity of the asset class and its inaccessibility to smaller, resource-constrained investors make it intuitively appealing to construct a public equity portfolio that generates private equity returns. This thesis attempts to do so, using data mining and machine learning methods to construct portfolios of small value equities that achieve top-end private equity performance. We first employ a data mining approach to systematically construct a universe of over 20,000 fundamental signals from financial statements. We then select the subset of these signals with the most predictive power and use these as inputs for lasso regression, random forests, and extreme gradient boosting models. We use these models to construct ranking systems of stocks, and find that the extreme gradient boosting and random forests models have particularly strong predictive power. Annual equal-weighted portfolios of the top 25 stocks from the extreme gradient boosting and random forests models generate annualized returns of 24.5% and 21.0% respectively from July 1982 to June 2022, greatly exceeding broader market returns and private equity benchmarks over the same time frame. Our factor and benchmark analyses confirm this outperformance. Our top portfolios also generate attractive downside-risk-adjusted returns, with Sortino ratios of 1.43 and 1.19 respectively. Thus, given this strong performance, and the fact that neither a data mining approach nor the selected machine learning models have been used in prior private equity replication literature, our results provide a meaningful contribution to the existing corpus.
URI: http://arks.princeton.edu/ark:/88435/dsp019306t259x
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Operations Research and Financial Engineering, 2000-2023
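
As a rough illustration of two quantities the abstract mentions, the sketch below shows how an equal-weighted top-k portfolio return can be computed from model scores, and one common textbook form of the Sortino ratio (mean return in excess of a target, divided by downside deviation). The function names, the dictionary-based inputs, the zero target return, and the sample numbers are all assumptions made for this illustration. They are not taken from the thesis, which may define, annualize, or compute these quantities differently.

```python
import math


def top_k_equal_weight_return(scores, realized_returns, k=25):
    """Average realized return of the k highest-scored tickers.

    `scores` and `realized_returns` are hypothetical dicts keyed by ticker.
    """
    if not scores or k <= 0:
        raise ValueError("need at least one scored ticker and k > 0")
    # Rank tickers by model score, highest first, and keep the top k.
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    # Equal weighting means the portfolio return is the simple average.
    return sum(realized_returns[t] for t in ranked) / len(ranked)


def sortino_ratio(returns, target=0.0):
    """Mean excess return over `target` divided by downside deviation.

    Downside deviation here is the root mean square of the shortfalls
    below the target, averaged over all periods (an assumed convention).
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    excess = [r - target for r in returns]
    mean_excess = sum(excess) / len(excess)
    # Only returns below the target contribute to downside risk.
    downside_sq = [min(e, 0.0) ** 2 for e in excess]
    downside_dev = math.sqrt(sum(downside_sq) / len(excess))
    if downside_dev == 0.0:
        raise ValueError("no returns below target; ratio is undefined")
    return mean_excess / downside_dev


if __name__ == "__main__":
    # Made-up scores and returns, purely for demonstration.
    scores = {"A": 0.9, "B": 0.1, "C": 0.5}
    realized = {"A": 0.10, "B": -0.20, "C": 0.30}
    print(top_k_equal_weight_return(scores, realized, k=2))
    print(sortino_ratio([0.10, -0.05, 0.20, -0.10]))
```

Unlike the Sharpe ratio, this measure penalizes only returns that fall below the target, which is why the abstract describes it as downside-risk-adjusted.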

Files in This Item:
File: ISICHEI-IFEANYI-THESIS.pdf
Size: 1.17 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.