Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01xd07gw11r
Title: A Stock Similarity REST API Capturing Time-Series Similarity Through Machine Learning And Data Mining Techniques
Authors: Castaneda, Emanuel
Advisors: Singh, Jaswinder Pal
Department: Computer Science
Class Year: 2016
Abstract: We present the first free and publicly available REST API service to provide stock similarity recommendations. Our similarity engine lets users query any stock in the S&P 500. While much research exists on similarity search over time-series data, many if not all advances, particularly in finance, are developed as proprietary technology and are therefore unavailable to the general public. Our API gives any investor a tool for finding similar stocks using machine learning algorithms. Stock similarity search has various uses; our results section focuses on its potential as a filter for candidate stock pairs in pairs trading. We first identify the key statistics used to define stock similarity. We then use data from Yahoo Finance to build Stock Vectors containing these statistics. Our similarity engine uses an Expectation Maximization clustering algorithm and a K-nearest neighbor algorithm to serve its various queries. For a given query stock, the K-nearest neighbor algorithm solves the indexing problem by calculating the Euclidean distance between the query’s Stock Vector and that of each candidate stock. Similarity metrics (weighted correlation and a distance measure), together with each stock’s cluster association, are used to identify potential stock pairs. Our results demonstrate the usefulness of our API as a tool for investors searching for similar stocks. Specifically, we show its efficacy as a recommendation system for stock pairs trading that produces high levels of risk-adjusted return.
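Illustrative note: the nearest-neighbor lookup described in the abstract can be sketched roughly as follows. This is a minimal sketch and not the thesis implementation. The tickers, the three-component vectors, and the names STOCK_VECTORS, euclidean, and nearest_stocks are all hypothetical; the record does not say which statistics make up a Stock Vector or whether they are normalized.

```python
import math

# Hypothetical Stock Vectors: tickers and values are invented for illustration.
# The thesis builds its vectors from Yahoo Finance statistics not listed here.
STOCK_VECTORS = {
    "AAA": [0.10, 1.20, 0.30],
    "BBB": [0.12, 1.10, 0.28],
    "CCC": [0.90, 0.40, 0.75],
    "DDD": [0.11, 1.25, 0.33],
}


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_stocks(query, vectors, k):
    """Return the k tickers whose vectors are closest to the query's vector."""
    q = vectors[query]
    # Brute-force scan over every other stock; sorted by distance, then ticker.
    distances = sorted(
        (euclidean(q, v), ticker)
        for ticker, v in vectors.items()
        if ticker != query
    )
    return [ticker for _, ticker in distances[:k]]


print(nearest_stocks("AAA", STOCK_VECTORS, 2))
```

A brute-force scan is assumed here because it is the simplest form of the technique and is cheap for a universe the size of the S&P 500; the record does not state how the thesis indexes its vectors.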
Extent: 48 pages
URI: http://arks.princeton.edu/ark:/88435/dsp01xd07gw11r
Type of Material: Princeton University Senior Theses
Language: en_US
Appears in Collections: Computer Science, 1987-2023

Files in This Item:
File: Castaneda_Emanuel_thesis.pdf
Size: 788.37 kB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.