Improving Accuracy of Counterfactual Estimation for Sales Forecasting Using an Ensemble of ARMA, ANN and BSTS Models

Ogutu, Lencer

Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01ft848t65w

Title:	Improving Accuracy of Counterfactual Estimation for Sales Forecasting Using an Ensemble of ARMA, ANN and BSTS Models
Authors:	Ogutu, Lencer
Advisors:	Fan, Jianqing; Guerzhoy, Michael
Department:	Operations Research and Financial Engineering
Certificate Program:	Center for Statistics and Machine Learning
Class Year:	2020
Abstract:	Counterfactuals have become increasingly standard in causal inference, mostly in fields dealing with quantitative social research. One of the biggest challenges in causal inference is accurately predicting the counterfactual against which causal impact is gauged. A well-developed technique that remains popular for assessing causal impact is difference-in-differences (D-in-D), which assumes a treatment group and a control group whose features are similar except for an intervention applied only to the treatment group. The control group serves as the counterfactual in this case, and causal impact is calculated as the difference between what is observed in the two groups. However, this approach has drawbacks that have necessitated research into other techniques of causal impact analysis. These drawbacks include the expectation that market data follows an ideal randomized design, which is rarely the case (such data typically exhibits a low signal-to-noise ratio); the fact that D-in-D does not account for seasonal variation; and its confounding by the effects of unobserved variables and their interactions. This paper therefore proposes combining Bayesian Structural Time Series (BSTS) models, Artificial Neural Networks (ANN) and Auto Regressive Moving Average (ARMA) models into a single ensemble forecasting model to predict the counterfactual and estimate more accurately the causal impact of an intervention on the metrics of analysis. I fit the three models on a daily sales time series for solar products and use the models to make forecasts in order to calculate their Mean Absolute Percentage Error (MAPE). The analysis finds that the forecasting accuracy of an ensemble model is higher than that of each of the individual models. Additionally, the ensemble constructed as a weighted average of the individual models, with weights determined by a regression on these three models, is more accurate than the ensemble built by a simple average.
URI:	http://arks.princeton.edu/ark:/88435/dsp01ft848t65w
Type of Material:	Princeton University Senior Theses
Language:	en
Appears in Collections:	Operations Research and Financial Engineering, 2000-2023
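
The abstract compares two ways of combining the ARMA, ANN and BSTS forecasts (a simple average and a regression-weighted average) and scores them by MAPE. The following is a minimal sketch of those three ideas, not the thesis's implementation. All function names are hypothetical. It assumes the three forecasts are already available as columns of an array, and it assumes the weights come from an ordinary least-squares fit without an intercept or constraints; the record does not state which regression was used.

```python
import numpy as np


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    if np.any(actual == 0):
        # MAPE divides by the actual values, so zeros are not allowed.
        raise ValueError("MAPE is undefined when an actual value is zero")
    return float(np.mean(np.abs(actual - forecast) / np.abs(actual)) * 100.0)


def simple_average(forecasts):
    """Equal-weight ensemble.

    `forecasts` has shape (n_periods, n_models), one column per model.
    """
    return np.asarray(forecasts, dtype=float).mean(axis=1)


def fit_regression_weights(forecasts, actual):
    """Least-squares weights from regressing actuals on the model forecasts.

    Assumption: no intercept, and weights are not forced to be
    non-negative or to sum to one.
    """
    forecasts = np.asarray(forecasts, dtype=float)
    actual = np.asarray(actual, dtype=float)
    weights, *_ = np.linalg.lstsq(forecasts, actual, rcond=None)
    return weights


def weighted_ensemble(forecasts, weights):
    """Weighted combination of the model forecasts."""
    return np.asarray(forecasts, dtype=float) @ np.asarray(weights, dtype=float)
```

In practice the weights would be fitted on one period and both ensembles scored with `mape` on a later, held-out period. Least squares minimizes squared error rather than MAPE, so a lower MAPE for the weighted ensemble is an empirical finding, as reported in the abstract, and not something the method guarantees.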

Files in This Item:

File	Description	Size	Format
OGUTU-LENCER-THESIS.pdf		3.16 MB	Adobe PDF
