Entropic stochastic search for expensive, unimodal functions and its application to stochastic gradient algorithms and the optimization of parameterized policies for supply chain planning

Luo, Xiaohe

Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp012r36v183w

Title:	Entropic stochastic search for expensive, unimodal functions and its application to stochastic gradient algorithms and the optimization of parameterized policies for supply chain planning
Authors:	Luo, Xiaohe
Advisors:	Powell, Warren
Contributors:	Operations Research and Financial Engineering Department
Keywords:	Bayesian optimization black-box optimization reinforcement learning stochastic optimization supply chain management
Subjects:	Operations research Applied mathematics
Issue Date:	2023
Publisher:	Princeton, NJ : Princeton University
Abstract:	There are many problems that require optimizing expensive, stochastic functions (such as computationally demanding simulations) by tuning one or a small number of parameters. One application is the determination of stepsizes in stochastic gradient algorithms, where the research literature uniformly uses simple analytical functions which introduce their own tunable parameters. We approach this problem using a stochastic adaptation of Fibonacci search where we assume unimodality to successively revise our belief about the location of the optimum after each function evaluation, providing reasonable estimates of the optimum in as little as six iterations. This idea introduces the possibility of replacing analytical formulas for stepsizes in stochastic gradient algorithms with a stochastic search procedure that represents a form of lookahead policy. These formulas exhibit good asymptotic performance but can work very poorly when optimizing expensive functions which limit the number of iterations. While computationally more expensive than computing an analytical formula, it produces stepsizes that are much more accurate than the standard analytical formulas which also require their own tuning. The result is much faster convergence, and convergence that is more robust across different datasets than simple analytical formulas. We then apply these results to the problem of designing inventory ordering policies in a complex supply chain problem involving the ordering of parts for PV modules for solar panels. In contrast with the massive literature on stylized inventory problems, widely featured in inventory texts today, we address a rich family of inventory problems. We created a variety of settings, from simple stationary problems with no lead times to complex problems with long lead times and nonstationary, rolling forecasts. We then tune and compare a family of inventory policies, from basic (s,S) parameterized policies to full stochastic lookahead policies using parameterized rules for the lookahead policy (a first for this problem class). We show that each of four different classes of policies, with properly tuned parameters, is optimal on at least one of the test problems.
URI:	http://arks.princeton.edu/ark:/88435/dsp012r36v183w
Type of Material:	Academic dissertations (Ph.D.)
Language:	en
Appears in Collections:	Operations Research and Financial Engineering

Files in This Item:

File	Description	Size	Format
Luo_princeton_0181D_14672.pdf		2.68 MB	Adobe PDF	View/Download

Show full item record

Search

Browse