Title: Lookahead Approximations for Online Learning with Nonlinear Parametric Belief Models
Authors: Han, Weidong
Advisors: Powell, Warren B
Contributors: Operations Research and Financial Engineering Department
Keywords: Advertisement auctions
Dynamic programming
Multi-armed bandits
Online learning
Optimal learning
Value of information
Subjects: Operations research
Issue Date: 2019
Publisher: Princeton, NJ : Princeton University
Abstract: We consider sequential online learning problems where the response surface is described by a nonlinear parametric model. We adopt a sampled belief model, which we refer to as a discrete prior. We propose multi-period lookahead policies to overcome the non-concavity in the value of information. For infinite-horizon problems with discounted cumulative rewards, we prove asymptotic convergence properties under the proposed policies. For finite-horizon problems with undiscounted rewards, we analyze the proposed policies through empirical studies in three different settings: a health setting where we make medical decisions to maximize health care response over time, a dynamic pricing setting where we make pricing decisions to maximize cumulative revenue, and a clinical pharmacology setting where we make dosage controls to minimize the deviation between actual and target effects. We also apply the modelling framework to a real-world bidding problem in online advertisement auctions and formulate it as a finite-horizon state-dependent learning problem, where we must maximize ad clicks while learning from noisy responses under a budget constraint. We demonstrate that the multi-period lookahead policies perform competitively against other state-of-the-art policies.
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog:
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Operations Research and Financial Engineering

Files in This Item:
File: Han_princeton_0181D_12996.pdf
Size: 2.46 MB
Format: Adobe PDF

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.