Please use this identifier to cite or link to this item:
|Title:||New Techniques for Learning and Inference in Bayesian Models|
|Contributors:||Computer Science Department|
method of moments
|Publisher:||Princeton, NJ : Princeton University|
|Abstract:||A common theme in machine learning is succinct modeling of distributions over large domains. Probabilistic graphical models are one of the most expressive frameworks for doing this. The two major tasks involving graphical models are learning and inference. Learning is the task of calculating the best fit graphical model from raw data, while inference is the task of answering probabilistic queries for a known graphical model (for example what is the marginal distribution of one of the variables or what is the distribution of a subset of variables, after conditioning on the values of some other subset of variables). Learning can be thought of as finding a graphical model that “explains” the raw data, while the inference queries extract the “knowledge” the graphical model contains. This thesis introduces new provable techniques for performing both of these tasks, in the context of both latent-variable models -- in which a portion of the variables in the graphical model are not observed, as well as fully-observable undirected graphical models (Markov Random Fields). Chapters 2 and 3 will focus on learning latent-variable models, while Chapter 4 will focus on inference in Markov Random Fields. In Chapter 2, I will contribute the first provable results for analyzing variational Bayes: a family of alternating-minimization style algorithms which is very popular in practice for learning latent-variable models. Despite it's popularity with practitioners, the only theoretical guarantees prior to this work concerned convergence to local minima. We will prove that under reasonable assumptions, in the context of topic models, these algorithms will converge to the global minimum. Subsequently, in Chapter 3, we will use the method-of-moments along with new techniques in tensor decomposition and constrained matrix factorization to derive algorithms for learning noisy-OR networks -- the textbook example of a probabilistic model for causal relationships. Importantly, these techniques were only applicable to linear latent-variable models -- which noisy-OR is not. In Chapter 4, I will contribute a new understanding of a class of variational methods for calculating partition functions in Markov Random Fields. The key technical ingredient is a connection to convex programming hierarchies -- a recent area of interest in combinatorial optimization, along with approximations of the entropy of a distribution based on low-order moment information.|
|Alternate format:||The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu|
|Type of Material:||Academic dissertations (Ph.D.)|
|Appears in Collections:||Computer Science|
Files in This Item:
There are no files associated with this item.
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.