Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp0147429d27x
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Engelhardt, Barbara E | |
dc.contributor.author | Martinet, Guillaume Gaetan | |
dc.contributor.other | Operations Research and Financial Engineering Department | |
dc.date.accessioned | 2022-02-11T21:31:04Z | - |
dc.date.available | 2022-02-11T21:31:04Z | - |
dc.date.created | 2021-01-01 | |
dc.date.issued | 2021 | |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp0147429d27x | - |
dc.description.abstract | This thesis presents several new ideas about transfer learning, treatment effect estimation and causal discovery. Our contributions are in part of a theoretical nature, in part algorithmic, but all rely in their development on a common property: the existence of an invariant mechanism, either between source and target distributions – in the case of transfer learning – or between observational and interventional data – in the case of causal inference. We clarify these notions below. The first chapter introduces a new minimax analysis of transfer learning in a nonparametric classification setting. More precisely, we study the problem of learning a classifier meant to generalize well under a target data distribution Q while most of the labeled data is from a different but related source distribution P. The invariant mechanism is the conditional distribution of the label given the covariates, which remains identical between P and Q – this assumption is often termed covariate-shift. We derive a new notion – the transfer exponent γ – that accurately characterizes the difficulty of transfer between P and Q in terms of achievable minimax rates. We also show that a recent semi-supervised k-NN algorithm can be refined to adapt to unknown γ, while requesting labels of target data only when beneficial. Then, we show in the second chapter that balance-weighting approaches to treatment effect estimation can be restated as discrepancy minimization problems under the covariate-shift assumption. While balance-weighting methods have mainly focused on binary treatments, such considerations offer a new perspective on how to generalize these approaches to continuous and multivariate treatments. In the last chapter, we address the more qualitative problem of recovering the direct causes of a target variable using data from different experimental settings. We show that a recent and already influential method, called invariant causal prediction (ICP), admits a much more efficient formulation. Our reformulation consists in performing a series of nonparametric tests based on the minimization of a new loss function – named Wasserstein variance – that we derived from optimal transport theory; while ICP’s runtime scales exponentially in the number of variables, our approach only scales linearly. We establish our method’s performance both theoretically and empirically. | |
dc.format.mimetype | application/pdf | |
dc.language.iso | en | |
dc.publisher | Princeton, NJ : Princeton University | |
dc.relation.isformatof | The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: http://catalog.princeton.edu | |
dc.subject | Causal Discovery | |
dc.subject | Causal Inference | |
dc.subject | Machine Learning | |
dc.subject | Transfer Learning | |
dc.subject.classification | Statistics | |
dc.title | Invariant Mechanisms in Transfer Learning and Causal Inference : Some Theoretical Perspectives and Algorithms. | |
dc.type | Academic dissertations (Ph.D.) | |
pu.date.classyear | 2021 | |
pu.department | Operations Research and Financial Engineering | |
Appears in Collections: | Operations Research and Financial Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Martinet_princeton_0181D_13929.pdf | | 4.52 MB | Adobe PDF | View/Download |
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.