Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp0112579s300
Title: ESSAYS ON MEASUREMENT ERROR IN FINITE-VALUED VARIABLES
Authors: Gawade, Nandita Gopalkrishna
Advisors: Honore, Bo E
Contributors: Economics Department
Subjects: Economics
Issue Date: 2012
Publisher: Princeton, NJ : Princeton University
Abstract: This dissertation examines the nonparametric identification of parameters of interest in measurement error models when the mismeasured variable is finite-valued. The maintained assumption is that two or more observed variables are independent conditional on the unobserved variable that is misclassified. This conditional independence assumption is the generalization to a nonparametric setting of a common exclusion restriction used in the linear errors-in-variables (EIV) model. For example, the classical EIV model in a linear regression context assumes that measurement error is uncorrelated with the true regressor and with the error in the regression equation. The second assumption means that the measurement error is uncorrelated with the outcome once we have controlled for the true regressor. In a nonlinear model with nonadditive errors, we would require that the regression measurement error and the outcome are independent conditional on the true regressor. The conditional independence assumption, although strong, is common in practice. The popularity of this assumption is in part due to the fact that, when there are three or more observed variables, conditional independence leads to nonparametric point identification of the joint distribution of observable variables and the latent variable. In this dissertation, we use the conditional independence assumption to frame the problem as a finite mixture model in which the components are product distributions. The first chapter is concerned with partial identification in measurement error models when the mismeasured variable is discrete.
In particular, we are interested in the joint distribution of an outcome Y and a regressor X*, but we observe the joint distribution of Y and a surrogate X. We explore the implications of assuming nondifferential measurement error or conditional independence of the outcome and the surrogate given the true regressor. The conditional independence assumption results in a factoring of the joint distribution of the observables. The chief observation of this paper is that, when all variables are discrete, this is a problem of matrix factorization. In particular, we obtain a complete description of the identified set when both X and X* are binary and improve on the bounds existing in the literature. In contrast, the identified set can be very complicated when X* can take three or more values. In the second chapter, we examine the identified set for the nonparametric regression of continuous Y on misclassified binary X*. The results obtained in Chapter 1 continue to hold even when Y is continuously distributed. The identified set can be completely characterized by two scalar parameters, namely the infimum and supremum of the ratio of observed densities of Y conditional on the two values of X. We also consider a simple strategy for estimating these two parameters, and examine the properties of this strategy using a simulation study. The third chapter is a note on the point identification obtained when we add a third observed variable to the measurement error model described in Chapter 1. We are now concerned with the problem of decomposing a three-dimensional tensor. We show that the point identification obtained in this case is related to the uniqueness of tensor decompositions, first proved by Kruskal (1977). In the final chapter, we observe that a K-variate finite mixture of product distributions with R components can be represented by the stochastic factorization of an order-K tensor into R rank-1 tensors. 
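The tensor representation just described can be sketched in a few lines. The example below is only an illustration under stated assumptions: the weights, the component distributions, and the helper names (`mixture_tensor`, `flatten`, `matrix_rank`) are hypothetical and do not come from the dissertation. It builds the probability array of a 3-variate mixture with R = 2 product components and checks that the matrix rank of each flattening does not exceed R, which must hold because every rank-1 tensor flattens to a rank-1 matrix.

```python
from fractions import Fraction as F
from itertools import product


def mixture_tensor(weights, components):
    """Return P[idx] = sum_r weights[r] * prod_k components[r][k][idx[k]]."""
    dims = [len(margin) for margin in components[0]]
    tensor = {}
    for idx in product(*(range(d) for d in dims)):
        total = F(0)
        for w, margins in zip(weights, components):
            term = w
            for margin, i in zip(margins, idx):
                term *= margin[i]
            total += term
        tensor[idx] = total
    return tensor, dims


def flatten(tensor, dims, mode):
    """Unfold the tensor: rows indexed by `mode`, columns by all other indices."""
    other = [range(d) for k, d in enumerate(dims) if k != mode]
    rows = []
    for i in range(dims[mode]):
        row = []
        for rest in product(*other):
            idx = list(rest)
            idx.insert(mode, i)
            row.append(tensor[tuple(idx)])
        rows.append(row)
    return rows


def matrix_rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


# Hypothetical mixture: R = 2 components, K = 3 variables with 3 values each.
weights = [F(2, 5), F(3, 5)]
components = [
    ([F(1, 2), F(1, 4), F(1, 4)], [F(1, 3), F(1, 3), F(1, 3)], [F(1, 5), F(2, 5), F(2, 5)]),
    ([F(1, 10), F(3, 10), F(6, 10)], [F(1, 2), F(1, 4), F(1, 4)], [F(3, 5), F(1, 5), F(1, 5)]),
]

tensor, dims = mixture_tensor(weights, components)
ranks = [matrix_rank(flatten(tensor, dims, mode)) for mode in range(len(dims))]
print(ranks)
```

Exact rational arithmetic is used here so that the rank computation is not affected by floating-point tolerance; with real data a numerical rank with an explicit tolerance would be needed instead.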
When the number of components R is not known, an object of interest is the smallest number of components that is consistent with the observed distribution. We show that this number is equal to the nonnegative rank of the order-K tensor representing the observed distribution. We find lower bounds for the nonnegative tensor rank that are based on the matrix rank of flattened versions of the tensor. In general, these bounds are not sharp. We obtain sufficient conditions under which these bounds are attained, and provide examples of probability arrays where these bounds are strictly lower than the number of components.
URI: http://arks.princeton.edu/ark:/88435/dsp0112579s300
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog.
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Economics
