|Title:||Essays on Data Integration in the Social Sciences|
|Publisher:||Princeton, NJ : Princeton University|
|Abstract:||This dissertation is a collection of three essays that deal with the challenges social scientists face when trying to integrate information from multiple sources. The first chapter focuses on technical aspects of data integration. Specifically, when a unique identifier that unambiguously links records is not available, merging datasets can be a difficult task. Probabilistic record linkage (PRL) aims to solve this problem by providing a principled framework. I propose an active learning algorithm for PRL, which incorporates human judgment. Using data where a unique identifier is available for validation, I find that the proposed method bolsters the accuracy of the merging process. In addition, I show that the proposed method can recover estimates that are indistinguishable from those obtained from a more extensive and time-consuming manual review. The second chapter uses data from multiple sources to study the impact of electoral rules on coalition building. To overcome endogeneity concerns, I exploit an arbitrary change in electoral rules across municipalities in Brazil and study its impact on pre-electoral coalitions (PECs). I find that in municipalities where elections are conducted using a dual-ballot system, the median number of parties in a given PEC is smaller than in those municipalities that use a single-ballot system. In addition, this reduction in the size of PECs is associated with an increase in the total number of candidates. These findings are consistent with theories that emphasize the opportunities for strategic behavior under a dual-ballot system. Finally, the third chapter (co-authored with Svetlana Kosterina) combines information from multiple sources to study ethnic voting. We show that local ethnic geography affects ethnic voting by incentivizing voters of an ethnicity that finds itself in the minority to misrepresent their preferences. We provide empirical evidence for our claim using data from the Afrobarometer survey in Ghana to measure voters' beliefs that they are likely to face adverse consequences for expressing their political preferences. Using geocoded data from Afrobarometer, as well as data from the Ghana Demographic and Health Survey, we find no evidence for local public goods provision as an alternative mechanism.|
|Alternate format:||The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu|
|Type of Material:||Academic dissertations (Ph.D.)|
|Appears in Collections:||Politics|
Files in This Item:
This content is embargoed until 2021-04-15. For more information contact the Mudd Manuscript Library.
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.