Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp017d278w81g
Title: Rethinking the Science of Statistical Privacy
Authors: Liu, Changchang
Advisors: Mittal, Prateek
Contributors: Electrical Engineering Department
Keywords: Auxiliary Information
Differential Privacy
Statistical Privacy
Subjects: Computer science
Statistics
Issue Date: 2019
Publisher: Princeton, NJ : Princeton University
Abstract: Increasing amounts of data, such as social network, mobility, business, and medical data, are shared or made public to enable real-world applications. Such data is likely to contain sensitive information and thus needs to be obfuscated prior to release to protect privacy. However, existing statistical data privacy mechanisms in the security community have several weaknesses: 1) they are limited to protecting sensitive information in static scenarios and cannot be generally applied to accommodate temporal dynamics; with the rapid development of data science, large amounts of sensitive data such as personal social relationships are becoming public, making the privacy of a time series of data increasingly challenging to protect; 2) these privacy mechanisms do not explicitly capture correlations, leaving open the possibility of inference attacks; in many real-world scenarios, dependence/correlation between data tuples occurs naturally in datasets due to social, behavioral, and genetic interactions between users; 3) there are very few practical guidelines on how to apply existing statistical privacy notions in practice, and a key challenge is how to set appropriate values for the privacy parameters. In this thesis, we aim to overcome these weaknesses to provide privacy guarantees for dynamic data structures and dependent (correlated) data structures. We also aim to discover useful and interpretable guidelines for selecting proper parameter values in state-of-the-art privacy-preserving frameworks. Furthermore, we investigate how auxiliary information -- in the form of a prior distribution over the database and correlation across records and time -- can influence the proper choice of the privacy parameters. 
Specifically, we 1) first propose the design of a privacy-preserving system called LinkMirage, which mediates access to dynamic social relationships in social networks while effectively supporting social graph-based data analytics; 2) explicitly incorporate the structural properties of data into current differential privacy metrics and mechanisms, to enable privacy-preserving data analytics for dependent/correlated data; and 3) finally provide a quantitative analysis of how hypothesis testing can guide the choice of the privacy parameters in an interpretable manner for differential privacy and other statistical privacy frameworks. Overall, our work aims to place the field of statistical data privacy on a firm analytic foundation that is coupled with the design of practical systems.
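To make the role of the privacy parameter concrete: in differential privacy, a common baseline is the Laplace mechanism, which perturbs a query answer with noise scaled to the query's sensitivity divided by the privacy parameter epsilon -- smaller epsilon means stronger privacy but noisier answers, which is why choosing epsilon well matters. The following is a minimal sketch of that standard mechanism, not code from the thesis; the function name and parameter choices are illustrative.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Release true_value with Laplace noise of scale sensitivity/epsilon.

    A smaller epsilon yields a larger noise scale, i.e. stronger privacy
    but lower accuracy of the released statistic.
    """
    rng = rng if rng is not None else np.random.default_rng()
    scale = sensitivity / epsilon
    return true_value + rng.laplace(loc=0.0, scale=scale)

# Example: a counting query has sensitivity 1 (adding or removing one
# record changes the count by at most 1); release it under epsilon = 0.5.
noisy_count = laplace_mechanism(true_value=1000.0, sensitivity=1.0, epsilon=0.5)
```

The thesis's question of parameter selection is visible even in this sketch: nothing in the mechanism itself tells the analyst what epsilon should be, which is the gap the hypothesis-testing analysis in contribution 3) aims to fill.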
URI: http://arks.princeton.edu/ark:/88435/dsp017d278w81g
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Electrical Engineering

Files in This Item:
File: Liu_princeton_0181D_12887.pdf
Size: 4.04 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.