Video Synthesis: Binary Masks to Frames via DeepInversion

Santhanam, Hari

Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01wd376034b

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Jha, Niraj
dc.contributor.author	Santhanam, Hari
dc.date.accessioned	2020-10-02T21:30:23Z	-
dc.date.available	2020-10-02T21:30:23Z	-
dc.date.created	2020-05-04
dc.date.issued	2020-10-02	-
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/dsp01wd376034b	-
dc.description.abstract	Machine learning models are trained on data so that they can make real-world predictions, but acquiring that training data can be arduous. For example, some models are developed and trained on privacy-protected data, which makes those datasets inaccessible to researchers seeking to train new models. If the training data can be recovered from a pre-trained model, this can greatly aid knowledge transfer. In this thesis, we use a recently developed technique called DeepInversion to synthesize video training data. DeepInversion is applied to invert a Mask R-CNN architecture in order to produce synthetic frames of videos in the DAVIS dataset. We perform input optimization from random noise to high-fidelity frames. Specifically, we optimize a classification loss, defined between ground-truth and predicted coarse masks, together with auxiliary losses that minimize noise and the differences in batch normalization statistics. We run the optimization for 2k iterations using the Adam optimizer with a learning rate of 0.1. The viability of our method is tested on the first frames of many videos in the DAVIS set, with different auxiliary loss scaling values for each frame. Finally, we synthesize many frames of the 'bear' video and string them together to produce a synthetic video. The ideas developed in this thesis can benefit federated learning, privacy-protected data acquisition, and lower-latency model training.
dc.format.mimetype	application/pdf
dc.language.iso	en
dc.title	Video Synthesis: Binary Masks to Frames via DeepInversion
dc.type	Princeton University Senior Theses
pu.date.classyear	2020
pu.department	Electrical Engineering
pu.pdf.coverpage	SeniorThesisCoverPage
pu.contributor.authorid	961248044
Appears in Collections:	Electrical and Computer Engineering, 1932-2023
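
The input-optimization loop summarized in the abstract can be illustrated with a toy sketch. This is not the thesis code and does not use Mask R-CNN or DAVIS: the frozen "model" is a hypothetical per-pixel logistic mask predictor on a 1-D signal, the stored statistics BN_MEAN and BN_VAR merely stand in for batch normalization running statistics, the smoothness penalty stands in for the noise-minimizing loss, and the scaling values ALPHA_TV and ALPHA_BN are assumptions. Only the 2k iterations, the learning rate of 0.1, and the choice of Adam come from the abstract.

```python
import math
import random

# Hypothetical frozen "model": a per-pixel logistic mask predictor.
# The thesis inverts Mask R-CNN; this toy is only a stand-in.
W, B = 4.0, 0.0
# Hypothetical stored statistics, standing in for batch-norm running mean/variance.
BN_MEAN, BN_VAR = 0.0, 1.0
# Hypothetical scaling values for the two auxiliary losses.
ALPHA_TV, ALPHA_BN = 0.01, 0.1


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def loss_and_grad(x, mask):
    """Total loss and its gradient with respect to the input x (model stays frozen)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    # Statistic-matching loss: input statistics vs. stored statistics.
    loss = ALPHA_BN * ((mean - BN_MEAN) ** 2 + (var - BN_VAR) ** 2)
    grad = [0.0] * n
    for i in range(n):
        z = W * x[i] + B
        # Classification loss: binary cross-entropy between target and predicted mask.
        loss += (softplus(-z) if mask[i] else softplus(z)) / n
        grad[i] += (sigmoid(z) - mask[i]) * W / n
        grad[i] += ALPHA_BN * (2 * (mean - BN_MEAN) / n
                               + 2 * (var - BN_VAR) * 2 * (x[i] - mean) / n)
        # Smoothness (noise-reducing) loss on neighbouring pixels.
        if i + 1 < n:
            d = x[i + 1] - x[i]
            loss += ALPHA_TV * d * d
            grad[i] -= 2 * ALPHA_TV * d
            grad[i + 1] += 2 * ALPHA_TV * d
    return loss, grad


def synthesize(mask, steps=2000, lr=0.1, seed=0):
    """Optimize an input from random noise with a hand-written Adam update."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in mask]  # start from random noise
    m = [0.0] * len(x)
    v = [0.0] * len(x)
    b1, b2, eps = 0.9, 0.999, 1e-8
    history = []
    for t in range(1, steps + 1):
        loss, g = loss_and_grad(x, mask)
        history.append(loss)
        for i in range(len(x)):
            m[i] = b1 * m[i] + (1 - b1) * g[i]
            v[i] = b2 * v[i] + (1 - b2) * g[i] * g[i]
            m_hat = m[i] / (1 - b1 ** t)
            v_hat = v[i] / (1 - b2 ** t)
            x[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    history.append(loss_and_grad(x, mask)[0])
    return x, history


target = [0, 0, 1, 1, 1, 0, 0, 0]  # toy binary mask
frame, history = synthesize(target)
predicted = [1 if sigmoid(W * v + B) > 0.5 else 0 for v in frame]
```

In this sketch only the input is updated; the model parameters W and B never change, which is the sense in which the optimization "inverts" a pre-trained model. Comparing `predicted` with `target` and inspecting `history` shows whether the synthesized input reproduces the mask and how the loss evolves.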

Files in This Item:

File	Description	Size	Format
SANTHANAM-HARI-THESIS.pdf		4.63 MB	Adobe PDF
