Title: PReMVOS with ConvLSTM; Exploiting Recurrence for Video Object Segmentation
Authors: Hertan, Freddy
Advisors: Boumal, Nicolas; Jha, Niraj
Department: Mathematics
Certificate Program: Applications of Computing Program
Class Year: 2019
Abstract: In this paper, we present a new method for semi-supervised video object segmentation, PReMVOS with ConvLSTM. Given the first-frame ground-truth label, our method automatically generates accurate and consistent pixel masks for objects in the rest of the video sequence. In producing these masks, we build heavily upon the state-of-the-art PReMVOS method that won the DAVIS 2018 Video Object Segmentation Challenge and the YouTube-VOS 1st Large-scale Video Object Segmentation Challenge. A new multi-scale convolutional LSTM (ConvLSTM) module is added to the end of the YouTube-VOS version of PReMVOS in order to incorporate temporal information about mask predictions in previous frames. Because ConvLSTMs have relatively few parameters, we are able to implement this module with little overhead, adding only 0.06 seconds per frame to the run time. Our method improves the mean J score on the DAVIS 2017 validation set over the YouTube-VOS version of PReMVOS, and it approaches the performance of the much slower DAVIS 2018 version of PReMVOS. We thus bridge the performance gap, at least in terms of J score, between the DAVIS and YouTube-VOS versions of PReMVOS without increasing run time considerably. Our method performs best in the single-object case, boosting the mean J score of the YouTube-VOS version of PReMVOS by 0.55 while exhibiting a decrease of only 0.05 in mean F score. We thus demonstrate empirically that videos contain temporal information that can be used to boost segmentation accuracy.
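Note on the J score: in the DAVIS benchmark, J denotes region similarity, the Jaccard index (intersection over union) between a predicted mask and the ground-truth mask, averaged over frames. The following is a minimal sketch of that computation for binary masks, not the thesis's evaluation code. The function names `jaccard` and `mean_j`, the nested-list mask representation, and the treatment of two empty masks as a perfect match are assumptions made for illustration.

```python
def jaccard(pred, truth):
    """Jaccard index |pred AND truth| / |pred OR truth| for two binary masks.

    Masks are given as equally sized nested lists of 0/1 values
    (an assumed representation, chosen to keep the sketch dependency-free).
    """
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def mean_j(pred_frames, truth_frames):
    """Average the per-frame Jaccard index over a sequence of frames."""
    scores = [jaccard(p, t) for p, t in zip(pred_frames, truth_frames)]
    return sum(scores) / len(scores)


# Two 2x2 frames: the first prediction covers one extra pixel,
# the second matches the ground truth exactly.
pred_frames = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
truth_frames = [[[1, 0], [0, 0]], [[1, 0], [0, 0]]]
print(mean_j(pred_frames, truth_frames))
```

Here the per-frame scores are 0.5 and 1.0, so the sequence mean is 0.75. Benchmark results are often quoted on a 0 to 100 scale; this sketch returns values in the range 0 to 1.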
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Mathematics, 1934-2020

Files in This Item:
HERTAN-FREDDY-THESIS.pdf (1.47 MB, Adobe PDF)

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.