Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01hq37vq979
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Ramadge, Peter J. | en_US
dc.contributor.author | Wang, Yun | en_US
dc.contributor.other | Electrical Engineering Department | en_US
dc.date.accessioned | 2015-12-07T19:56:17Z | -
dc.date.available | 2015-12-07T19:56:17Z | -
dc.date.issued | 2015 | en_US
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp01hq37vq979 | -
dc.description.abstract | Recently, the sparse representation of data with respect to a dictionary of features has contributed to successful new methods in machine learning, pattern analysis, statistics and signal processing. At the heart of many sparse representation methods is the least squares problem with l1 regularization, often called the lasso problem. Despite being studied extensively, the applicability of the lasso to large-scale problems has been hindered by its high computational cost. This dissertation investigates feature screening for the lasso problem, targeted at this computational issue. For a given lasso problem, screening quickly identifies a subset of features that will receive zero weight in a solution. These features can be removed from the dictionary, prior to solving the problem, without affecting the optimality of the solution obtained. This has two potential advantages: it reduces the size of the dictionary, allowing the lasso problem to be solved with fewer resources, and it speeds up obtaining a solution. Current classes of one-shot screening tests are based on bounding the dual lasso solution within a sphere or the intersection of a sphere and a half space. We propose an optimal screening test when the dual solution is bounded within the intersection of a sphere and two half spaces, and empirically investigate the trade-off that this test makes between screening power and computational efficiency. We then go beyond the regime of one-shot screening and examine a sequential screening scheme, which solves a sequence of lasso problems leading to one target lasso problem. Using analytical and empirical means, we give insight into how the regularization values of this sequence should be chosen, and show that well-designed sequential screening yields significant improvement in dictionary reduction and computational efficiency for lightly regularized lasso problems. We propose and explore a data adaptive sequential screening (DASS) scheme, which achieves state-of-the-art performance. As one application of DASS, we show how it can also facilitate faster completion of sparse representation decision tasks, such as classification, without affecting statistical accuracy. In particular, for clip-level music genre classification, our proposed method yields improved clip classification accuracy and considerable computational speedup. | en_US
dc.language.iso | en | en_US
dc.publisher | Princeton, NJ : Princeton University | en_US
dc.relation.isformatof | The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: http://catalog.princeton.edu/ | en_US
dc.subject | classification | en_US
dc.subject | feature screening | en_US
dc.subject | feature selection | en_US
dc.subject | lasso | en_US
dc.subject | machine learning | en_US
dc.subject | sparse representation/regression | en_US
dc.subject.classification | Electrical engineering | en_US
dc.subject.classification | Statistics | en_US
dc.subject.classification | Computer science | en_US
dc.title | Feature Screening for the Lasso | en_US
dc.type | Academic dissertations (Ph.D.) | en_US
pu.projectgrantnumber | 690-2143 | en_US
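
To make the screening idea described in the abstract concrete, below is a minimal, illustrative sketch of a one-shot sphere screening test applied before solving the lasso. It is not the dissertation's method: the test shown bounds the dual solution with a single sphere, assumes the objective (1/2)||y - Xw||^2 + lam*||w||_1, and the names X, y, lam, and sphere_screen, as well as the use of scikit-learn's Lasso solver, are all illustrative choices rather than details taken from the thesis.

import numpy as np
from sklearn.linear_model import Lasso

def sphere_screen(X, y, lam):
    # Boolean mask of features that may be active (True = keep).
    # Features failing the test are guaranteed to receive zero weight in the
    # lasso solution, so they can be dropped before solving.
    corr = X.T @ y                          # x_j^T y for every feature j
    lam_max = np.max(np.abs(corr))          # smallest lam giving the all-zero solution
    # Sphere bound on the dual solution: center y/lam, radius ||y||*(1/lam - 1/lam_max).
    # Feature j is safely discarded when |x_j^T theta| < 1 over the whole sphere.
    slack = np.linalg.norm(y) * np.linalg.norm(X, axis=0) * (lam_max - lam) / lam_max
    return np.abs(corr) >= lam - slack

# Usage: screen, solve the smaller lasso problem, then scatter the weights back.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5000))        # hypothetical dictionary and response
y = rng.standard_normal(200)
lam = 0.95 * np.max(np.abs(X.T @ y))        # heavily regularized target problem

keep = sphere_screen(X, y, lam)
model = Lasso(alpha=lam / len(y), fit_intercept=False)  # sklearn divides the loss by n
model.fit(X[:, keep], y)

w = np.zeros(X.shape[1])
w[keep] = model.coef_
print(f"kept {keep.sum()} of {X.shape[1]} features")

On data like this, a one-shot sphere test only discards many features when lam is close to lam_max; for lightly regularized problems it keeps most of the dictionary, which is the limitation that motivates the sequential and data adaptive screening schemes studied in the dissertation.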
Appears in Collections: Electrical Engineering

Files in This Item:
File | Description | Size | Format
Wang_princeton_0181D_11528.pdf | | 3.16 MB | Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.