Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Calderbank, Robert | en_US
dc.contributor.author | Goparaju, Sreechakra | en_US
dc.contributor.other | Electrical Engineering Department | en_US
dc.description.abstract | Google, Amazon, and other services store data on multiple geographically separated disks, called nodes, in part to safeguard the data against node failures. Standard techniques for such distributed storage include multiple backups (typically triple replication) and erasure codes such as Reed-Solomon codes. The latter are the most space-efficient codes for a targeted worst-case number of simultaneous node failures. They are, however, extremely inefficient at repairing the frequently occurring single node failure. Replication provides the most cost-effective repair in this scenario but is ultimately an unwise option given today's data proliferation. New erasure codes are therefore required to simultaneously optimize storage efficiency, worst-case resilience, and repair costs for single node failures. This dissertation looks at two such erasure codes: regenerating codes, which optimize the communication costs, and locally repairable codes (LRCs), which optimize the I/O costs (number of nodes contacted). Regenerating codes store a file of size M on n nodes and trade off the amount of data α stored per node against the amount of bandwidth γ used to repair a node. This dissertation presents new code constructions and, thereby, state-of-the-art inner bounds for this trade-off region. A lower bound is also provided on α for codes achieving the optimal storage point in the trade-off, signifying the necessity of storing an exponential number of symbols. Ideas developed in this analysis have been applied to establish the optimal file size that can be securely stored in the presence of an eavesdropper when the corresponding regenerating code is at the optimal storage point. Locally repairable codes, on the other hand, can be viewed as classical erasure codes of dimension k, length n, and distance d, with a new parameter r, called locality.
In storage parlance, an LRC of locality r stores a file of size k on n nodes such that when a node fails, there exist r other nodes that suffice to reconstruct the failed node. Previous considerations of optimality have largely ignored the finite field involved. This dissertation provides codes over the binary field that optimize k for certain families of parameters n, d, and r. | en_US
dc.publisher | Princeton, NJ : Princeton University | en_US
dc.relation.isformatof | The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog. | en_US
dc.subject | Coding Theory | en_US
dc.subject | Distributed Storage | en_US
dc.subject | Erasure Codes | en_US
dc.subject | Linear Algebra | en_US
dc.subject.classification | Electrical engineering | en_US
dc.subject.classification | Computer science | en_US
dc.subject.classification | Applied mathematics | en_US
dc.title | Erasure Codes for Optimal Node Repairs in Distributed Storage Systems | en_US
dc.type | Academic dissertations (Ph.D.) | en_US
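
The locality property described in the abstract (a failed node is rebuilt from r other nodes) can be illustrated with a toy binary code. The sketch below is not one of the dissertation's constructions: it assumes each node stores a single bit, splits the k data bits into groups of r, and adds one XOR parity per group. The function names `encode` and `repair` are hypothetical and chosen only for this illustration.

```python
from functools import reduce
from operator import xor
from typing import List


def encode(data: List[int], r: int) -> List[List[int]]:
    """Split the data bits into groups of r and append one XOR parity to each.

    Assumption for this sketch: one bit per node, so the result describes
    n = k + k/r nodes arranged in k/r local groups of r + 1 nodes.
    """
    if r <= 0 or len(data) % r != 0:
        raise ValueError("len(data) must be a positive multiple of r")
    groups = []
    for start in range(0, len(data), r):
        block = data[start:start + r]
        groups.append(block + [reduce(xor, block)])
    return groups


def repair(group: List[int], failed: int) -> int:
    """Rebuild the bit at index `failed` by XOR-ing the r other bits in its group."""
    survivors = [bit for idx, bit in enumerate(group) if idx != failed]
    return reduce(xor, survivors)


# Usage: k = 6 data bits with locality r = 3 give n = 8 nodes in two groups.
groups = encode([1, 0, 1, 1, 0, 0], r=3)
lost_index = 2
recovered = repair(groups[0], lost_index)  # contacts only 3 nodes, not all 6
```

This toy code tolerates only one failure per group, so its minimum distance is 2; the abstract's point is that choosing binary codes that optimize k for given n, d, and r is the harder problem.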
Appears in Collections: Electrical Engineering

Files in This Item:
File | Description | Size | Format
Goparaju_princeton_0181D_11181.pdf | | 1.35 MB | Adobe PDF

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.