Skip navigation
Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01n583xx224
 Title: Erasure Codes for Optimal Node Repairs in Distributed Storage Systems Authors: Goparaju, Sreechakra Advisors: Calderbank, Robert Contributors: Electrical Engineering Department Keywords: Coding TheoryDistributed StorageErasure CodesLinear AlgebraSecurity Subjects: Electrical engineeringComputer scienceApplied mathematics Issue Date: 2014 Publisher: Princeton, NJ : Princeton University Abstract: Google, Amazon, and other services store data in multiple geographically separated disks called nodes, among other reasons, to safeguard the data from node failures. Standard techniques for such a distributed way of storage include multiple backups (typically triple replication) or using erasure codes such as Reed-Solomon codes. The latter codes are the most space-efficient for a targeted worst-case number of simultaneous node failures. They are extremely inefficient how- ever for repairing the frequently occurring single node failure. Replication provides the most cost-effective repair in this scenario but ultimately is an unwise option in today's data proliferation. New erasure codes are therefore required to simultaneously optimize storage efficiency, worst-case resilience and repair costs for single node failures. This dissertation looks at two such erasure codes: regenerating codes, which optimize the communication costs, and locally repairable codes (LRCs), which optimize the I/O costs (number of nodes contacted). Regenerating codes store a file of size M on n nodes and trade-off the amount of data stored α per node for the amount of bandwidth γ used to repair a node. This dissertation presents new code constructions and thereby, state-of-the-art inner bounds for this trade-off region. A lower bound is also provided for α for codes achieving the optimal storage point in the trade-off, signifying the necessity of storing an exponential number of symbols. Ideas developed in this analysis have been applied to establish the optimal file size that can be securely stored in the presence of an eavesdropper, when the corresponding regenerating code is at the optimal storage point. Locally repairable codes, on the other hand, can be viewed as classical erasure codes of dimension k, length n, distance d and a new parameter r, called locality. In storage parlance, an LRC of locality r stores a file of size k on n nodes such that when a node fails, there exist r other nodes that suffice to reconstruct the failed node. Previous considerations on optimality have largely ignored the finite field involved. This dissertation provides codes on the binary field that optimize k for certain families of parameters n, d, and r. URI: http://arks.princeton.edu/ark:/88435/dsp01n583xx224 Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog Type of Material: Academic dissertations (Ph.D.) Language: en Appears in Collections: Electrical Engineering

Files in This Item:
File Description SizeFormat
Goparaju_princeton_0181D_11181.pdf1.35 MBAdobe PDF

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.