Skip navigation
Please use this identifier to cite or link to this item:
Title: Erasure Codes for Optimal Node Repairs in Distributed Storage Systems
Authors: Goparaju, Sreechakra
Advisors: Calderbank, Robert
Contributors: Electrical Engineering Department
Keywords: Coding Theory
Distributed Storage
Erasure Codes
Linear Algebra
Subjects: Electrical engineering
Computer science
Applied mathematics
Issue Date: 2014
Publisher: Princeton, NJ : Princeton University
Abstract: Google, Amazon, and other services store data in multiple geographically separated disks called nodes, among other reasons, to safeguard the data from node failures. Standard techniques for such a distributed way of storage include multiple backups (typically triple replication) or using erasure codes such as Reed-Solomon codes. The latter codes are the most space-efficient for a targeted worst-case number of simultaneous node failures. They are extremely inefficient how- ever for repairing the frequently occurring single node failure. Replication provides the most cost-effective repair in this scenario but ultimately is an unwise option in today's data proliferation. New erasure codes are therefore required to simultaneously optimize storage efficiency, worst-case resilience and repair costs for single node failures. This dissertation looks at two such erasure codes: regenerating codes, which optimize the communication costs, and locally repairable codes (LRCs), which optimize the I/O costs (number of nodes contacted). Regenerating codes store a file of size M on n nodes and trade-off the amount of data stored α per node for the amount of bandwidth γ used to repair a node. This dissertation presents new code constructions and thereby, state-of-the-art inner bounds for this trade-off region. A lower bound is also provided for α for codes achieving the optimal storage point in the trade-off, signifying the necessity of storing an exponential number of symbols. Ideas developed in this analysis have been applied to establish the optimal file size that can be securely stored in the presence of an eavesdropper, when the corresponding regenerating code is at the optimal storage point. Locally repairable codes, on the other hand, can be viewed as classical erasure codes of dimension k, length n, distance d and a new parameter r, called locality. In storage parlance, an LRC of locality r stores a file of size k on n nodes such that when a node fails, there exist r other nodes that suffice to reconstruct the failed node. Previous considerations on optimality have largely ignored the finite field involved. This dissertation provides codes on the binary field that optimize k for certain families of parameters n, d, and r.
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections:Electrical Engineering

Files in This Item:
File Description SizeFormat 
Goparaju_princeton_0181D_11181.pdf1.35 MBAdobe PDFView/Download

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.