Skip navigation
Please use this identifier to cite or link to this item:
Title: Systems and Algorithms for High-Performance, Cost-Efficient Key-Value Storage
Authors: Li, Xiaozhou
Advisors: Freedman, Michael J
Contributors: Computer Science Department
Keywords: concurrent hash table
hardware transactional memory
key-value storage
load balancing
OpenFlow switching
Subjects: Computer science
Issue Date: 2016
Publisher: Princeton, NJ : Princeton University
Abstract: Key-value storage systems are increasingly essential building blocks of modern cloud and big data applications. The workloads these systems support often require random access to small objects over massive datasets with highly skewed and dynamic key popularity. It is challenging for a storage cluster to serve these workloads with both high performance and low-cost operations. Today's systems usually sacrifice one for the other. In this dissertation, we present novel approaches to improve both the performance and cost-efficiency of key-value systems by combining new hardware and software techniques with careful architectural design and algorithmic optimizations. First, at cluster scale, we build SwitchKV, a heterogeneous system that uses small high-end cache nodes to guarantee load balancing across many SSD-based backend nodes under nearly-arbitrary workloads. The cache nodes absorb the hottest queries so that no individual backend node is over-burdened or underutilized. The system exploits OpenFlow switches to enable efficient content-aware routing so that it can achieve scalable high throughput, low tail latency, and high availability. It uses new algorithms to keep the cache and switch forwarding rules updated with low overhead, and to ensure stable high performance under rapidly changing workloads. SwitchKV can meet the service level objectives for many cloud services more efficiently than traditional systems. Second, to improve the efficiency of each individual multi-core server, we build a high-throughput and memory-efficient concurrent hash table based around optimistic cuckoo hashing. Our re-design minimizes critical section length, reduces interprocessor coherence traffic, and enables effective prefetching through careful algorithm and data structure engineering. We explore hardware transactional memory and fine-grained locking for concurrency control, and find that both of them require the same level of algorithmic efforts to achieve high performance. Our new hash table design greatly outperforms other optimized concurrent hash tables for both read- and write-heavy workloads, even while using substantially less memory for small key-value items.
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog:
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections:Computer Science

Files in This Item:
File Description SizeFormat 
Li_princeton_0181D_11698.pdf2.02 MBAdobe PDFView/Download

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.