Systems and Algorithms for High-Performance, Cost-Efficient Key-Value Storage

Li, Xiaozhou

Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp0147429c55w

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Freedman, Michael J	-
dc.contributor.author	Li, Xiaozhou	-
dc.contributor.other	Computer Science Department	-
dc.date.accessioned	2016-06-09T15:00:47Z	-
dc.date.available	2016-06-09T15:00:47Z	-
dc.date.issued	2016	-
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/dsp0147429c55w	-
dc.description.abstract	Key-value storage systems are increasingly essential building blocks of modern cloud and big data applications. The workloads these systems support often require random access to small objects over massive datasets with highly skewed and dynamic key popularity. It is challenging for a storage cluster to serve these workloads with both high performance and low-cost operations. Today's systems usually sacrifice one for the other. In this dissertation, we present novel approaches to improve both the performance and cost-efficiency of key-value systems by combining new hardware and software techniques with careful architectural design and algorithmic optimizations. First, at cluster scale, we build SwitchKV, a heterogeneous system that uses small high-end cache nodes to guarantee load balancing across many SSD-based backend nodes under nearly-arbitrary workloads. The cache nodes absorb the hottest queries so that no individual backend node is over-burdened or underutilized. The system exploits OpenFlow switches to enable efficient content-aware routing so that it can achieve scalable high throughput, low tail latency, and high availability. It uses new algorithms to keep the cache and switch forwarding rules updated with low overhead, and to ensure stable high performance under rapidly changing workloads. SwitchKV can meet the service level objectives for many cloud services more efficiently than traditional systems. Second, to improve the efficiency of each individual multi-core server, we build a high-throughput and memory-efficient concurrent hash table based around optimistic cuckoo hashing. Our re-design minimizes critical section length, reduces interprocessor coherence traffic, and enables effective prefetching through careful algorithm and data structure engineering. We explore hardware transactional memory and fine-grained locking for concurrency control, and find that both of them require the same level of algorithmic efforts to achieve high performance. Our new hash table design greatly outperforms other optimized concurrent hash tables for both read- and write-heavy workloads, even while using substantially less memory for small key-value items.	-
dc.language.iso	en	-
dc.publisher	Princeton, NJ : Princeton University	-
dc.relation.isformatof	The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: http://catalog.princeton.edu/	-
dc.subject	concurrent hash table	-
dc.subject	cost-efficiency	-
dc.subject	hardware transactional memory	-
dc.subject	key-value storage	-
dc.subject	load balancing	-
dc.subject	OpenFlow switching	-
dc.subject.classification	Computer science	-
dc.title	Systems and Algorithms for High-Performance, Cost-Efficient Key-Value Storage	-
dc.type	Academic dissertations (Ph.D.)	-
pu.projectgrantnumber	690-2143	-
Appears in Collections:	Computer Science

Files in This Item:

File	Description	Size	Format
Li_princeton_0181D_11698.pdf		2.02 MB	Adobe PDF	View/Download

Show simple item record

Search

Browse