Skip navigation
Please use this identifier to cite or link to this item:
Title: ASAP: Automatic Speculative Acyclic Parallelization for Clusters
Authors: Kim, Hanjun
Advisors: August, David I
Contributors: Computer Science Department
Keywords: Automatic parallelization
Transactional memory
Subjects: Computer science
Issue Date: 2013
Publisher: Princeton, NJ : Princeton University
Abstract: While clusters of commodity servers and switches are the most popular form of large-scale parallel computers, many programs are not easily parallelized for clusters due to high inter-node communication cost and lack of globally shared memory. Speculative Decoupled Software Pipelining (Spec-DSWP) is a promising automatic parallelization technique for clusters that speculatively partitions a loop into multiple threads that communicate in a pipelined manner. Speculation can complement conservative static analysis, making automatic parallelization more robust and applicable. Pipelining allows Spec-DSWP to speculate only rarely occurring dependences while respecting the other dependences through communication among threads. Acyclic communication patterns in pipelining make the parallelized programs tolerant of high communication latency of clusters. However, since Spec-DSWP partitions a loop iteration (a transaction) into multiple sub-transactions across multiple threads according to the pipeline stages, a special runtime system is required that supports multi-threaded transactions (MTXs). This dissertation proposes the Automatic Speculative Acyclic Parallelization (ASAP) system that enables Spec-DSWP for clusters without any hardware modification. The ASAP system supports various speculation techniques that require different validation and communication costs, and automatically parallelizes sequential loops using the Spec-DSWP transformation with the optimal application of the speculation techniques. The ASAP system efficiently supports MTXs to correctly execute the speculatively transformed programs on clusters. With synergistic combination of speculation, acyclic communication, and runtime system support, this approach achieves or demonstrates a path to achieve scalable performance speedup up to 109x for a wide range of applications on clusters without any hardware modification.
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections:Computer Science

Files in This Item:
File Description SizeFormat 
Kim_princeton_0181D_10623.pdf1.7 MBAdobe PDFView/Download

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.