Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp014t64gn30f
DC Field | Value | Language
dc.contributor.author | Huang, Jialu | en_US
dc.contributor.other | Computer Science Department | en_US
dc.date.accessioned | 2013-09-16T17:27:07Z | -
dc.date.available | 2013-09-16T17:27:07Z | -
dc.date.issued | 2013 | en_US
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp014t64gn30f | -
dc.description.abstract | Harnessing the performance potential of multicore processors requires scalable parallel programs. Automatic parallelization techniques are a promising approach for producing well-performing parallel programs. Nevertheless, most existing techniques parallelize only independent loops and insert global synchronizations at the end of each loop invocation. For programs with few loop invocations, these global synchronizations do not limit parallel execution performance. However, for programs with many loop invocations, those synchronizations can easily become the performance bottleneck since they frequently force all threads to wait, losing potential parallelization opportunities. To address this problem, some automatic parallelization techniques apply static analyses to enable cross-invocation parallelization. Instead of waiting, threads can execute iterations from follow-up invocations if they do not cause any conflict. However, static analysis must be conservative and cannot handle irregular dependence patterns manifested by particular program inputs at runtime. In order to enable more parallelization across loop invocations, this thesis presents two novel automatic parallelization techniques: DOMORE and SpecCross. Unlike existing techniques relying on static analyses, these two techniques take advantage of runtime information to achieve much more aggressive parallelization. DOMORE constructs a custom runtime engine which non-speculatively observes dependences at runtime and synchronizes iterations only when necessary, while SpecCross applies software speculative barriers to permit some of the threads to execute past the invocation boundaries. The two techniques are complementary in the sense that they can parallelize programs with potentially very different characteristics. SpecCross, with less runtime overhead, works best when programs' cross-invocation dependences seldom cause any runtime conflict. DOMORE, on the other hand, has its advantage in handling dependences which cause frequent conflicts. Evaluating implementations of DOMORE and SpecCross demonstrates that both techniques can achieve much better scalability compared to existing automatic parallelization techniques. Among twenty programs from seven benchmark suites, DOMORE is automatically applied to parallelize six of them and achieves a geomean speedup of 2.1× over code without cross-invocation parallelization and 3.2× over the original sequential performance on 24 cores. SpecCross is found to be applicable to eight of the programs and it achieves a geomean speedup of 4.6× over the best sequential execution, which compares favorably to a 1.3× speedup obtained by parallel execution without any cross-invocation parallelization. | en_US
dc.language.iso | en | en_US
dc.publisher | Princeton, NJ : Princeton University | en_US
dc.relation.isformatof | The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: http://catalog.princeton.edu | en_US
dc.subject | Automatic Parallelization | en_US
dc.subject | Cross-Invocation | en_US
dc.subject | Runtime | en_US
dc.subject | Speculative | en_US
dc.subject.classification | Computer science | en_US
dc.title | AUTOMATICALLY EXPLOITING CROSS-INVOCATION PARALLELISM USING RUNTIME INFORMATION | en_US