Title: Streaming Computation on Error-Prone Programmable Platforms
Authors: Yetim, Yavuz
Advisors: Martonosi, Margaret; Malik, Sharad
Contributors: Electrical Engineering Department
Subjects: Electrical engineering
Issue Date: 2015
Publisher: Princeton, NJ : Princeton University
Abstract: As semiconductor technology scales towards ever-smaller transistor sizes, hardware fault rates are increasing due to process variation, reduced noise margins, aging effects, and increased susceptibility to soft errors. Reliability can be regained through redundancy, error checking with recovery, voltage scaling and other means, but these techniques impose area/energy costs. Since important application classes (e.g., multimedia, streaming workloads) are data-error-tolerant, recent research has proposed techniques that seek to save energy or improve yield by exploiting error tolerance at the architecture/microarchitecture level. So far, reliability research has largely focused on errors affecting program data and is not general enough to handle arbitrary bit errors. Notably, although some applications may be tolerant to errors affecting the program data, e.g., image pixel value errors, error-prone programmable platforms may experience errors that corrupt the control flow or even cause exceptions that terminate the program. When accounting for the data and control-flow dependencies, approximately two-thirds of instructions can lead to such crashes or unresponsive states. In response, I propose coarse-grain protection mechanisms to detect catastrophic outcomes and guide the application execution. These mechanisms protect the system against crashes, unresponsiveness, and external device corruptions, and also provide support for achieving acceptable quality. For example, coarse-grain control-flow protection mechanisms ensure the sequencing of time-bounded coarse-grain compute operations. Similarly, errors may cause data misalignments that degrade the output quality permanently in parallel streaming applications running on error-prone processors, but the coarse-grain protection mechanisms use explicit communication directives in high-level programming languages, such as StreamIt, to pad or discard data for realignment.
Overall, I propose coarse-grain protection mechanisms that convert potentially fatal errors to potentially tolerable data errors instead of ensuring instruction-level or byte-level correctness. In summary, this thesis addresses requirements for error-tolerant execution by proposing and evaluating techniques for running data-error-tolerant streaming applications on general-purpose processors built from an unreliable fabric. My studies show how low-overhead microarchitectural modules can use coarse-grain application information to enable streaming computation on error-prone processors. As a result, both sequential and parallel applications can provide good output quality on partially protected uniprocessors and on multicore processors composed of partially protected uniprocessor cores, respectively.
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Electrical Engineering

Files in This Item:
File: Yetim_princeton_0181D_11239.pdf
Size: 2.42 MB
Format: Adobe PDF

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.