Title: Dynamic Techniques for Mitigating Inter- and Intra-Application Cache Interference
Authors: Wu, Carole-Jean
Advisors: Martonosi, Margaret
Contributors: Electrical Engineering Department
Keywords: Cache interference
Chip-multiprocessor caches
Intra-application cache interference
Operating system cache effects
Prefetcher management
Re-reference interval prediction
Subjects: Computer engineering
Issue Date: 2012
Publisher: Princeton, NJ : Princeton University
Abstract: Given the emerging dominance of chip-multiprocessor (CMP) systems, an important research problem concerns application memory performance in the face of deep memory hierarchies, where one or more caches are shared by multiple cores. Often, when several applications compete for capacity in shared caches, the performance of multiprogrammed and parallel workloads degrades significantly and becomes unpredictable. This happens because the commonly used Least-Recently-Used (LRU) replacement policy does not distinguish between processes and their distinct memory needs. As a result, processes often suffer from such inter-application cache interference, and overall system throughput can be slowed down by as much as 55%. In addition to managing multiple applications sharing the last-level cache (LLC), managing a single application's memory performance is far from straightforward even in an idealized setup considering only user accesses. It becomes even more challenging in real-machine environments, where interference can stem from operating system (OS) activities, and even from an application's own prefetch requests and the page table walks caused by Translation Lookaside Buffer (TLB) misses. Using hardware performance counters on existing CMPs, this thesis characterizes such intra-application cache interference in the LLC and shows that application data references account for much less than half of the LLC misses, with hardware prefetching and page table walks causing considerable intra-application cache interference. The primary focus of my thesis is to address the challenges of both inter- and intra-application cache interference through hardware and software design mechanisms.
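To make the LRU interference problem concrete, the following is a minimal illustrative sketch (not taken from the thesis): a tiny fully-associative LRU cache shared by a synthetic application with a small reusable working set and a co-scheduled streaming application. The capacity, trace lengths, and hit rates here are illustrative assumptions, not the thesis's measurements.

```python
# Illustrative sketch (assumptions, not the thesis design): a shared LRU
# cache lets a streaming co-runner evict another application's small,
# reusable working set, because LRU does not distinguish between processes.
from collections import OrderedDict

def run(trace, capacity=6):
    """Simulate a fully-associative LRU cache; return the hit rate."""
    cache, hits = OrderedDict(), 0
    for addr in trace:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # hit: promote to most-recently-used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # miss: evict least-recently-used line
            cache[addr] = True
    return hits / len(trace)

# App A re-reads a 4-line working set that fits in the cache on its own;
# app B streams through memory with no reuse at all.
app_a = [("A", i % 4) for i in range(400)]
app_b = [("B", i) for i in range(400)]

alone = run(app_a)  # near-perfect hit rate in isolation

# Interleave A and B as co-scheduled sharers: A's reuse distance now
# exceeds the shared capacity, so both applications thrash.
shared_trace = [x for pair in zip(app_a, app_b) for x in pair]
shared = run(shared_trace)  # hit rate collapses under interference
```

Under these assumptions, app A's hit rate drops from near 99% alone to essentially zero when sharing, even though its own working set never changed: the streaming co-runner's lines inflate A's LRU reuse distance past the shared capacity.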
My thesis focuses on each of these issues across a range of computing application domains and tackles an overarching research problem: addressing intra- as well as inter-application cache interference, stemming from user applications, the OS, and hardware prefetching, via dynamic management to achieve larger and more predictable performance improvements. The intelligent LLC management proposed in this thesis can speed up execution for a diverse range of applications by taking into account the memory requirements of co-scheduled applications, OS reference characteristics, and hardware prefetching. To mitigate the degree of contention when multiple applications access the shared LLC simultaneously, my thesis proposes OS priority-aware and signature-based cache capacity management techniques. In particular, by correlating memory reuse characteristics with each memory request's unique signature, such as an instruction's program counter or a sequence of instruction types, my thesis demonstrates that the proposed capacity management techniques allow the shared LLC to be utilized more effectively than other state-of-the-art techniques. While inter-application interference has received significant research attention, intra-application interference is less well studied. Based on a detailed characterization of intra-application cache interference in the face of a modern OS and current, aggressive hardware prefetchers on an existing system, my thesis proposes and evaluates dynamic management techniques that address inter- as well as intra-application cache interference. Furthermore, my thesis calls on cache research proposals to carefully account for real-system effects when proposing and evaluating new cache management designs.
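The signature-based capacity management described above can be sketched as follows. This is a minimal, illustrative model in the spirit of signature-based re-reference interval prediction, not the thesis's actual hardware design: the structure names, table sizes, and counter widths are assumptions. A per-signature saturating counter (indexed here by program counter) learns whether lines filled under a signature tend to be reused; lines from low-reuse signatures are inserted with a distant re-reference prediction, so streaming data is evicted quickly instead of displacing reusable working sets.

```python
# Illustrative SHiP-style sketch (assumed parameters, not the thesis design):
# one set of an RRIP-managed cache with a PC-signature hit predictor.

class SignatureAwareCache:
    MAX_RRPV = 3  # 2-bit re-reference prediction value (RRPV)

    def __init__(self, ways=8, shct_entries=1024):
        self.ways = ways
        self.shct_entries = shct_entries
        # Signature History Counter Table: saturating counters, weakly reused.
        self.shct = [1] * shct_entries
        # Each line: [tag, rrpv, fill_signature, reused_since_fill]
        self.lines = []

    def access(self, tag, pc):
        sig = pc % self.shct_entries
        for line in self.lines:
            if line[0] == tag:
                # Hit: predict near-immediate re-reference, train the predictor.
                line[1], line[3] = 0, True
                self.shct[line[2]] = min(self.shct[line[2]] + 1, 3)
                return True
        self._fill(tag, sig)
        return False

    def _fill(self, tag, sig):
        if len(self.lines) >= self.ways:
            # Evict a line whose RRPV has reached the maximum, aging as needed.
            while True:
                victim = max(self.lines, key=lambda l: l[1])
                if victim[1] == self.MAX_RRPV:
                    break
                for line in self.lines:
                    line[1] += 1
            if not victim[3]:
                # Dead on eviction: weaken this signature's reuse prediction.
                self.shct[victim[2]] = max(self.shct[victim[2]] - 1, 0)
            self.lines.remove(victim)
        # Low-reuse signatures are inserted with a distant re-reference
        # prediction, so their lines leave the cache quickly.
        rrpv = self.MAX_RRPV if self.shct[sig] == 0 else self.MAX_RRPV - 1
        self.lines.append([tag, rrpv, sig, False])
```

For example, after a line filled by some PC is re-referenced, that PC's counter rises and its future fills are retained longer; a PC whose lines keep dying unreferenced is driven to zero and its fills become eviction candidates almost immediately.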
Overall, this thesis offers a mix of real-system characterizations with detailed evaluations of hardware and system proposals that can help guide future OS and architecture work regarding the importance and challenges of both inter- and intra-application cache interference. Via signature-based, prefetch- and OS-aware approaches, performance is improved by as much as 65% and with an average of 22% execution time speedup. These techniques establish the importance of making user application performance more predictable in the face of system complexities and deep memory hierarchies.
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog.
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Electrical Engineering

Files in This Item:
File                          Size   Format
Wu_princeton_0181D_10156.pdf  25 MB  Adobe PDF
