Skip navigation
Please use this identifier to cite or link to this item:
Title: Optimizing Content Distribution Network Caches with Machine Learning
Advisors: LloydLi, WyattKai
Contributors: Computer Science Department
Subjects: Computer science
Issue Date: 2023
Publisher: Princeton, NJ : Princeton University
Abstract: Content Distribution Networks (CDNs) play a pivotal role in Internet traffic. A key part of this caching mechanism is the eviction algorithm that handles the replacement of old cached objects. The effectiveness of the eviction algorithm significantly influences CDN performance. This dissertation explores the application of machine learning (ML) to optimize cache eviction algorithms in CDNs. The central questions addressed in this work are: how to utilize ML to devise an eviction algorithm that surpasses existing heuristics on byte miss ratio, and how to mitigate the CPU overhead while enhancing the robustness of a learned cache in large-scale deployment. Two major challenges faced in the design of a learning-based cache eviction algorithm include heterogeneous user access patterns across different locations and times, and computational and space overheads. To address these challenges, we developed two ML-based eviction algorithms, Learning Relaxed Belady (LRB) and Heuristic Aided Learned Preference (HALP). LRB is the first CDN cache algorithm to directly approximate the Belady MIN (oracle) algorithm by learning access patterns, providing a significant improvement over traditional eviction algorithms. It demonstrated a WAN traffic reduction of 4-25\% across six production CDN traces in our simulation. HALP, on the other hand, achieves low CPU overhead and robust DRAM byte miss ratio improvement by augmenting a heuristic policy with ML. It has shown to reduce DRAM byte miss during peak by an average of 9.1\%, with a modest CPU overhead of 1.8\%, while deployed in YouTube CDN production clusters. This study contributes towards using machine learning to develop robust cache eviction algorithms with low miss ratios and low overheads, thereby enhancing the efficiency of CDNs. The findings of this research have been applied in industry deployment with significant production impact.
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections:Computer Science

Files in This Item:
File Description SizeFormat 
SONG_princeton_0181D_14731.pdf3.66 MBAdobe PDFView/Download

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.