Novel framework for cache management

Researchers from the Tufts Computer Architecture Laboratory develop a new framework to produce contention-aware, lightweight repartitioning algorithms.
Associate Professor Mark Hempstead and PhD candidate Cesar Gomes in the Tufts Computer Architecture Laboratory. Inset: PhD candidate Maziar Amiraski.

As the number of cores on a chip continues to rise, pressure grows on scarce shared computing resources. Demand is especially high for the last-level cache, or system cache, which reduces power consumption while increasing available bandwidth. Administrators and programmers often find themselves pushing to wring as much performance and computing power as possible from their systems.

It is with those needs in mind that a Tufts University Department of Electrical and Computer Engineering research team developed Contention Analysis in Shared Hierarchies with Thefts, or CASHT. CASHT is a framework to produce contention-aware, lightweight repartitioning algorithms.

A computing cache stores data for future use, allowing that data to be delivered quickly when it is requested again. It's why, for instance, users might be asked to clear their web browser's cache to see the most recent version of a webpage after changes have been made: their cache stored an earlier version of the page.
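As a minimal sketch of the idea (not from the paper), the toy Python cache below keeps the most recently used items and evicts the oldest when it fills up; a hit returns data quickly, while a miss forces a slower fetch:

```python
from collections import OrderedDict

class LRUCache:
    """Toy cache: keeps the most recently used items, evicts the oldest."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()  # key -> value, ordered by recency

    def get(self, key):
        if key in self.store:
            self.store.move_to_end(key)  # mark as recently used
            return self.store[key]       # cache hit: fast path
        return None                      # cache miss: caller must fetch from slow storage

    def put(self, key, value):
        if key in self.store:
            self.store.move_to_end(key)
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used entry

cache = LRUCache(capacity=2)
cache.put("page.html", "<old version>")
print(cache.get("page.html"))  # hit: returns the stored (possibly stale) copy
```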

CPU designers like Intel use cache partitioning to designate a portion of the cache for the exclusive use of one application, ensuring that the application runs well. Picking the size of this partition is tricky: too small, and the application gains little benefit from the cache; too large, and the other workloads start missing in the cache and running slower. A miss happens when a request for a block of data cannot be filled by the cache. Traditional methods of measuring misses assume that only a single workload is running, which is often untrue as computing workloads grow in complexity. The CASHT researchers therefore chose not to measure contention indirectly through variations in misses, nor to measure it directly through events that do not account well for shared cache occupancy.
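The sizing tradeoff can be made concrete with a toy calculation (the miss curves below are invented for illustration, not taken from the paper): given estimates of how many misses each workload would suffer at every possible allocation of cache ways, a simple partitioner picks the split that minimizes total misses.

```python
# Hypothetical miss counts (in millions) for each possible number of cache
# ways a workload might receive (index 0 through 8); real systems estimate
# such curves with monitoring hardware.
misses_app_a = [90, 40, 25, 20, 18, 17, 17, 17, 17]  # app A: a few ways suffice
misses_app_b = [60, 55, 50, 44, 37, 29, 22, 16, 12]  # app B: keeps benefiting

total_ways = 8
best = min(
    ((a_ways, total_ways - a_ways) for a_ways in range(total_ways + 1)),
    key=lambda split: misses_app_a[split[0]] + misses_app_b[split[1]],
)
print(f"Give app A {best[0]} ways and app B {best[1]} ways")  # -> 2 and 6
```

Here app A stops benefiting after a few ways, so the best split hands the rest to app B; lightweight schemes like CASHT aim to reach similar decisions with much cheaper measurement.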

Instead, the researchers used inter-core evictions, which they call thefts, to track cache contention. They identified the cause and effect of each cache eviction (the release of a data block from the cache to make room for new data) to determine whether it stemmed from the workload's own accesses or from another workload's. They also developed an Agnostic Contention Estimation (ACE) method to detect when partitions prevent thefts.
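A rough Python illustration of the concept, not the paper's implementation: tag each cached block with the core that inserted it, and when an insertion from one core forces out a block owned by another, count the eviction as a theft.

```python
class TheftCountingCache:
    """Toy shared cache: counts evictions caused by a different core (thefts)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = {}          # address -> owning core, in insertion order
        self.thefts = 0
        self.self_evictions = 0

    def access(self, core, address):
        if address in self.blocks:
            return  # hit: nothing to evict
        if len(self.blocks) >= self.capacity:
            victim_addr, victim_owner = next(iter(self.blocks.items()))  # evict oldest
            del self.blocks[victim_addr]
            if victim_owner != core:
                self.thefts += 1          # inter-core eviction: contention
            else:
                self.self_evictions += 1  # workload evicted its own data
        self.blocks[address] = core

cache = TheftCountingCache(capacity=2)
cache.access(0, "A"); cache.access(0, "B")
cache.access(1, "C")  # core 1 evicts core 0's block: a theft
print(cache.thefts)   # -> 1
```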

The research team found that CASHT compared favorably to Utility-based Cache Partitioning (UCP), a well-known repartitioning algorithm already in wide use. Collecting cache statistics through probabilistic sampling, they trained a model to learn and predict the most efficient ways to partition a cache, then scaled the solution to more than two cores. The lightweight gradient-boosted-tree (GBT) supervised learning models they trained removed the need for extra logic overhead and extra time for data collection.
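As a loose sketch of how such a model might be trained offline (the features, labels, and toy rule below are invented for illustration and are not the paper's actual inputs), scikit-learn's gradient-boosted trees can map sampled cache statistics to a recommended partition size:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Invented training data: each row holds sampled cache statistics for a
# workload (here: miss rate, cache occupancy, theft rate), and the label is
# the number of cache ways a partitioner should grant that workload.
features = rng.random((500, 3))        # [miss_rate, occupancy, theft_rate]
labels = np.round(8 * features[:, 2])  # toy rule: more thefts, more ways

model = GradientBoostingRegressor(n_estimators=50, max_depth=3)
model.fit(features, labels)

# At run time, cheap sampled counters feed the small tree model rather than
# dedicated monitoring hardware tracking every access.
sample = np.array([[0.3, 0.5, 0.9]])            # workload with a high theft rate
print(round(float(model.predict(sample)[0])))   # suggested number of ways
```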

Ultimately, the research team found that the best partitioning solutions do not come from contention-unaware data. Using probabilistic sampling, a new algorithm for scaling a GBT model to workloads of more than two cores, and the Agnostic Contention Estimation method, CASHT presents a novel framework for contention analysis and cache management.

Read more in “CASHT: Contention Analysis in Shared Hierarchies,” published in ACM Transactions on Architecture and Code Optimization by Tufts PhD candidates Cesar Gomes and Maziar Amiraski and Associate Professor Mark Hempstead.