Cache Evicting


Cache Evicting Overview

As the storage space used by Alluxio is limited, the Cache Evicting feature evicts old data through several strategies to ensure that there is enough storage space to cache new data.

Alluxio evicts cached data in two ways:

  • Evict on Writing
  • Background Asynchronous Evicting

Evict on Writing

Evict on writing synchronously checks for and evicts cached data while pages are being written to Alluxio. Eviction is triggered when Alluxio is about to write a page that would cause the total cache size to exceed the storage capacity.
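
As a rough illustration of this write path, the following Python sketch models a page store that evicts synchronously inside put. It is not Alluxio's implementation: the PageCache class, its method names, and the byte-based accounting are assumptions made for illustration, and LRU ordering is used only because LRUCacheEvictor is the default.

```python
from collections import OrderedDict


class PageCache:
    """Toy page cache that evicts synchronously on write (hypothetical, LRU order)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.pages = OrderedDict()  # page_id -> size in bytes, least recently used first

    def get(self, page_id):
        """Return the page size, marking the page as most recently used."""
        if page_id not in self.pages:
            return None
        self.pages.move_to_end(page_id)
        return self.pages[page_id]

    def put(self, page_id, size):
        """Write a page, evicting on the write path; return the evicted page IDs."""
        if size > self.capacity_bytes:
            raise ValueError("page is larger than the cache capacity")
        if page_id in self.pages:
            # Overwriting a page: release its old space first.
            self.used_bytes -= self.pages.pop(page_id)
        evicted = []
        # The writer waits here until enough space has been freed.
        while self.used_bytes + size > self.capacity_bytes:
            victim, victim_size = self.pages.popitem(last=False)
            self.used_bytes -= victim_size
            evicted.append(victim)
        self.pages[page_id] = size
        self.used_bytes += size
        return evicted
```

Because the eviction loop runs inside put, every write that hits a full cache pays the eviction cost before it can complete.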

Cache Evictors

Alluxio provides the following five evictors to evict cached data:

  • LRUCacheEvictor (default): LRU cache eviction policy
  • FIFOCacheEvictor: FIFO cache eviction policy
  • LFUCacheEvictor: LFU cache eviction policy. Pages are grouped into buckets by the logarithm of their count, and the buckets are ordered by that value. Pages within a bucket are kept in LRU order.
  • NondeterministicLRUCacheEvictor: LRU cache eviction policy with non-deterministic selection. It evicts an element chosen uniformly from the LRU tail.
  • TwoChoiceRandomEvictor: Two Choice Random client-side cache eviction policy. It selects two random page IDs and evicts the less recently used of the two.
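
To make the last policy concrete, here is a minimal Python sketch of two-choice random selection. The function name, the last_access mapping (page ID to last access time), and sampling with replacement are assumptions for illustration; the actual TwoChoiceRandomEvictor may differ in detail.

```python
import random


def two_choice_victim(last_access, rng=random):
    """Pick two random page IDs and return the less recently used one.

    last_access: dict mapping page_id -> last access timestamp (hypothetical).
    rng: any object with a choice(seq) method; defaults to the random module.
    """
    page_ids = list(last_access)
    if not page_ids:
        return None
    # Assumption: the two candidates are sampled independently, so they may coincide.
    first = rng.choice(page_ids)
    second = rng.choice(page_ids)
    # The smaller timestamp belongs to the page that was used longer ago.
    return first if last_access[first] <= last_access[second] else second
```

Compared with strict LRU, this approach needs no globally ordered structure: it only compares two sampled pages per eviction.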

The worker cache and client cache have separate properties to define their respective evictor. For example, the following configuration in alluxio-site.properties sets LRUCacheEvictor for both worker and client-side caches.

alluxio.worker.page.store.evictor.class=alluxio.client.file.cache.evictor.LRUCacheEvictor
alluxio.user.client.cache.evictor.class=alluxio.client.file.cache.evictor.LRUCacheEvictor

Background Asynchronous Evicting

Alluxio supports setting different constraints on the cache space that will trigger an eviction:

  1. Setting a limit on the total size that the pages can occupy, i.e. the capacity of the page store;
  2. Setting a limit on the total number of pages in the page store.

On every page put operation, Alluxio checks whether either constraint is violated. If so, a synchronous eviction takes place to make room for the incoming page. However, synchronous eviction on the write path can degrade performance dramatically. Background asynchronous eviction evicts cached data ahead of time so that eviction during a write operation can be avoided.

Eviction based on Capacity

To enable background asynchronous eviction based on capacity, add the following properties to alluxio-site.properties:

alluxio.user.client.cache.async.eviction.enabled=true
alluxio.user.client.cache.async.eviction.check.interval=1min
alluxio.user.client.cache.async.eviction.high.water.mark=0.9
alluxio.user.client.cache.async.eviction.low.water.mark=0.8

With the above configuration, Alluxio creates a background thread that checks whether the page cache usage has reached the high water mark. When it has, the thread evicts cached pages until usage drops to the low water mark. The thread repeats this check periodically, at the interval set by the check interval property.
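
The water mark logic can be sketched as a single pass of such a check. This is an illustrative Python model, not Alluxio code: the function name, the OrderedDict of page sizes, and the oldest-first eviction order are assumptions.

```python
from collections import OrderedDict


def run_eviction_check(pages, capacity_bytes, high_water_mark=0.9, low_water_mark=0.8):
    """Run one pass of a hypothetical background eviction check.

    pages: OrderedDict mapping page_id -> size in bytes, oldest first.
    Returns the list of evicted page IDs (empty if nothing was evicted).
    """
    used = sum(pages.values())
    if used < high_water_mark * capacity_bytes:
        return []  # usage has not reached the high water mark
    evicted = []
    # Evict the oldest pages until usage is at or below the low water mark.
    while pages and used > low_water_mark * capacity_bytes:
        page_id, size = pages.popitem(last=False)
        used -= size
        evicted.append(page_id)
    return evicted
```

A real background thread would call a check like this once per check interval, so writers rarely find the cache completely full.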

Eviction based on Limit on Number of Pages

Alluxio asynchronously evicts pages that exceed the limit, using a background thread that periodically scans for excess pages. When the total number of pages exceeds highWatermark * maxPageNumberLimit, eviction is triggered and continues until the total number of pages drops below lowWatermark * maxPageNumberLimit.

To enable async eviction by page number limit, set alluxio.worker.page.store.max.page.number.limit.enabled to true and specify the maximum number of pages with alluxio.worker.page.store.max.page.number, alongside the async eviction properties:

alluxio.user.client.cache.async.eviction.enabled=true
alluxio.user.client.cache.async.eviction.check.interval=1min
alluxio.user.client.cache.async.eviction.low.water.mark=0.6
alluxio.user.client.cache.async.eviction.high.water.mark=0.8

alluxio.worker.page.store.max.page.number.limit.enabled=true
alluxio.worker.page.store.max.page.number=100000
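
As a worked example of the thresholds these values imply, the short Python sketch below applies the highWatermark * maxPageNumberLimit and lowWatermark * maxPageNumberLimit formulas to the configuration above. The helper name and the rounding to whole pages are assumptions for illustration.

```python
def page_count_thresholds(max_page_number, high_water_mark, low_water_mark):
    """Return (trigger, target) page counts for async eviction by page number."""
    if not 0 < low_water_mark <= high_water_mark <= 1:
        raise ValueError("expected 0 < low_water_mark <= high_water_mark <= 1")
    # Rounding to whole pages is an assumption of this sketch.
    trigger = round(high_water_mark * max_page_number)
    target = round(low_water_mark * max_page_number)
    return trigger, target


trigger, target = page_count_thresholds(100000, 0.8, 0.6)
print(trigger, target)  # prints "80000 60000"
```

With these settings, eviction starts once the page count exceeds 80,000 and continues until it drops below 60,000.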

REST API for Updating Configurations Dynamically

Alluxio provides the following REST APIs for users to set and get async eviction configurations dynamically:

  • Enable async eviction and update the related parameters
curl --location --request POST 'localhost:28080/v1/cache?cmd=enableCacheAsyncEviction&chacheEvictionCheckInterval=30&highWaterMark=0.8&lowWaterMark=0.5'
  • Disable async eviction
curl --location --request POST 'localhost:28080/v1/cache?cmd=disableCacheAsyncEviction'
  • Get the current async eviction parameters
curl --location 'localhost:28080/v1/cache?cmd=getPageCacheAsyncEvictionManagerInfo'
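
For scripted use, the same requests can be issued from Python's standard library. This is a sketch under assumptions: the helper names are hypothetical, the host, port, command names, and query parameter names are copied verbatim from the curl commands above, and the response is returned as raw text because its format is not described here.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Host and port taken from the curl examples; adjust for your deployment.
BASE_URL = "http://localhost:28080/v1/cache"


def build_cache_url(cmd, **params):
    """Build the request URL for a cache command with optional query parameters."""
    query = {"cmd": cmd, **params}
    return f"{BASE_URL}?{urlencode(query)}"


def enable_async_eviction(check_interval, high_water_mark, low_water_mark):
    """POST the enable command; parameter names mirror the curl example."""
    url = build_cache_url(
        "enableCacheAsyncEviction",
        chacheEvictionCheckInterval=check_interval,
        highWaterMark=high_water_mark,
        lowWaterMark=low_water_mark,
    )
    with urlopen(Request(url, method="POST")) as response:
        return response.read().decode("utf-8")


def get_async_eviction_info():
    """GET the current async eviction parameters as raw response text."""
    with urlopen(build_cache_url("getPageCacheAsyncEvictionManagerInfo")) as response:
        return response.read().decode("utf-8")
```

Calling enable_async_eviction(30, 0.8, 0.5) sends the same request as the first curl command.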