HybridCache - tags and invalidation · Issue #55308 · dotnet/aspnetcore
Background and Motivation
This is part of Epic: IDistributedCache updates in .NET 9
The first wave (preview 4) delivers the basic infrastructure for multi-tier caching based on IMemoryCache (L1) and IDistributedCache (L2), but it is intended to add 3 additional features that require L2 support:
- active key invalidation
- active tag invalidation
- tag metadata lookup (for cold-start invalidation)
These features must be optional and implemented in a way that does not place fundamentally new demands on IDistributedCache; for example, the existing "set" API simply takes an opaque string key and BLOB value. In particular, because we're talking about out-of-process data, it is not possible to use IChangeToken for this purpose (although a consuming layer could choose to use IChangeToken locally as part of responding to the themes of this proposal).
Active key invalidation
Right now, invalidation occurs passively via time/expiration, or as write-thru side-effects from operations on the same node; in a multi-node scenario this is insufficient and does not account for subsequent update/delete by other nodes, which can lead to inconsistent L1 cache state.
To remedy this, the following event is proposed on a new interface:
namespace Microsoft.Extensions.Caching.Distributed;
public delegate void DistributedCacheKeyInvalidation(string key, ReadOnlySpan<byte> header);
public interface IDistributedCacheInvalidation : IDistributedCache
{
int HeaderBytes { get; set; }
event DistributedCacheKeyInvalidation KeyInvalidated;
// addition here from 2/3; will be combined in summary
}
The idea behind this API is that L2 backends may, through some mechanism (queuing, pub/sub, etc), broadcast key updates. When a node performs writes (Set[Async](...)/Remove[Async](...)), it should (via that implementation-specific mechanism) also publish a global invalidation entry. In the case of Set[Async], the first HeaderBytes bytes of the payload may optionally also be published. This invalidation mechanism will be used to invoke KeyInvalidated at arbitrary times, for out-of-band invalidation notification. Emphasis: HeaderBytes is set by the consumer (HybridCache, etc), and indicates "if you're publishing: publish this much of the payload". The value may be zero, and/or the implementation may choose to ignore this and not include any payload metadata, just publishing the keys.
Note that the use of ReadOnlySpan<byte> precludes the use of Action<string, ...>; this use was intentional, making the lifetime semantics of header very clear - if it is needed beyond right now, the consumer must copy the value to somewhere they control. The name DistributedCacheKeyInvalidation is perhaps questionable.
The intent here is that HybridCache (or other consumers) can subscribe to KeyInvalidated, and respond accordingly. The key here matches the same string as described by string key in Set[Async] etc, noting that if the L2 has some configured namespace prefix, the L2 implementation is responsible for removing that prefix again, such that the key in KeyInvalidated is the original key transmitted.
The purpose of the header is to help avoid trivial removals. If the header received is empty, the invocation should be treated as a blind "delete", causing L1 removal. This is a fair default, but it is not assumed that implementations can automatically detect and avoid same-connection notifications, which means we must anticipate and avoid:
- (normal) node X updates key ZZZ to value ABC in L1 and L2
- (normal) the invalidation event {ZZZ, ABC} is published
- (normal) the KeyInvalidated event on node X gets invoked with {ZZZ, ABC} for the same thing we just caused
- (problem) node X removes L1 entry ZZZ (unnecessarily)
- (problem) node X now gets cache miss on ZZZ and hits L2 again (unnecessarily)
The header allows us to avoid these last two steps; the implementation of this is consumer-dependent, but in the case of HybridCache, the payload sent to L2 will include a payload header that includes the creation timestamp and a disambiguation qualifier (which, along with the creation timestamp, essentially work like an ETag); by parsing these (which will be in the first few bytes):
- if no corresponding L1 entry is found: there is nothing to do
- if there is no header, or any unexpected header size: blind delete from L1
- if the incoming creation timestamp is less than the L1 entry: do nothing (our data is considered fresher)
- if the incoming creation timestamp and disambiguation qualifier both match the L1 entry: do nothing (we're being told about our own update)
- otherwise: delete from L1
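The decision rules above can be sketched roughly as follows. This is an illustration only: the proposal does not fix a wire format, so the 16-byte layout (8-byte little-endian creation timestamp followed by an 8-byte qualifier) and the names L1EntryInfo, InvalidationAction and KeyInvalidationPolicy are all assumptions made for the example.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical view of the metadata kept alongside an L1 entry.
public readonly record struct L1EntryInfo(long CreationTicks, long Qualifier);

public enum InvalidationAction { None, RemoveFromL1 }

public static class KeyInvalidationPolicy
{
    // Assumed layout: 8 bytes creation timestamp + 8 bytes disambiguation qualifier.
    public const int ExpectedHeaderBytes = 16;

    public static InvalidationAction Decide(L1EntryInfo? local, ReadOnlySpan<byte> header)
    {
        // No corresponding L1 entry: nothing to do.
        if (local is null) return InvalidationAction.None;

        // No header, or an unexpected header size: blind delete.
        if (header.Length != ExpectedHeaderBytes) return InvalidationAction.RemoveFromL1;

        long created = BinaryPrimitives.ReadInt64LittleEndian(header);
        long qualifier = BinaryPrimitives.ReadInt64LittleEndian(header.Slice(8));

        // Incoming entry is older than ours: our data is considered fresher.
        if (created < local.Value.CreationTicks) return InvalidationAction.None;

        // Timestamp and qualifier both match: we are being told about our own update.
        if (created == local.Value.CreationTicks && qualifier == local.Value.Qualifier)
            return InvalidationAction.None;

        // Otherwise: another node wrote something newer or different.
        return InvalidationAction.RemoveFromL1;
    }
}
```

A handler attached to KeyInvalidated could look up the L1 metadata for the key, call Decide, and remove the entry only when RemoveFromL1 is returned; because the header is parsed synchronously, nothing needs to be copied out of the span.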
This provides a mechanism to communicate L2 invalidation to L1, and respond without causing self-invalidation, within the constraints of the data available to IDistributedCache.
Tagging
Tagging is a new concept being introduced into HybridCache that does not historically exist in IDistributedCache. To achieve this, the L2 tag metadata will also be stored as part of the header (although not typically in the bytes published for KeyInvalidated).
It is assumed that IDistributedCache cannot reliably implement cascading delete at the backend - this is simply not a feature in many key/value stores, and while it can be hacked in, it is usually unsatisfying and requires significant additional overhead. We want to avoid this complexity in the backend.
Consequently, HybridCache must implement this internally, by maintaining a lookup of each tag to the last known invalidation date. For example, we might have (using numbers instead of dates here for simplicity):
- tag "north"; invalidation date: 513
- tag "offers"; invalidation date: 400
- tag "east"; invalidation date: 234
When loading an entry from L1 or L2, if that entry has tags we must compare the creation date of the cache entry (again, from the payload header) to the dates in each of the tags; if any tag has an invalidation date greater than the cache entry's creation date, it is considered logically expired (it can also be removed from L1/L2 accordingly). For example:
- cache entry ZZZ, creation date 450 tagged "north" and "offers" is considered expired because "north" was invalidated at time 513
- cache entry YYY, creation date 450 tagged "east" and "offers" is considered valid
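The comparison described above can be sketched as a small helper. As in the example, plain numbers stand in for dates; the type and method names (TagExpiry, IsLogicallyExpired) are hypothetical and not part of the proposal.

```csharp
using System;
using System.Collections.Generic;

public static class TagExpiry
{
    // Returns true if any of the entry's tags was invalidated after the entry was created.
    // Tags with no recorded invalidation are treated as never invalidated.
    public static bool IsLogicallyExpired(
        long entryCreated,
        IEnumerable<string> entryTags,
        IReadOnlyDictionary<string, long> tagInvalidations)
    {
        foreach (var tag in entryTags)
        {
            if (tagInvalidations.TryGetValue(tag, out long invalidatedAt)
                && invalidatedAt > entryCreated)
            {
                return true;
            }
        }
        return false;
    }
}
```

With the lookup shown earlier (north: 513, offers: 400, east: 234), an entry created at 450 and tagged "north" and "offers" is reported as expired, while one created at 450 and tagged "east" and "offers" is not.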
To support this, we still need some additional backend capabilities:
- some mechanism to publish tag invalidations, similar to KeyInvalidated
- some mechanism to lookup tag invalidation data from cold-start
For this, we propose:
namespace Microsoft.Extensions.Caching.Distributed;
public interface IDistributedCacheInvalidation : IDistributedCache
{
// (not shown; from 1)
event Action<string, DateTimeOffset> TagInvalidated;
Task RemoveByTagAsync(string tag, CancellationToken token = default);
// API to bulk-query tag eviction metadata
Task<KeyValuePair<string, DateTimeOffset>[]> GetTagsAsync(DateTimeOffset since = default, CancellationToken token = default);
// alternative single-tag metadata query API
Task<DateTimeOffset?> GetTagAsync(string tag, CancellationToken token = default);
}
At cold-start, the library can use GetTagsAsync to pre-populate the tag lookup with some reasonable time bound, and can respond to TagInvalidated to update this data (forwards-only) as needed. The choice of array here is intentional, as it is assumed the caller will be constructing their own lookup by iterating the data, hence "simple" is reasonable. This could arguably be IAsyncEnumerable<>, etc, but it is only used for cold-start population of the tag metadata, so array overhead is not burdensome.
To invalidate a specific tag, we call RemoveByTagAsync, which would update the data used by GetTagsAsync and also indirectly cause TagInvalidated to be invoked by all clients. Note that unlike write-thru invalidation, tag invalidation doesn't have the problem of invalidating our own data, as it is not entry-specific.
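One possible shape for the consumer-side tag lookup is sketched below, assuming the snapshot array comes from a GetTagsAsync-style call and later updates arrive via TagInvalidated. The class name TagInvalidationLookup and its members are hypothetical; the only behaviour taken from the text is that updates are applied forwards-only.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public sealed class TagInvalidationLookup
{
    private readonly ConcurrentDictionary<string, DateTimeOffset> _tags = new();

    // Cold start: seed the lookup from a snapshot array (e.g. the result of GetTagsAsync).
    public void Seed(KeyValuePair<string, DateTimeOffset>[] snapshot)
    {
        foreach (var pair in snapshot)
        {
            OnTagInvalidated(pair.Key, pair.Value);
        }
    }

    // Matches Action<string, DateTimeOffset>, so it could be attached to TagInvalidated.
    // Forwards-only: an older timestamp never replaces a newer one.
    public void OnTagInvalidated(string tag, DateTimeOffset when)
    {
        _tags.AddOrUpdate(tag, when, (_, existing) => when > existing ? when : existing);
    }

    // Returns the last known invalidation time for the tag, or null if none is recorded.
    public DateTimeOffset? GetInvalidation(string tag)
        => _tags.TryGetValue(tag, out var when) ? when : (DateTimeOffset?)null;
}
```

Routing the seed through the same forwards-only update means a TagInvalidated notification that races with the cold-start query cannot be overwritten by older snapshot data.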
Combining these two halves, we get the API proposal:
namespace Microsoft.Extensions.Caching.Distributed;
public delegate void DistributedCacheKeyInvalidation(string key, ReadOnlySpan<byte> header);
public interface IDistributedCacheInvalidation : IDistributedCache
{
int HeaderBytes { get; set; }
event DistributedCacheKeyInvalidation KeyInvalidated;
event Action<string, DateTimeOffset> TagInvalidated;
Task RemoveByTagAsync(string tag, CancellationToken token = default);
Task<KeyValuePair<string, DateTimeOffset>[]> GetTagsAsync(DateTimeOffset since = default, CancellationToken token = default);
Task<DateTimeOffset?> GetTagAsync(string tag, CancellationToken token = default);
}
Example implementation
Redis:
(without any special server features)
- subscribe to two channels, __MSFT_DC__KeyInvalidation and __MSFT_DC__TagInvalidation
- treat __MSFT_DC_Tags as a sorted-set
- key delete is UNLINK {key} plus PUBLISH __MSFT_DC__KeyInvalidation {key}
- key write is SET {key} {value} EX {ttl} plus PUBLISH __MSFT_DC__KeyInvalidation {key+header}
- tag invalidation is ZADD __MSFT_DC_Tags GT {time} {tag} plus PUBLISH __MSFT_DC__TagInvalidation {tag+time} (on servers before 6.2, do not use GT)
- get tags is ZRANGE __MSFT_DC_Tags {since} +inf BYSCORE WITHSCORES
- get tag is ZSCORE __MSFT_DC_Tags {tag}
The implementation may also choose to use ZREMRANGEBYSCORE __MSFT_DC_Tags -inf {culltime} periodically, for some culltime that represents the largest possible expiration; this allows long-dead tags to be forgotten.
The channel names and the tags sorted-set name should include any namespace partition configured, just like keys. The published tags/keys do not need to include the partition.
It is also possible to use server-assisted client-side caching or keyspace notifications, which may be considered in due course. Initially, active invalidation (i.e. where our code explicitly causes the pub/sub events) is described for simplicity, since this does not require server-side feature configuration (which is required for keyspace notifications) or an up-level server (server-assisted client-side caching requires server version 6 and client library support).
Alternative designs
The "output caching" feature is comparable in terms of supporting L2 tagging (without notification); because the concept of tags was baked into the original API, it is implemented in the backend - in SQL via relational DB semantics, and in Redis by using a SADD {tag} {key} such that tag is the redis "set" consisting of all keys associated with that tag; deleting a tag means enumerating the "set" and calling UNLINK per key, and also requires complicated periodic garbage collection to remove expired keys from each set - and to do that, we need an additional set which is the set of all known tags. The solution proposed here is much simpler to implement, and fits within the current API. So much so that when HybridCache has full tag support, I wonder if it is worth exploring a mechanism to implement IOutputCacheBufferStore on top of HybridCache.