[API Proposal]: Cache Synchronization for Hybrid Caching in Multi-Node Environments
Background and motivation
In a multi-replica environment utilizing hybrid caching (in-memory and out-of-process), cache desynchronization between nodes can occur because there is no built-in mechanism to synchronize in-memory caches across nodes behind a load balancer. This results in inconsistent cache states, reducing the reliability of the system.
This proposal addresses the problem by introducing an event-driven mechanism to ensure cache synchronization across nodes.
Problem Context
Hybrid caching involves two main components:
- Out-of-process cache: This ensures a single source of truth, making cache invalidation simple and effective across nodes.
- In-memory cache: While useful for quick access, it poses challenges in multi-node environments due to the lack of cross-node communication when caches are reset.
When a cache is reset in one node, other nodes do not get notified, leading to cache desynchronization across the system.
Problem Statement
The current hybrid caching model does not offer a built-in mechanism to notify all nodes about an in-memory cache reset, resulting in inconsistent cache states between nodes in a multi-node environment.
API Proposal
Proposed Solution
Overview
Introduce a Publisher-Subscriber model using webhooks, event queues, or other notification mechanisms to propagate cache reset events to all nodes using the hybrid cache. This model will allow one node (the Publisher) to notify other nodes (Subscribers) when a cache reset happens, ensuring synchronization of the in-memory cache across all nodes.
Key Features
- Webhook-based/Callback mechanism: Each node registers as a Subscriber to receive notifications when cache resets happen. The node initiating the reset acts as the Publisher.
- Retry strategy: In case of a failure in notifying a node, the system retries the notification, ensuring robustness in cache synchronization.
- Multi-provider support: While webhooks are the default, the design allows support for other messaging systems like event queues, SignalR, etc.
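To make the Publisher/Subscriber and retry ideas concrete, here is a minimal sketch. Every name in it (`ICacheResetSubscriber`, `DelegateSubscriber`, `CacheResetPublisher`) is hypothetical and not part of any existing HybridCache API; the transport is hidden behind an interface so that a webhook, queue, or SignalR provider could plug in. The retry policy (fixed attempt count with exponential backoff) is an assumption, not something the proposal specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical transport abstraction: a webhook, event queue, or SignalR
// provider would each implement this interface.
public interface ICacheResetSubscriber
{
    Task NotifyAsync(string cacheKey);
}

// Convenience subscriber that forwards to a delegate (useful for demos and tests).
public sealed class DelegateSubscriber : ICacheResetSubscriber
{
    private readonly Func<string, Task> _handler;
    public DelegateSubscriber(Func<string, Task> handler) => _handler = handler;
    public Task NotifyAsync(string cacheKey) => _handler(cacheKey);
}

// Hypothetical publisher: fans a reset out to every subscriber and retries failures.
// Not thread-safe; registration is assumed to happen once at startup.
public sealed class CacheResetPublisher
{
    private readonly List<ICacheResetSubscriber> _subscribers = new();
    private readonly int _maxAttempts;
    private readonly TimeSpan _baseDelay;

    public CacheResetPublisher(int maxAttempts, TimeSpan baseDelay)
    {
        _maxAttempts = maxAttempts;
        _baseDelay = baseDelay;
    }

    public void Register(ICacheResetSubscriber subscriber) => _subscribers.Add(subscriber);

    // Returns how many subscribers could not be reached after all attempts.
    public async Task<int> NotifyAllAsync(string cacheKey)
    {
        int failed = 0;
        foreach (var subscriber in _subscribers)
        {
            if (!await TryNotifyAsync(subscriber, cacheKey)) failed++;
        }
        return failed;
    }

    private async Task<bool> TryNotifyAsync(ICacheResetSubscriber subscriber, string cacheKey)
    {
        for (int attempt = 1; attempt <= _maxAttempts; attempt++)
        {
            try
            {
                await subscriber.NotifyAsync(cacheKey);
                return true;
            }
            catch (Exception)
            {
                if (attempt == _maxAttempts) return false; // give up on this node
                // Exponential backoff: baseDelay, then 2x, 4x, ...
                await Task.Delay(_baseDelay * Math.Pow(2, attempt - 1));
            }
        }
        return false;
    }
}
```

A node that stays unreachable after the last attempt is only counted here; a real provider would have to decide what happens next, for example falling back to entry expiration.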
API Changes
- Add a CacheResetNotification class that encapsulates the logic for broadcasting cache reset events to other nodes:
public class CacheResetNotification
{
    public void NotifyAllSubscribers(string cacheKey);
    public void RegisterSubscriber(IList<Uri> subscriberUris);
    public void UnregisterSubscriber(IList<Uri> subscriberUris);
}
API Usage
services.AddHybridCache(options =>
{
options.UsePublisherSubscriberModel()
           .AddWebhookSubscriber(new List<Uri> { new Uri("https://node1/reset") });
});
Alternative Designs
Polling Mechanism: This was dismissed due to inefficiency and increased load on the nodes.
Risks
We need to consider network flooding or excessive requests between nodes, so we could define a sync period between nodes, a key-count threshold, or something similar.
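One way to picture the sync period and key-count threshold is a small batcher that coalesces invalidations before they go out. This is only a sketch: `InvalidationBatcher` and its members are hypothetical names, and the `send` callback stands in for whatever transport ends up being used. A timer owned by the caller would invoke `Flush()` once per sync period.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: collects keys to invalidate and sends them in batches,
// either when the key-count threshold is reached or when Flush() is called
// (for example from a timer that fires once per sync period).
public sealed class InvalidationBatcher
{
    private readonly object _gate = new();
    private readonly HashSet<string> _pending = new();
    private readonly int _keyThreshold;
    private readonly Action<IReadOnlyCollection<string>> _send;

    public InvalidationBatcher(int keyThreshold, Action<IReadOnlyCollection<string>> send)
    {
        _keyThreshold = keyThreshold;
        _send = send;
    }

    public void Enqueue(string key)
    {
        string[] batch = Array.Empty<string>();
        lock (_gate)
        {
            _pending.Add(key); // duplicate keys collapse into one entry
            if (_pending.Count >= _keyThreshold) batch = Drain();
        }
        if (batch.Length > 0) _send(batch); // send outside the lock
    }

    public void Flush()
    {
        string[] batch;
        lock (_gate) { batch = Drain(); }
        if (batch.Length > 0) _send(batch);
    }

    // Must be called while holding _gate.
    private string[] Drain()
    {
        var batch = new string[_pending.Count];
        _pending.CopyTo(batch);
        _pending.Clear();
        return batch;
    }
}
```

With a threshold of 3, enqueueing the same key twice plus one other key sends nothing yet; the third distinct key triggers a single message carrying all three.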
This would be a nice feature, but I'd prefer to use something like Redis pub/sub directly assuming that's what we are using for the L2 cache.
But that would couple everyone to Redis.
I mean we could have it as an optional provider, but we need some default with no extra setup; that's why I suggested webhooks.
We could have multiple providers, such as but not limited to: 1) Redis pub/sub, 2) event buses, 3) a SignalR hub.
Hello @mgravell,
could you please give us your invaluable input on this? To be honest, I'm eager to help.
@IbrahimMNada I think @rsalus meant the messaging system here, which could be done in multiple ways: as an additional cache record type in permanent storage that is read as messages, as a real messaging pattern, or as a message queue pattern. There are a lot of ways it can be approached.
Personally, I don't think using webhooks is a good approach here.
@bielu I agree with you that webhooks might not be optimal here, but I think we need some royalty-free approach, one that is independent of any third-party libraries.
Of course we will have add-ons, for example Redis pub/sub, but we need some default for those who are not using any Redis/message queue in their applications.
And going to shared storage every time we want to read from the in-memory cache negates the value of having an in-memory cache.
So we might do a simple file-system-based message broker that notifies other nodes of a cache entry that needs to be evicted.
Re: "and by going to a shared storage every time we wanted to read from the in memory cache this negates the value of having in memory cache": that is not what I suggested. I suggest having a new value holding sync instructions, which is checked on a background thread, not reading from the source every time. So the message broker is built by reusing the source cache.
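A rough sketch of that idea, under assumptions: the shared cache is modelled as an append-only log of sync instructions with increasing sequence numbers, and each node remembers the last sequence it applied. All names (`SyncInstruction`, `ISyncInstructionStore`, `InMemorySyncInstructionStore`, `LocalCacheSynchronizer`) are hypothetical, and the in-memory store only stands in for the real out-of-process cache. Normal reads never touch the store; only the background loop does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public readonly record struct SyncInstruction(long Sequence, string Key);

// Hypothetical view over the shared (L2) cache holding the sync instructions.
public interface ISyncInstructionStore
{
    // Returns instructions with Sequence greater than afterSequence, in order.
    IReadOnlyList<SyncInstruction> ReadAfter(long afterSequence);
}

// In-memory stand-in for the shared store, for illustration only.
public sealed class InMemorySyncInstructionStore : ISyncInstructionStore
{
    private readonly object _gate = new();
    private readonly List<SyncInstruction> _log = new();

    public void Append(string key)
    {
        lock (_gate) { _log.Add(new SyncInstruction(_log.Count + 1, key)); }
    }

    public IReadOnlyList<SyncInstruction> ReadAfter(long afterSequence)
    {
        lock (_gate) { return _log.Where(i => i.Sequence > afterSequence).ToList(); }
    }
}

// Runs on each node: periodically applies new instructions to the local cache.
public sealed class LocalCacheSynchronizer
{
    private readonly ISyncInstructionStore _store;
    private readonly Action<string> _evictLocal;
    private long _lastSeen;

    public LocalCacheSynchronizer(ISyncInstructionStore store, Action<string> evictLocal)
    {
        _store = store;
        _evictLocal = evictLocal;
    }

    // Applies everything not yet seen; returns the number of evictions performed.
    public int PollOnce()
    {
        int applied = 0;
        foreach (var instruction in _store.ReadAfter(_lastSeen))
        {
            _evictLocal(instruction.Key);
            _lastSeen = instruction.Sequence;
            applied++;
        }
        return applied;
    }

    // Background loop; ends with OperationCanceledException when the token is cancelled.
    public async Task RunAsync(TimeSpan interval, CancellationToken cancellationToken)
    {
        using var timer = new PeriodicTimer(interval);
        while (await timer.WaitForNextTickAsync(cancellationToken))
        {
            PollOnce();
        }
    }
}
```

The trade-off is staleness bounded by the polling interval, in exchange for needing no infrastructure beyond the cache that is already there.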
pub/sub is pretty darn efficient and is the standard solution for a variety of implementations. see FusionCache for example.
I doubt a file-system based broker would be feasible given the limitations imposed by IO. webhook might be doable but would be rather inefficient (not to mention the security implications).
marc had this to say in another thread.
Okay, let's agree on something here:
Regardless of the cache invalidation method you choose, it should be decoupled from the hybrid cache itself, and it should be extensible, as you guys always do.
So I suggest, for starters, that we introduce events to notify any concerned parties.
What we could do is introduce a new property in HybridCacheOptions called Events, which contains three properties:
OnGet => Action/Func<> that is passed (Key, CacheItem<Object>)
OnSet => Action/Func<> that is passed (Key, CacheItem<Object>)
OnDelete => Action/Func<> that is passed (Key)
Then in Startup.cs the consumer can do whatever they like: send an integration event, use pub/sub, raise a MediatR notification; the options are endless.
Based on this we could easily build extensions to handle various providers.
If you think this is a good place to start, please tell me; I'm ready to implement it.
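To illustrate the shape of that suggestion, here is a minimal stand-alone sketch. It does not touch the real HybridCacheOptions; `CacheEvents` and `NotifyingCache` are hypothetical types, and a plain `object` stands in for the `CacheItem<Object>` mentioned above. The callbacks default to no-ops so consumers only wire up what they need.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical events bag, mirroring the proposed OnGet / OnSet / OnDelete.
public sealed class CacheEvents
{
    public Action<string, object> OnGet { get; set; } = (_, _) => { };
    public Action<string, object> OnSet { get; set; } = (_, _) => { };
    public Action<string> OnDelete { get; set; } = _ => { };
}

// Hypothetical cache wrapper that raises the events; the cache itself knows
// nothing about how (or whether) other nodes get notified.
public sealed class NotifyingCache
{
    private readonly ConcurrentDictionary<string, object> _items = new();
    private readonly CacheEvents _events;

    public NotifyingCache(CacheEvents events) => _events = events;

    public void Set(string key, object value)
    {
        _items[key] = value;
        _events.OnSet(key, value);
    }

    public bool TryGet(string key, out object value)
    {
        if (!_items.TryGetValue(key, out value)) return false;
        _events.OnGet(key, value);
        return true;
    }

    public bool Remove(string key)
    {
        if (!_items.TryRemove(key, out _)) return false;
        _events.OnDelete(key); // raised only when something was actually removed
        return true;
    }
}
```

A consumer would assign, say, `OnDelete = key => bus.Publish(key)` at startup, where `bus` is whatever integration-event, pub/sub, or MediatR mechanism the application already uses.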
FusionCache creator here: can confirm, works like a charm, but as already said it's (most probably) the same approach that will be used with HybridCache itself.
Also, just to be clear: FusionCache is implementation agnostic, even for the backplane. Currently it's the main one, but others can be created based on different technologies.
Hope this helps.
I would argue that adding something similar to HybridCacheEntryOptions.Flags to the HybridCache.RemoveAsync parameters, so that users can clear only the local cache, would be a great first step: it enables user-provided workarounds while the actual API is being figured out.
Currently I don't see an option to invalidate the local cache only (other than trying some dirty hack with Set).
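As a sketch of what a local-only removal could mean, here is a toy two-tier cache with a flags parameter on removal. `CacheRemoveFlags` and `TwoTierCache` are hypothetical and are not the real HybridCache types; a shared dictionary merely stands in for the out-of-process cache.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical flags, loosely modelled on the idea of HybridCacheEntryOptions.Flags.
[Flags]
public enum CacheRemoveFlags
{
    None = 0,
    SkipLocal = 1,       // leave this node's in-memory entry alone
    SkipDistributed = 2, // leave the shared (L2) entry alone
}

// Toy two-tier cache: one local dictionary per node, one shared dictionary.
public sealed class TwoTierCache
{
    private readonly Dictionary<string, string> _local = new();
    private readonly IDictionary<string, string> _distributed;

    public TwoTierCache(IDictionary<string, string> distributed) => _distributed = distributed;

    public void Set(string key, string value)
    {
        _local[key] = value;
        _distributed[key] = value;
    }

    public bool HasLocal(string key) => _local.ContainsKey(key);

    public void Remove(string key, CacheRemoveFlags flags = CacheRemoveFlags.None)
    {
        if (!flags.HasFlag(CacheRemoveFlags.SkipLocal)) _local.Remove(key);
        if (!flags.HasFlag(CacheRemoveFlags.SkipDistributed)) _distributed.Remove(key);
    }
}
```

With something like this, a node that receives an invalidation message through a user-built channel could call `Remove(key, CacheRemoveFlags.SkipDistributed)` to drop only its own in-memory copy.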
Working on the plan for this, hopefully for .NET 10.
All the best!
If you find anything worth labeling help-wanted, I would be delighted to help.
Edit: Removed my polemic wording
Any progress on this?
For the record, I do not endorse the above statement, at least apart from the technical question.
Is this plan dropped?
If you need the feature today you can use FusionCache (creator here). You can use it either directly or as an impl of HybridCache (see here), so there will be no direct dependencies in your code. The multi-node notification feature is called Backplane, and it works even when FusionCache is used as a HybridCache impl, so it's totally transparent.
Hope this helps.
NB: to be clear I'm not from MS and know nothing about their plans, I just wanted to help.