Distributed storage (and other features) for direct get and set
We use sturdyc in many different scenarios, including manual upserts to the cache from our own logic. That works fine, but by doing so we completely miss out on some features, including distributed storage. I don't fully understand the reasoning behind this, and it seems a bit odd: if I update via the fetch functions, the value is propagated, written to distributed storage, and so on - but when I set it by hand, nothing happens.
I would like to request that this logic be considered and extended to the manual setters and getters as well.
I haven't analyzed it in depth yet (sorry - until today I was just using the library while my team was trying to dig deeper), but it looks like Set() writes data directly to the in-memory cache with no calls to the distributed cache, whereas GetOrFetch calls the fairly feature-rich distributedFetch function that can handle the additional calls to distributed storage.
Hi, happy to hear you're finding multiple scenarios to use this library!
Mixing GetOrFetch and GetOrFetchBatch functions with regular Get and Set operations to perform manual upserts is not very straightforward at the moment.
Let me try to illustrate why. Suppose you're using the cache in front of an API that returns shipping routes like this:
```go
type GetShippingRouteOpts struct {
	IncludeAlternativeShippingRoutes bool
}

func (a *API) GetShippingRoute(ctx context.Context, zipCode string, opts GetShippingRouteOpts) (ShippingRoute, error) {
	fetchFn := func(ctx context.Context) (ShippingRoute, error) {
		timeoutCtx, cancel := context.WithTimeout(ctx, time.Second)
		defer cancel()

		var response ShippingRoute
		err := requests.URL(a.baseURL).
			Path("/shipping-routes").
			Param("zip_code", zipCode).
			Param("include_alternative_shipping_routes", strconv.FormatBool(opts.IncludeAlternativeShippingRoutes)).
			ToJSON(&response).
			Fetch(timeoutCtx)
		return response, err
	}
	return a.cache.GetOrFetch(ctx, a.cache.PermutatedKey(zipCode, opts), fetchFn)
}
```
Now imagine calling the API this way:
```go
apiClient.GetShippingRoute(ctx, "12345", GetShippingRouteOpts{IncludeAlternativeShippingRoutes: true})
apiClient.GetShippingRoute(ctx, "12345", GetShippingRouteOpts{IncludeAlternativeShippingRoutes: false})
```
Even though the ID is the same (the ZIP code), the data can vary significantly depending on the options passed (i.e., whether IncludeAlternativeShippingRoutes is set to true or false). As a result, the cache stores the shipping route for that ZIP code once for each permutation:
```
include_alternative_shipping_routes_false_12345
include_alternative_shipping_routes_true_12345
```
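To make the permutation mechanics concrete, here's a rough sketch of how such keys could be derived from an options struct. The field-to-snake-case scheme below is purely illustrative and is not sturdyc's actual PermutatedKey implementation:

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
	"unicode"
)

type GetShippingRouteOpts struct {
	IncludeAlternativeShippingRoutes bool
}

// snake converts a CamelCase field name to snake_case.
func snake(s string) string {
	var b strings.Builder
	for i, r := range s {
		if unicode.IsUpper(r) {
			if i > 0 {
				b.WriteByte('_')
			}
			r = unicode.ToLower(r)
		}
		b.WriteRune(r)
	}
	return b.String()
}

// permutatedKey derives a cache key from an options struct and an ID.
// Illustrative only: sturdyc's real PermutatedKey has its own signature
// and key format.
func permutatedKey(opts any, id string) string {
	v := reflect.ValueOf(opts)
	t := v.Type()
	parts := make([]string, 0, t.NumField()+1)
	for i := 0; i < t.NumField(); i++ {
		parts = append(parts, fmt.Sprintf("%s_%v", snake(t.Field(i).Name), v.Field(i).Interface()))
	}
	parts = append(parts, id)
	return strings.Join(parts, "_")
}

func main() {
	// One key per permutation of the options, even though the ID is the same.
	fmt.Println(permutatedKey(GetShippingRouteOpts{IncludeAlternativeShippingRoutes: true}, "12345"))
	fmt.Println(permutatedKey(GetShippingRouteOpts{IncludeAlternativeShippingRoutes: false}, "12345"))
}
```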
However, let's now say that you receive an event indicating that the shipping route for that ZIP code has changed. In my experience, such events rarely include enough information to describe how the entity would change for every possible combination of options it could have been retrieved with. E.g., you're unlikely to get a payload saying: "Here’s what the shipping route now looks like when IncludeAlternativeShippingRoutes is true, and here’s what it looks like when it's false."
While this example has only two permutations, real-world use cases often involve multiple query parameters - and not just booleans. In such cases, it becomes nearly impossible for the library to guess the full cache key based only on the ID, and as I said earlier, the event would probably lack sufficient data to update every relevant cache entry either way.
So, what should we do when we receive such an event? The only practical solution, in my opinion, is to delete every cache key that contains the ZIP code, ensuring that the data is re-fetched from the original data source the next time GetShippingRoute is called with that particular code. However, this too can be problematic - especially if you’re using a shared sturdyc cache across multiple data sources (or even using multiple sturdyc caches backed by the same distributed storage). In that case, deleting every key containing 12345 might remove unrelated objects that also use numerical IDs.
This would also require an interface for the distributed storage that supports partial key deletion, as well as a new method on the cache. Perhaps something like `UpsertDelete(partialKey string)`, since we want to avoid performing regular deletes with O(N) runtime complexity.
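To sketch the shape of such an interface (the PartialKeyDeleter name and DeletePartial method are hypothetical - nothing like them exists in the library today), along with the O(N) scan cost and the collision risk with numerical IDs:

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// PartialKeyDeleter is a hypothetical extension to the distributed storage
// interface; neither the name nor the method exists in sturdyc today.
type PartialKeyDeleter interface {
	DeletePartial(partialKey string)
}

// memStorage is an in-memory stand-in for a distributed store.
type memStorage struct {
	mu      sync.Mutex
	entries map[string][]byte
}

func newMemStorage() *memStorage {
	return &memStorage{entries: make(map[string][]byte)}
}

func (m *memStorage) Set(key string, value []byte) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.entries[key] = value
}

// DeletePartial removes every key containing the given fragment. Note the
// O(N) scan over all keys - this is exactly the cost a real backend would
// need a server-side primitive to avoid.
func (m *memStorage) DeletePartial(partialKey string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	for key := range m.entries {
		if strings.Contains(key, partialKey) {
			delete(m.entries, key)
		}
	}
}

func main() {
	s := newMemStorage()
	s.Set("include_alternative_shipping_routes_true_12345", []byte("a"))
	s.Set("include_alternative_shipping_routes_false_12345", []byte("b"))
	s.Set("warehouse_stock_123456", []byte("c")) // unrelated entry, but "123456" contains "12345"

	s.DeletePartial("12345")
	// The unrelated warehouse key is gone too - the collision problem described above.
	fmt.Println(len(s.entries)) // 0
}
```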
This overlaps somewhat with issue #43, as achieving more consistent and predictable cache keys would be essential for better support of manual upserts, and scenarios like this should be considered when reworking the logic around cache keys.
We could quite easily update the Get and Set functions to also write to the distributed storage if one is configured, though. I'm just not sure I want to encourage that, because it's easy to make mistakes unless you're familiar with how the library works internally.
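For reference, a write-through manual Set could look roughly like the sketch below. The DistributedStorage interface and writeThroughCache wrapper are hypothetical stand-ins to show the idea, not sturdyc's actual types:

```go
package main

import (
	"fmt"
	"sync"
)

// DistributedStorage mirrors the shape of a remote cache client; this
// interface is illustrative, not sturdyc's actual one.
type DistributedStorage interface {
	Set(key string, value []byte)
	Get(key string) ([]byte, bool)
}

// memDistributed is an in-memory stand-in for a remote store.
type memDistributed struct {
	mu   sync.Mutex
	data map[string][]byte
}

func (d *memDistributed) Set(key string, value []byte) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.data[key] = value
}

func (d *memDistributed) Get(key string) ([]byte, bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	v, ok := d.data[key]
	return v, ok
}

// writeThroughCache shows the idea: a manual Set updates the in-memory
// layer and, when distributed storage is configured, writes there too.
type writeThroughCache struct {
	local       map[string][]byte
	distributed DistributedStorage
}

func (c *writeThroughCache) Set(key string, value []byte) {
	c.local[key] = value
	if c.distributed != nil {
		c.distributed.Set(key, value)
	}
}

func main() {
	ds := &memDistributed{data: make(map[string][]byte)}
	cache := &writeThroughCache{local: make(map[string][]byte), distributed: ds}

	cache.Set("shipping_route_12345", []byte("route"))

	_, inDistributed := ds.Get("shipping_route_12345")
	fmt.Println(inDistributed) // true
}
```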