mongoid-cached-json icon indicating copy to clipboard operation
mongoid-cached-json copied to clipboard

Changes to `json_fields` don't invalidate cached results

Open joeyAghion opened this issue 10 years ago • 7 comments

When we deploy a new version of our codebase that includes changes to a json_fields declaration (e.g., adding a new field), any cached data for that model should be invalidated. Instead, the stale data continues to be returned and a full cache clear is required before the correct fields are returned.

This could take the form of an additional version option that's simply incremented by developers upon making changes that should invalidate cached results. (I'm not sure how to cleanly offer the option without potentially colliding with the model's actual field definitions, but it seems we already do something like this with hide_as_child_json_when). A more sophisticated approach might be to include a hash of the json_fields definition in the cache keys. This would change when the definition changes, automatically invalidating any previously cached data.

joeyAghion avatar Mar 26 '15 13:03 joeyAghion

:+1:

dblock avatar Mar 26 '15 18:03 dblock

It seems to me that the current versioning system allows for this.. do you just not want to increment the version of each item in the json_fields block?

wrgoldstein avatar Apr 07 '15 16:04 wrgoldstein

@wrgoldstein I think you're correct that that would work, but it's just impractical to declare a new version and update all the as_json calls each time the schema changes. I was hoping for something that would "just work" (hopefully without being too complex).

joeyAghion avatar Apr 14 '15 16:04 joeyAghion

@wrgoldstein and I worked on this a little. Automatically "fingerprinting" the json_fields declaration is impractical because of all the lambdas... I don't know of a way to consistently digest their definitions into a string representation. Alternatives include:

  1. Fingerprint just a subset of the definition (e.g., just the keys or the declaration excluding lambdas)
  2. Add an optional version to the whole json_fields declaration (requiring developers to add/increment the version when they intend to invalidate previously-cached values)
  3. Do nothing

Let me know if you think any of these are worthwhile. I'm not sure (1) or (2) are worth the complexity.

joeyAghion avatar Apr 15 '15 18:04 joeyAghion

I vote to do nothing, however this is an interesting problem. The original problem is that we have new code and are serving old data. There're two scenarios:

Clients Must Change

This is a schema change to JSON responses, so I think you want to treat it as a migration, and invalidate the appropriate caches part of the deployment. The more atomic alternative is (2) IMO, (1) seems like a lot of work and hard to debug cache invalidations on deployment.

Clients Roll Change

Clients are OK with old data but we don't want to wait our cache invalidation lifetime (eg. 24 hours). A possible "solution" is to create a strategy by which upon deployment lifetime of objects is decreased, so maybe caches will invalidate over the next 15 minutes instead of 24, but still not at once.

dblock avatar Apr 16 '15 11:04 dblock

This is a great issue.

What I've generally done (and of course always forgotten :smile: ), is to explicitly invalidate the cache for any affected object where I've changed the JSON representation. I agree that this is a lot of work - modifying the json for AdditionalImage means busting the cache for every image!

In terms of your second point, I would be willing to rely on that, however I regularly see objects that persist in the cache for much longer than 24 hours.

For example, I deploy a code changing the JSON schema for a class, yet several days after that deploy, I'm still seeing stale JSON coming thru the cached json. Shouldn't that not happen if the cache was being expired?

I don't see where (in this gem), an expires_in or such type of parameter is being used, and the default Rails.cache is to never expire unless explicitly told so, and to instead rely on normal cache eviction by your caching server.

mzikherman avatar Aug 19 '15 21:08 mzikherman

So I guess my question (hit 'Enter' too early above), is what's the proper way to use cache expiration time here? The shortest cache we can get away with the better, and I have a hunch we're currently (in production) not expiring this cache (we are invalidating however).

I think we can fix our specific issue by modifying our config/memcached.yml to include an expires_in set from an ENV var, for example.

mzikherman avatar Aug 19 '15 21:08 mzikherman