Add response attribute to indicate when content changes
Hi,
if a cached page expires and a new one is fetched, it would be useful to know whether the new content differs from the cached one.
In my view, the from_cache response attribute should be True if the page content has not changed, regardless of whether the cache had expired and a new page was fetched.
What do you think?
I think what you're describing is best handled by conditional requests. For any servers that support it, requests-cache will send a conditional request, and if the remote content hasn't changed, from_cache will still be True because no new data was received from the server. Here's an example: https://requests-cache.readthedocs.io/en/stable/user_guide/headers.html#conditional-requests
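To make the mechanism concrete, here is a minimal sketch of a conditional request at the HTTP level. It uses only the Python standard library and a throwaway local server, not requests-cache itself; the `Handler`, `fetch`, and `demo` names and the way the ETag is derived are assumptions for illustration only.

```python
import hashlib
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"hello"
# Hypothetical ETag scheme for this demo: a quoted, truncated hash of the body
ETAG = '"' + hashlib.sha256(BODY).hexdigest()[:16] + '"'


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # If the client already holds the current version, answer 304 with no body
        if self.headers.get("If-None-Match") == ETAG:
            self.send_response(304)
            self.send_header("ETag", ETAG)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("ETag", ETAG)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(port, etag=None):
    """GET / from the local server, optionally as a conditional request."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    headers = {"If-None-Match": etag} if etag else {}
    conn.request("GET", "/", headers=headers)
    resp = conn.getresponse()
    result = (resp.status, resp.getheader("ETag"), resp.read())
    conn.close()
    return result


def demo():
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        first = fetch(port)                   # plain request: full body
        second = fetch(port, etag=first[1])   # conditional request: validator only
        return first, second
    finally:
        server.shutdown()
        server.server_close()
```

In this sketch the second request carries the validator from the first response, so the server can reply 304 Not Modified with an empty body. That is the situation in which a caching client can keep serving its stored copy.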
Otherwise, from_cache is meant to indicate where a given response object came from, not necessarily what the contents are. Do you have a case where you want to do something only when response content changes?
Another relevant piece of info: in 1.0 (beta), there is now also a CachedResponse.revalidated attribute that indicates if the response was revalidated by a conditional request.
I'll close this issue for now, but let me know if you have any other questions.
> Otherwise, from_cache is meant to indicate where a given response object came from, not necessarily what the contents are. Do you have a case where you want to do something only when response content changes?
Well, working with scrapers, I find it hard to imagine a use case where knowing whether the content has actually changed is irrelevant: if a response's content has not changed since it was last processed, I can skip reprocessing it, and that's a great improvement by itself. An attribute like CachedResponse.has_content_changed would help a lot.
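In the meantime, one possible workaround on the scraper side is to keep a digest of the last content seen per URL and compare against it. The sketch below is not part of requests-cache; `ContentChangeTracker` is a hypothetical helper, its `has_content_changed` method just borrows the attribute name suggested above, and the in-memory dict stands in for whatever persistent storage a real scraper would use.

```python
import hashlib


class ContentChangeTracker:
    """Remembers a SHA-256 digest of the last content seen for each URL."""

    def __init__(self):
        self._digests = {}  # url -> hex digest of the last content seen

    def has_content_changed(self, url: str, content: bytes) -> bool:
        """Return True if content differs from the last content recorded for url.

        A URL seen for the first time counts as changed.
        """
        digest = hashlib.sha256(content).hexdigest()
        changed = self._digests.get(url) != digest
        self._digests[url] = digest
        return changed
```

With a `requests`-style response this could be called as `tracker.has_content_changed(response.url, response.content)`, reprocessing the page only when it returns True. Unlike a conditional request, this also works for servers that don't send validators, at the cost of downloading the full body each time.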
I see. Adding a new attribute would be reasonable. I'll keep this open, then.