Setting max-age on the response causes cachecontrol to fail with a TypeError about the 'expires' argument
Applies to cachecontrol 0.12.7 and probably 0.12.8
When we have a response with the following header: 'Cache-Control': 'public, max-age=1'
I believe this may be related to this PR: https://github.com/ionrock/cachecontrol/pull/233
We get the following exception from cachecontrol:
TypeError: set() got an unexpected keyword argument 'expires'
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/local/lib/python3.9/site-packages/requests/sessions.py:555: in get
return self.request('GET', url, **kwargs)
/usr/local/lib/python3.9/site-packages/requests/sessions.py:542: in request
resp = self.send(prep, **send_kwargs)
/usr/local/lib/python3.9/site-packages/requests/sessions.py:697: in send
r.content
/usr/local/lib/python3.9/site-packages/requests/models.py:836: in content
self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
/usr/local/lib/python3.9/site-packages/requests/models.py:758: in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
/usr/local/lib/python3.9/site-packages/urllib3/response.py:576: in stream
data = self.read(amt=amt, decode_content=decode_content)
/usr/local/lib/python3.9/site-packages/urllib3/response.py:519: in read
data = self._fp.read(amt) if not fp_closed else b""
/usr/local/lib/python3.9/site-packages/cachecontrol/filewrapper.py:96: in read
self._close()
/usr/local/lib/python3.9/site-packages/cachecontrol/filewrapper.py:76: in _close
self.__callback(result)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <cachecontrol.controller.CacheController object at 0x7f30ba331640>
request = <PreparedRequest [GET]>
response = <urllib3.response.HTTPResponse object at 0x7f30ba3179a0>, body = b''
status_codes = None
def cache_response(self, request, response, body=None, status_codes=None):
"""
Algorithm for caching requests.
This assumes a requests Response object.
"""
# From httplib2: Don't cache 206's since we aren't going to
# handle byte range requests
cacheable_status_codes = status_codes or self.cacheable_status_codes
if response.status not in cacheable_status_codes:
logger.debug(
"Status code %s not in %s", response.status, cacheable_status_codes
)
return
response_headers = CaseInsensitiveDict(response.headers)
if 'date' in response_headers:
date = calendar.timegm(
parsedate_tz(response_headers['date'])
)
else:
date = 0
# If we've been given a body, our response has a Content-Length, that
# Content-Length is valid then we can check to see if the body we've
# been given matches the expected size, and if it doesn't we'll just
# skip trying to cache it.
if (
body is not None
and "content-length" in response_headers
and response_headers["content-length"].isdigit()
and int(response_headers["content-length"]) != len(body)
):
return
cc_req = self.parse_cache_control(request.headers)
cc = self.parse_cache_control(response_headers)
cache_url = self.cache_url(request.url)
logger.debug('Updating cache with response from "%s"', cache_url)
# Delete it from the cache if we happen to have it stored there
no_store = False
if "no-store" in cc:
no_store = True
logger.debug('Response header has "no-store"')
if "no-store" in cc_req:
no_store = True
logger.debug('Request header has "no-store"')
if no_store and self.cache.get(cache_url):
logger.debug('Purging existing cache entry to honor "no-store"')
self.cache.delete(cache_url)
if no_store:
return
# https://tools.ietf.org/html/rfc7234#section-4.1:
# A Vary header field-value of "*" always fails to match.
# Storing such a response leads to a deserialization warning
# during cache lookup and is not allowed to ever be served,
# so storing it can be avoided.
if "*" in response_headers.get("vary", ""):
logger.debug('Response header has "Vary: *"')
return
# If we've been given an etag, then keep the response
if self.cache_etags and "etag" in response_headers:
expires_time = 0
if response_headers.get('expires'):
expires = parsedate_tz(response_headers['expires'])
if expires is not None:
expires_time = calendar.timegm(expires) - date
expires_time = max(expires_time, 14 * 86400)
logger.debug('etag object cached for {0} seconds'.format(expires_time))
logger.debug("Caching due to etag")
self.cache.set(
cache_url,
self.serializer.dumps(request, response, body),
expires=expires_time
)
# Add to the cache any permanent redirects. We do this before looking
# that the Date headers.
elif int(response.status) in PERMANENT_REDIRECT_STATUSES:
logger.debug("Caching permanent redirect")
self.cache.set(cache_url, self.serializer.dumps(request, response, b''))
# Add to the cache if the response headers demand it. If there
# is no date header then we can't do anything about expiring
# the cache.
elif "date" in response_headers:
date = calendar.timegm(
parsedate_tz(response_headers['date'])
)
# cache when there is a max-age > 0
if "max-age" in cc and cc["max-age"] > 0:
logger.debug("Caching b/c date exists and max-age > 0")
expires_time = cc['max-age']
> self.cache.set(
cache_url,
self.serializer.dumps(request, response, body),
expires=expires_time
)
E TypeError: set() got an unexpected keyword argument 'expires'
/usr/local/lib/python3.9/site-packages/cachecontrol/controller.py:354: TypeError
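The last frame shows the root cause: the controller now passes a new expires keyword to the backend's set(), so any custom backend still implementing the older signature raises. A minimal sketch of the mismatch, using a toy stand-in class (not the real cachecontrol BaseCache):

```python
# Stand-in for a custom cache backend written against the old
# interface, where set() took only key and value. This is NOT the
# real cachecontrol BaseCache, just an illustration of the mismatch.
class OldStyleCache:
    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value


cache = OldStyleCache()

# Newer cachecontrol calls the backend like this, which explodes on
# any backend still using the old two-argument signature:
try:
    cache.set("http://example.com", b"body", expires=1)
except TypeError as exc:
    print(exc)  # ... got an unexpected keyword argument 'expires'
```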
CC @ionrock. I suspect this one is straightforward, but I'm not sure if there's any other supporting information needed here (other library versions, etc.)?
@svanoort I'm not able to reproduce this in a test with the headers you mentioned. Do you mind sending the actual response headers? Also, are you using any heuristics or a custom cache implementation?
@svanoort what backend are you using?
Using diskcache's FanoutCache here and had to shim it:
from diskcache import FanoutCache as _FanoutCache

class FanoutCache(_FanoutCache):
    def set(self, key, value, **kwargs):
        # diskcache spells the keyword 'expire', not 'expires'
        kwargs['expire'] = kwargs.pop('expires', None)
        return super().set(key, value, **kwargs)
Or you can just pop and drop the keyword so entries never expire in the backend.
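For reference, a sketch of the pop-and-drop variant, written against a toy dict-backed base class rather than diskcache so it stands alone:

```python
class DictCache:
    """Toy stand-in for a backend whose set() predates the 'expires' keyword."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class NoExpiryCache(DictCache):
    def set(self, key, value, **kwargs):
        # Swallow the new keyword so the old backend signature keeps
        # working; entries simply never expire at the backend level.
        kwargs.pop("expires", None)
        return super().set(key, value, **kwargs)
```

cachecontrol still evaluates freshness from the stored response headers on lookup, so dropping the keyword only gives up backend-side eviction.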
I'm testing the new release with pip and hit the same issue; I had to add an expires keyword to pip's custom cache:
https://github.com/itamarst/pip/commit/99a0bcf542c07fa6ccc3d2e198f4d593fec7f1b6
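Both workarounds boil down to the same forward-compatible pattern: give set() an expires parameter with a default, so the backend works with cachecontrol releases before and after the change. A sketch of that pattern (the expiry bookkeeping here is purely illustrative, not pip's actual code):

```python
import time


class CompatCache:
    def __init__(self):
        self._data = {}  # key -> (value, deadline-or-None)

    def set(self, key, value, expires=None):
        # Accepting expires=None keeps this compatible with cachecontrol
        # both before and after the new keyword was introduced. As the
        # traceback above shows, cachecontrol passes a relative lifetime
        # in seconds (e.g. the max-age value) when it passes anything.
        deadline = time.time() + expires if expires else None
        self._data[key] = (value, deadline)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, deadline = entry
        if deadline is not None and time.time() > deadline:
            del self._data[key]
            return None
        return value
```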
I'm having the same issue with PR #233
This PR breaks those of us who use fasteners instead of the deprecated lockfile module, which hasn't been updated since 2015. Both fasteners and oslo.concurrency are its recommended replacements:
Note: This package is deprecated. It is highly preferred that instead of using this code base that instead fasteners or oslo.concurrency is used instead.
However, neither of those accepts an expires= argument in its .lock() method interface.
See also https://github.com/ionrock/cachecontrol/issues/109 which I appear to have opened a few years ago. 😁
An alternative referenced there is https://github.com/tox-dev/py-filelock.git - which also does not support the concept of expiry.
i don't know when this was introduced, but this also causes a regression in feed2exec: https://gitlab.com/anarcat/feed2exec/-/issues/22
basically, the cache backend's set method is now expected to accept an expires parameter, which is new. it's not mentioned in the release notes, for what it's worth, and it doesn't seem like a minor version bump signaled the API change either... 0.12.6 worked fine and 0.12.10 (or earlier?) broke it.
i patched it with: https://gitlab.com/anarcat/feed2exec/-/commit/c5563d093cfd6e86123438cacee9358ac491cbc7.patch
but it would be nice to have a little heads up for things like this. :)
oh, and thanks for implementing this feature, quite nice!