unleash-client-python

please ditch fcache as it does not work with multiprocessing on Windows

Open masariello opened this issue 1 year ago • 3 comments

Describe the bug

fcache has a known issue with multiprocessing (tsroten/fcache#26) that has been ignored for 5 years.

Here's an example of the errors:

2024-03-06 10:54:33.478 - ERROR - apscheduler.executors.unleash_executor_YTIN7P - run_job - Job "fetch_and_load_features (trigger: interval[0:01:00], next run at: 2024-03-06 10:55:32 CST)" raised an exception
Traceback (most recent call last):
  File "C:\Users\userName\AppData\Roaming\Python\Python39\site-packages\apscheduler\executors\base.py", line 125, in run_job
    retval = job.func(*job.args, **job.kwargs)
  File "C:\Users\dchuserNameoi\AppData\Roaming\Python\Python39\site-packages\UnleashClient\periodic_tasks\fetch_and_load.py", line 43, in fetch_and_load_features
    cache.set(ETAG, etag)
  File "C:\Users\userName\AppData\Roaming\Python\Python39\site-packages\UnleashClient\cache.py", line 129, in set
    self._cache[key] = value
  File "C:\Users\userName\AppData\Roaming\Python\Python39\site-packages\fcache\cache.py", line 267, in __setitem__
    self._write_to_file(filename, value)
  File "C:\Users\userName\AppData\Roaming\Python\Python39\site-packages\fcache\cache.py", line 254, in _write_to_file
    os.chmod(filename, self._mode)
OSError: [WinError 6800] The function attempted to use a name that is reserved for use by another transaction: 'C:\\Users\\userName\\AppData\\Local\\PROD\\PROD\\Cache\\cache\\65746167'

To Reproduce

Please see StackOverflow linked in tsroten/fcache#26

masariello Mar 06 '24 10:03

Hey @masariello, this probably won't get resolved in the next few weeks, but we do have tentative plans to rework the caching anyway because of an upcoming project on this SDK.

Is this actually affecting you, beyond the noise in the logs? Writing to the backup shouldn't have to succeed for the SDK to work correctly.

sighphyre Mar 13 '24 14:03

Yes. I hit the problem reproduced in the linked fcache issues

In fact, fcache has numerous issues when running with multiprocessing. It just opens and writes files on disk without any locking or retry logic, so every now and then we do get exceptions caused by those concurrent accesses.

The only way to avoid those issues is to replace fcache with a dict-based BaseCache.
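A minimal sketch of what I mean by a dict-based cache. The class name `InMemoryCache` is made up for illustration, and the method names (`set`, `mset`, `get`, `exists`, `destroy`) are my assumption of what the SDK's `BaseCache` interface expects. In real use the class would subclass `BaseCache` rather than stand alone as it does here.

```python
import threading
from typing import Any, Optional


class InMemoryCache:
    """Hypothetical dict-backed cache: nothing is ever written to disk.

    Assumption: the method names mirror the SDK's BaseCache interface.
    In a real deployment this class would subclass that BaseCache.
    """

    def __init__(self) -> None:
        self._data: dict = {}
        # The SDK fetches features from a scheduler thread, so guard the dict.
        self._lock = threading.Lock()

    def set(self, key: str, value: Any) -> None:
        with self._lock:
            self._data[key] = value

    def mset(self, data: dict) -> None:
        # Store several key/value pairs in one call.
        with self._lock:
            self._data.update(data)

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        with self._lock:
            return self._data.get(key, default)

    def exists(self, key: str) -> bool:
        with self._lock:
            return key in self._data

    def destroy(self) -> None:
        # Drop everything; there are no files to clean up.
        with self._lock:
            self._data.clear()


cache = InMemoryCache()
cache.set("etag", "abc123")
```

Assuming the client constructor accepts a custom cache object, an instance like this would be passed in place of the default file cache. Each process then holds its own copy, so there is no shared file to collide on. The trade-off is that nothing survives a restart, so the on-disk backup is lost.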

masariello Mar 13 '24 21:03

As extra motivation, fcache also isn't compatible with many serverless environments out of the box. For example, in AWS Lambda the home directory is mounted on a read-only file system, and fcache defaults to $HOME/.cache/ as the base path for cache files. This can be worked around by setting XDG_CACHE_HOME to something like /tmp/cache/, but it makes this package slightly more cumbersome to use in a serverless deployment.
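For anyone who needs the workaround in the meantime, here is a rough sketch of setting the variable from Python. It assumes the cache location is resolved when the client is created, so the variable has to be set before that point. The `cache` subdirectory under the system temp directory is just an example path.

```python
import os
import tempfile

# Example location only: a "cache" folder under the writable temp directory
# (this is /tmp/cache on a typical Linux serverless runtime).
cache_home = os.path.join(tempfile.gettempdir(), "cache")
os.makedirs(cache_home, exist_ok=True)

# Must run before the Unleash client (and therefore fcache) is created,
# on the assumption that the cache path is resolved at that point.
os.environ["XDG_CACHE_HOME"] = cache_home
```

Setting the variable in the function's deployment configuration should have the same effect without touching the code.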

calum-pledge-io May 02 '24 10:05