python-diskcache

Readonly cache.db

Open audetto opened this issue 6 years ago • 14 comments

Hi

Is it possible to enforce a read-only cache.db database? I am running with default options, and although reading is not supposed to modify the db, the db file is different at the end of the run (even though I am only reading from the cache).

Regards

audetto avatar Jun 07 '19 18:06 audetto

I don't think that's currently possible but it seems like a good thing to guarantee. I think it's probably the settings that are being updated to their existing values. During Cache initialization, you'll see INSERT OR REPLACE INTO Settings. I'm not sure what other statements make modifications.
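To illustrate why rewriting settings with their existing values can still change the file, here is a minimal sketch using only the standard-library sqlite3 module. The Settings schema and the helper names (file_digest, rewrite_setting) are assumptions for illustration, not diskcache's actual code. SQLite resolves a REPLACE conflict by deleting the old row before inserting the new one, so a logically identical upsert is still a real write transaction.

```python
import hashlib
import os
import sqlite3
import tempfile


def file_digest(path):
    """Return the SHA-256 hex digest of the file's bytes."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def rewrite_setting(path, key, value):
    """Unconditionally upsert one setting (hypothetical stand-in for an initializer)."""
    conn = sqlite3.connect(path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS Settings (key TEXT NOT NULL UNIQUE, value)"
        )
        conn.execute("INSERT OR REPLACE INTO Settings VALUES (?, ?)", (key, value))
        conn.commit()
    finally:
        conn.close()


workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "cache.db")

rewrite_setting(db_path, "size_limit", 2 ** 30)
digest_before = file_digest(db_path)

# Same key, same value: logically a no-op, but still a write transaction.
rewrite_setting(db_path, "size_limit", 2 ** 30)
digest_after = file_digest(db_path)

print("file changed:", digest_before != digest_after)
```

Comparing digests rather than modification times keeps the check independent of filesystem timestamp resolution.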

Can you explain the use case? Is it a general concern or does it block something else?

grantjenks avatar Jun 07 '19 19:06 grantjenks

Here is the error if I make the file readonly

Traceback (most recent call last):
  File "dc.py", line 3, in <module>
    cache = diskcache.Cache('/tmp/peppo')
  File "/tmp/pppp/lib64/python3.7/site-packages/diskcache/core.py", line 432, in __init__
    sql(query, (key, value))
  File "/tmp/pppp/lib64/python3.7/site-packages/diskcache/core.py", line 597, in _execute_with_retry
    return sql(statement, *args, **kwargs)
sqlite3.OperationalError: attempt to write a readonly database
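The same class of error can be reproduced without diskcache, which suggests it comes from SQLite itself rather than from anything cache-specific. The sketch below is only an illustration under assumptions: it uses the standard-library sqlite3 module, a made-up Settings table, and a read-only URI connection instead of file permissions (permission bits are ignored when running as root).

```python
import os
import pathlib
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "cache.db")

# Create a small database with one settings row.
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE Settings (key TEXT NOT NULL UNIQUE, value)")
conn.execute("INSERT INTO Settings VALUES ('count', 0)")
conn.commit()
conn.close()

# Reopen it read-only through a SQLite URI: reads work, writes are rejected.
read_only_uri = pathlib.Path(db_path).as_uri() + "?mode=ro"
conn = sqlite3.connect(read_only_uri, uri=True)
rows = conn.execute("SELECT key, value FROM Settings").fetchall()

write_error = None
try:
    conn.execute("INSERT OR REPLACE INTO Settings VALUES ('count', 0)")
except sqlite3.OperationalError as error:
    write_error = error
finally:
    conn.close()

print(rows)
print(type(write_error).__name__, write_error)
```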

audetto avatar Jun 08 '19 13:06 audetto

So my use case: my app connects to live data sources, so I use disk cache for 2 purposes:

  1. speed up live operation
  2. run unit tests

What happens is that after a successful test run I always see the .db file modified, which causes unwanted confusion in git gui: has the db changed due to an application bug, or is it just noise?

I can see a couple of avenues

  1. new flag to the constructor: read-only, any attempt to write will fail and the db is not touched
  2. only write settings if they have changed, the rest of the operations is unaffected
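The second avenue could look roughly like the sketch below: read the stored value first and skip the write when nothing has changed. This is only an illustration with the standard-library sqlite3 module. The Settings schema, the helper write_setting_if_changed, and the example setting are assumptions, not diskcache's implementation.

```python
import hashlib
import os
import sqlite3
import tempfile


def file_digest(path):
    """Return the SHA-256 hex digest of the file's bytes."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def write_setting_if_changed(conn, key, value):
    """Write the setting only when the stored value differs.

    Returns True if a write was issued, False if it was skipped.
    """
    row = conn.execute(
        "SELECT value FROM Settings WHERE key = ?", (key,)
    ).fetchone()
    if row is not None and row[0] == value:
        return False  # Already up to date: no write transaction needed.
    conn.execute("INSERT OR REPLACE INTO Settings VALUES (?, ?)", (key, value))
    conn.commit()
    return True


workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "cache.db")

# First run: the setting is new, so it is written.
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE Settings (key TEXT NOT NULL UNIQUE, value)")
first_write = write_setting_if_changed(conn, "eviction_policy", "least-recently-stored")
conn.close()
digest_before = file_digest(db_path)

# Second run: same value, so only a SELECT is executed.
conn = sqlite3.connect(db_path)
second_write = write_setting_if_changed(conn, "eviction_policy", "least-recently-stored")
conn.close()
digest_after = file_digest(db_path)

print(first_write, second_write, digest_before == digest_after)
```

The trade-off is one extra SELECT per setting at start-up in exchange for leaving the file untouched when the configuration has not changed.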

Andrea

audetto avatar Jun 08 '19 13:06 audetto

Are you including the cache in your git repository? I think adding it to .gitignore would be better.

grantjenks avatar Jun 14 '19 00:06 grantjenks

Yes, as long as I know I have not really changed it. But every so often I do actually make a change I want to commit.

audetto avatar Jun 14 '19 19:06 audetto

Is this feature of any interest at all?

audetto avatar Oct 09 '19 21:10 audetto

Yes, it’s interesting. I tried to code a solution but got stuck (for a reason I can’t recall now). I think the issue was around pragmas that are set in the initializer. The cache initializer code has historically been a bug farm so changes are risky. It’s not a feature that I need per se so I stopped thinking about it.

As I think about it this morning, maybe there’s fast-pass logic that could be added to the initializer as a workaround. You could try inheriting and overriding that method for experimentation.

grantjenks avatar Oct 10 '19 14:10 grantjenks

https://github.com/grantjenks/python-diskcache/pull/127

Here is a simple example.

audetto avatar Oct 10 '19 18:10 audetto

See the query-only-support branch for work-in-progress.

grantjenks avatar Oct 13 '19 23:10 grantjenks

I saw it. There is some tuple issue now in the comparison, but more importantly:

  • There will have to be some exceptions to the checks being made.
  • The settings "count" and "sqlite_query_only" are out of sync.
  • And there is still the question of the "tag_index" which I did not really understand.

audetto avatar Oct 14 '19 07:10 audetto

Search the tutorial for “tag index”. It’s described there and in the api.

All the metadata settings are liable to get out of sync. This is also the first setting that is incompatible with others: e.g. query only and some eviction policies.

The branch also needs tests.
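For context on "query only": SQLite has a query_only pragma that makes a connection reject statements that would modify the database. The sketch below shows that pragma with the standard-library sqlite3 module. The table and values are made up, and it does not reflect how the branch wires the setting into diskcache. It also hints at the incompatibility mentioned above: any policy that needs to update metadata on a read would presumably hit the same error.

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "cache.db")

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE Settings (key TEXT NOT NULL UNIQUE, value)")
conn.execute("INSERT INTO Settings VALUES ('count', 0)")
conn.commit()

# From here on, this connection may read but not modify the database.
conn.execute("PRAGMA query_only = ON")
query_only_flag = conn.execute("PRAGMA query_only").fetchone()[0]

# Reads still succeed.
count = conn.execute(
    "SELECT value FROM Settings WHERE key = 'count'"
).fetchone()[0]

# Writes are rejected by SQLite.
write_error = None
try:
    conn.execute("UPDATE Settings SET value = value + 1 WHERE key = 'count'")
except sqlite3.OperationalError as error:
    write_error = error
finally:
    conn.close()

print(query_only_flag, count, type(write_error).__name__)
```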

grantjenks avatar Oct 14 '19 14:10 grantjenks

I am also interested in this feature. We have a snapshot of the cache.db which is holding data for unittests. We rarely update it, but every time I run unittests it seems to have changed (the actual data did not change, only some database metadata I think).

Is this being worked on?
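Until something like this exists, one possible workaround is to point the tests at a throwaway copy of the snapshot so the tracked file is never opened. The sketch below is an assumption-laden illustration: copy_snapshot is a hypothetical helper, the snapshot here is a dummy file rather than a real cache, and the diskcache call appears only in a comment.

```python
import os
import shutil
import tempfile


def copy_snapshot(snapshot_dir):
    """Copy a cache directory to a fresh temporary location and return its path."""
    scratch_root = tempfile.mkdtemp(prefix="cache-copy-")
    scratch_dir = os.path.join(scratch_root, "cache")
    shutil.copytree(snapshot_dir, scratch_dir)
    return scratch_dir


# Stand-in snapshot directory containing a dummy cache.db.
snapshot_dir = tempfile.mkdtemp(prefix="snapshot-")
snapshot_db = os.path.join(snapshot_dir, "cache.db")
with open(snapshot_db, "wb") as handle:
    handle.write(b"snapshot bytes")

scratch_dir = copy_snapshot(snapshot_dir)
scratch_db = os.path.join(scratch_dir, "cache.db")

# In a real test suite the copy, not the snapshot, would be opened, e.g.
#     cache = diskcache.Cache(scratch_dir)
# so any metadata writes land in the throwaway copy. Simulate such a write:
with open(scratch_db, "ab") as handle:
    handle.write(b" plus test-run noise")

with open(snapshot_db, "rb") as handle:
    snapshot_bytes = handle.read()
with open(scratch_db, "rb") as handle:
    scratch_bytes = handle.read()

print(snapshot_bytes != scratch_bytes)
```

The cost is one directory copy per test session, which may matter for large snapshots.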

orcunderscore avatar Nov 15 '23 09:11 orcunderscore

Not actively, no. I believe the branch referenced above still exists though.

grantjenks avatar Nov 15 '23 14:11 grantjenks

> I am also interested in this feature. We have a snapshot of the cache.db which is holding data for unittests. We rarely update it, but every time I run unittests it seems to have changed (the actual data did not change, only some database metadata I think).
>
> Is this being worked on?

This was exactly my case when I submitted the issue. We now wrap DiskCache in a read-only database object until the first write operation, which adds a lot of unnecessary complexity.

At the time, there was little or no interest in this feature so, as you can see, nothing happened.

audetto avatar Nov 15 '23 20:11 audetto