things.py icon indicating copy to clipboard operation
things.py copied to clipboard

things-cli fails with "sqlite3.OperationalError: unable to open database file" when listing a large number of todos (~2500)

Open mbhutton opened this issue 1 year ago • 13 comments

Note: solved, with an upcoming pull request, but filing here as a bug report for reference.

To Reproduce Steps to reproduce the behavior:

  1. Create 2500 or so tasks (don't ask 😬)
  2. Run things-cli todos

Expected behavior It runs successfully

Observed behaviour It fails with the following:

Traceback (most recent call last):
  File "/Users/matt/.local/share/mh/uv-tool-bin/things-cli", line 8, in <module>
  File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things_cli/cli.py", line 487, in main
  File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things_cli/cli.py", line 406, in main
  File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things_cli/cli.py", line 422, in main
  File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things_cli/cli.py", line 479, in parse_command
  File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things/api.py", line 458, in todos
  File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things/api.py", line 206, in tasks
  File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things/database.py", line 395, in get_tags
  File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things/database.py", line 429, in get_tags_of_task
  File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things/database.py", line 500, in execute_query
sqlite3.OperationalError: unable to open database file
things-cli todos  0.09s user 0.02s system 61% cpu 0.178 total

Python: 3.13.0 things-cli: version 0.2.1

Root cause appears to be that each SQL query creates a new connection which is never closed, and so when querying tags for each task, the number of open connections is at least as high as the number of tasks.

Working fix is to close the connection each time after each query.

mbhutton avatar Oct 27 '24 04:10 mbhutton

Actually, the tests with Python 3.13 warn about:

things/database.py:512: ResourceWarning: unclosed database in <sqlite3.Connection object at 0x101f0cf40> cursor.execute(sql_query, parameters) Object allocated at (most recent call last): File "things/database.py", lineno 191

AlexanderWillner avatar Nov 10 '24 12:11 AlexanderWillner

See details at https://github.com/thingsapi/things.py/actions/runs/11765369280/job/32771604833#step:5:37

AlexanderWillner avatar Nov 10 '24 13:11 AlexanderWillner

Thanks, in that case I think it'd make sense to roll back the second of those commits, as the intention of that second commit was as an optional performance improvement to leave the connection open (separate transactions, but shared connection), and that warning makes it clear that it's bad practice to do so.

I.e. this later of the commits would need to be rolled back: https://github.com/thingsapi/things.py/pull/126/commits/bf429cdc837f60dea2cb5898449ac8420a31efd5

I tested both versions (with just the first commit and with both), and while sharing a connecting did speed things up (75% or so faster IIRC), it's already very fast even on large Things databases.

Would you prefer a PR for the revert of that second commit above, or prefer to revert it directly?

Btw thank you for this great tool!

mbhutton avatar Nov 14 '24 05:11 mbhutton

Thanks. A PR that make sure that make test does not throw any warnings would be nice.

AlexanderWillner avatar Nov 15 '24 20:11 AlexanderWillner

@mbhutton @AlexanderWillner Thank you for your work here. 75% faster! 🥳

mikez avatar Dec 03 '24 11:12 mikez

@AlexanderWillner Can we close this?

mikez avatar Feb 11 '25 04:02 mikez

I think no:

things.py/things/database.py:711: ResourceWarning: unclosed database in <sqlite3.Connection object at 0x101f246d0> for index, column in enumerate(cursor.description): Object allocated at (most recent call last): File "things.py/things/database.py", lineno 188 self.connection = sqlite3.connect(uri, uri=True) # pylint: disable=E1101

It might be faster but might leak memory now.

AlexanderWillner avatar Feb 12 '25 09:02 AlexanderWillner

@AlexanderWillner Got it. Want to look into a quick fix?

mikez avatar Feb 12 '25 10:02 mikez

I think the quick fix is mentioned in https://github.com/thingsapi/things.py/issues/125#issuecomment-2475452049 - I might come to this eventually, however, this might take a while

AlexanderWillner avatar Feb 12 '25 11:02 AlexanderWillner

@AlexanderWillner 👍. @mbhutton Is this something you would enjoy to take a look at?

mikez avatar Feb 12 '25 22:02 mikez

I can take a look in about two weeks. Expecting to just roll back that second speed up commit. Ie leaving in the origin fix but losing the reuse of the connections. My hunch is that reusing the connections in a safe way that gets rid of those warnings would require more invasive changes on both things.py and things-cli, which seems unnecessary given that both already seem plenty fast. Dumping my whole large db via things-cli is already really fast. Either way expect an update in two weeks or so.

mbhutton avatar Feb 13 '25 19:02 mbhutton

@mbhutton Sounds good. 👌

mikez avatar Feb 13 '25 20:02 mikez

@AlexanderWillner @mikez Rather than backing out the connection-reuse change, I've raised a PR to close the connection during GC: #132

This addresses the ResourceWarning lint introduced in Python 3.13 which was polluting those test results, but in a way that maintains the speed up gained by reusing the connection.

mbhutton avatar Feb 22 '25 23:02 mbhutton

Closing as fixed, as the associated PR is merged: #132

mbhutton avatar Mar 02 '25 17:03 mbhutton