things-cli fails with "sqlite3.OperationalError: unable to open database file" when listing a large number of todos (~2500)
Note: solved, with an upcoming pull request, but filing here as a bug report for reference.
To Reproduce Steps to reproduce the behavior:
- Create 2500 or so tasks (don't ask 😬)
- Run
things-cli todos
Expected behavior It runs successfully
Observed behaviour It fails with the following:
Traceback (most recent call last):
File "/Users/matt/.local/share/mh/uv-tool-bin/things-cli", line 8, in <module>
File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things_cli/cli.py", line 487, in main
File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things_cli/cli.py", line 406, in main
File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things_cli/cli.py", line 422, in main
File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things_cli/cli.py", line 479, in parse_command
File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things/api.py", line 458, in todos
File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things/api.py", line 206, in tasks
File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things/database.py", line 395, in get_tags
File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things/database.py", line 429, in get_tags_of_task
File "/Users/matt/.local/share/uv/tools/things-cli/lib/python3.13/site-packages/things/database.py", line 500, in execute_query
sqlite3.OperationalError: unable to open database file
things-cli todos 0.09s user 0.02s system 61% cpu 0.178 total
Python: 3.13.0 things-cli: version 0.2.1
Root cause appears to be that each SQL query creates a new connection which is never closed, and so when querying tags for each task, the number of open connections is at least as high as the number of tasks.
Working fix is to close the connection each time after each query.
Actually, the tests with Python 3.13 warn about:
things/database.py:512: ResourceWarning: unclosed database in <sqlite3.Connection object at 0x101f0cf40> cursor.execute(sql_query, parameters) Object allocated at (most recent call last): File "things/database.py", lineno 191
See details at https://github.com/thingsapi/things.py/actions/runs/11765369280/job/32771604833#step:5:37
Thanks, in that case I think it'd make sense to roll back the second of those commits, as the intention of that second commit was as an optional performance improvement to leave the connection open (separate transactions, but shared connection), and that warning makes it clear that it's bad practice to do so.
I.e. this later of the commits would need to be rolled back: https://github.com/thingsapi/things.py/pull/126/commits/bf429cdc837f60dea2cb5898449ac8420a31efd5
I tested both versions (with just the first commit and with both), and while sharing a connecting did speed things up (75% or so faster IIRC), it's already very fast even on large Things databases.
Would you prefer a PR for the revert of that second commit above, or prefer to revert it directly?
Btw thank you for this great tool!
Thanks. A PR that make sure that make test does not throw any warnings would be nice.
@mbhutton @AlexanderWillner Thank you for your work here. 75% faster! 🥳
@AlexanderWillner Can we close this?
I think no:
things.py/things/database.py:711: ResourceWarning: unclosed database in <sqlite3.Connection object at 0x101f246d0> for index, column in enumerate(cursor.description): Object allocated at (most recent call last): File "things.py/things/database.py", lineno 188 self.connection = sqlite3.connect(uri, uri=True) # pylint: disable=E1101
It might be faster but might leak memory now.
@AlexanderWillner Got it. Want to look into a quick fix?
I think the quick fix is mentioned in https://github.com/thingsapi/things.py/issues/125#issuecomment-2475452049 - I might come to this eventually, however, this might take a while
@AlexanderWillner 👍. @mbhutton Is this something you would enjoy to take a look at?
I can take a look in about two weeks. Expecting to just roll back that second speed up commit. Ie leaving in the origin fix but losing the reuse of the connections. My hunch is that reusing the connections in a safe way that gets rid of those warnings would require more invasive changes on both things.py and things-cli, which seems unnecessary given that both already seem plenty fast. Dumping my whole large db via things-cli is already really fast. Either way expect an update in two weeks or so.
@mbhutton Sounds good. 👌
@AlexanderWillner @mikez Rather than backing out the connection-reuse change, I've raised a PR to close the connection during GC: #132
This addresses the ResourceWarning lint introduced in Python 3.13 which was polluting those test results, but in a way that maintains the speed up gained by reusing the connection.
Closing as fixed, as the associated PR is merged: #132