how to properly create a pool connection using SqlAlchemy?
we are trying to set up a dbpool connection for crate using SqlAlchemy (1.3.4), but we get the following error:
Traceback (most recent call last):
File "/Users/facca/.local/share/virtualenvs/ngsi-timeseries-api-x5Z2-yqd/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 697, in _finalize_fairy
fairy._reset(pool)
File "/Users/facca/.local/share/virtualenvs/ngsi-timeseries-api-x5Z2-yqd/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 893, in _reset
pool._dialect.do_rollback(self)
TypeError: do_rollback() missing 1 required positional argument: 'connection'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/facca/.local/share/virtualenvs/ngsi-timeseries-api-x5Z2-yqd/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 270, in _close_connection
self._dialect.do_close(connection)
TypeError: do_close() missing 1 required positional argument: 'dbapi_connection'
this how we create the connection:
...
from crate.client.sqlalchemy.dialect import CrateDialect
...
def get_connection(self):
url = "{}:{}".format(self.host, self.port)
connection = client.connect([url], error_trace=True)
return connection
...
dbpool = pool.QueuePool(self.get_connection, max_overflow=10, pool_size=5, pre_ping=False, dialect=CrateDialect)
self.connection = dbpool.connect()
...
self.connection.close()
the error bombs on closing the connection
here is a minimal code to reproduce the issue:
from crate import client
from crate.client.sqlalchemy.dialect import CrateDialect
import sqlalchemy.pool as pool
def get_connection():
url = "{}:{}".format("0.0.0.0", "4200")
connection = client.connect([url], error_trace=True)
return connection
dbpool = pool.QueuePool(get_connection, max_overflow=10, pool_size=5, dialect=CrateDialect, pre_ping=False)
connection = dbpool.connect()
cursor = connection.cursor()
cursor.execute("SELECT 1;")
cursor.close()
connection.close()
Dear Federico,
thanks for writing in. As promised, I already wanted to reach out to you on the QuantumLeap issue tracker at https://github.com/smartsdk/ngsi-timeseries-api/issues/397 about the topic of connection pooling, but missed out on that yet. Apologies!
First of all, I would recommend to use SQLAlchemy's create_engine() function in order to spin up a handle to the driver. Please let me know if you have any objections on that.
On the one hand, it will default to using a QueuePool already [1] and on the other hand, it will accept all available pool* configuration options like pool, poolclass, pool_size, pool_recycle, pool_timeout, pool_use_lifo, pool_reset_on_return, pool_pre_ping and max_overflow [2].
Regarding appropriate pool size configuration for CrateDB's SQLAlchemy dialect, we must also have to take into account that outbound connections are going through urllib3. For investigating relevant details, I recently created a gist at [3].
I believe it is equally important to adjust the HTTP connection pool size maintained by this library and this is now possible [4] by using the connect_args parameter [5] to create_engine(), thus saying things like
create_engine("crate://localhost:4200", connect_args={"pool_size": 10})
when using crate-python[sqlalchemy]>=0.26.0.
Please let me know if you see any improvements on QuantumLeap with these options.
With kind regards, Andreas.
[1] https://docs.sqlalchemy.org/en/13/core/engines.html#pooling [2] https://docs.sqlalchemy.org/en/13/core/engines.html#sqlalchemy.create_engine.params.pool [3] https://gist.github.com/amotl/575aae129beb17b120b848e7632365dc [4] https://github.com/crate/crate-python/pull/373 [5] https://docs.sqlalchemy.org/en/13/core/engines.html#sqlalchemy.create_engine.params.connect_args
@amotl thx, we will look in using create_engine and "raw" connection