Execute prepared statements on client side by default
Draft PR to discuss approach for #306
Just to complete the information for @hashhar.
We discussed the following approaches
-
We could also avoid the prepare and deallocation query by simply adding the prepared statement directly on the
ClientSession. The advantage is that we can still make use of the query parser on the server side. The disadvantage is that we still use the headers as transport. -
Do the parsing on the clientside. then we have to take care of values and identifiers within the query that contain
?. eg.cur.execute("SELECT '?''?', ?", 'test')orcur.execute('select 1 as "?""test", ?', 'test') -
Transport the prepared statement parameters in the JSON body instead of the headers as a backwards compatible API change, that can be used by all clients instead of just the trino python client
Doing on the client side makes the most sense to me depending on what problem we're trying to solve.
1 (directly manipulate ClientSession) seems like a hack. 2 (client side) solves the problem of multiple queries but introduces security concerns. 3 (new backward compatible API) solves the limitation that prepared statements cannot be very large but it doesn't help the fact that multiple queries need to be sent anyway.
I think we can do the following:
- Make
paramstylesettable and only allow it to be set to eitherformatorqmarkwithqmarkas default. - When we create Connection we "freeze" the paramstyle value i.e. a Connection's paramstyle is constant.
-
paramstyle=formatwould do client side bindings,paramstyle=qmarkwould do server side (what is already done today) - For paramstyle = format it's easier to do client side binding since you no longer need to worry about "finding" all markers and can just use
str % valuesfrom Python to do the value setting.
For paramstyle = format we need to:
- escape all params only - query text needs no escaping - if something needs escaping there (
%) then the user should do that - this is what all Python drivers do (I checked psycopg2, pymysql, Oracle). - Do
query % escaped_valuesand execute the query.
Where the handling of qmark and format style needs to be done? - to me it seems the logical place is to have two Cursor classes and override their execute and executemany methods. Alternatively we can keep existing cursor and introduce two methods process_qmark_params and process_format_params and call them from execute and executemany.
The question we need to answer
What's the biggest problem we want to solve?
- multiple queries sent when using params?
- the size limitation of prepared statements?
If we want to solve 1 then the only solution is what we discuss above. If we want to solve 2 then the API change that @mdesmet mentioned in his comment is a good solution.
To be honest I don't know which of these problems people actually care about.
The benefit of customizable paramstyle is that we can solve both problems at once but at the cost of some additional code and complexity (both for users and for us).