sqlite-jdbc icon indicating copy to clipboard operation
sqlite-jdbc copied to clipboard

PreparedStatement#getParameterMetadata()#getParameterType() results in string type, context type

Open rsmckinney opened this issue 2 years ago • 20 comments

PreparedStatement#getParameterMetadata()#getParameterType() results in string (varchar) type, regardless of the parameter's context type.

For instance,

SELECT * FROM City WHERE country_id = ?

Here country_id is an unsigned int, however the call to getParameterType(1) returns java.sql.Types#VARCHAR. This behavior is at odds with type-safe languages that use this metadata for code generation.

Environment (please complete the following information):

  • OS: Windows 10, x86_64
  • sqlite-jdbc version [e.g. 3.39.2.0]

rsmckinney avatar Jun 24 '23 04:06 rsmckinney

How did you set the parameter?

gotson avatar Jun 24 '23 06:06 gotson

It is not set. The purpose of the call is to determine the type of the parameter for code generation, not for executing a query.

rsmckinney avatar Jun 24 '23 06:06 rsmckinney

Never mind. I see in latest release JDBC3PreparedStatement now returns the correct type. Thanks.

rsmckinney avatar Jun 24 '23 06:06 rsmckinney

Oh wait...

It looks like the driver now requires the parameter to be set? I would expect the driver to return the type in context of the parameter, particularly when the parameter is not set.

rsmckinney avatar Jun 24 '23 06:06 rsmckinney

Oh wait...

It looks like the driver now requires the parameter to be set? I would expect the driver to return the type in context of the parameter, particularly when the parameter is not set.

i don't quite understand how you want to return a parameter that was not set.

gotson avatar Jun 24 '23 07:06 gotson

I want the parameter type, not the argument type. Referring back to the original example:

SELECT * FROM City WHERE country_id = ?

The ? is a parameter. Regardless of sqlite's internal dynamic typing, on the surface the parameter has a static type, just as country_id does. Thus, it's type, I would think, should reflect the context/expected type of the parameter. In this case the same type as country_id, which is an unsigned int.

Otherwise, this is a chicken/egg problem where the process of querying for a parameter's type involves supplying a sample value of the type I am asking for. As a consequence, the type is unknown, which is kind of disappointing when generating type-safe Java code.

Update: Looking at other JDBC drivers (H2, Oracle) they behave as expected; their implementation of getParameterType() and other methods is independent of set values as one would expect. It appears the sqlite implementation is incorrect.

rsmckinney avatar Jun 24 '23 17:06 rsmckinney

Can you show a complete code sample and comment what you expect where? Or maybe a code sample that works on another database driver?

gotson avatar Jun 25 '23 01:06 gotson

I think i understand what you want, but i don't think SQLite has the capabilities to provide that information. I am not familiar with the other drivers, and i don't know how they do it.

gotson avatar Jun 25 '23 01:06 gotson

The parsers of other drivers apply type inference algorithms to determine the types of parameters. For instance, an A = ? expression in a where clause infers the type of ? using the type of the left hand side of the operator.

Not a big deal, I had assumed it was a requirement of JDBC drivers to provide parameter metadata, as opposed to argument metadata. I'm glad I tested the code generator I'm writing to handle this case, particularly since Sqlite is widely used. But I'm switching my primary testing db from Sqlite to H2 because it behaves in a more standard way, and for other reasons that have built up eg. better standard SQL coverage.

Thanks for the quick response.

rsmckinney avatar Jun 25 '23 02:06 rsmckinney

There's no sql parser in this driver, unfortunately.

gotson avatar Jun 25 '23 04:06 gotson

Given the sqlite limitation, I believe this should revert to "VARCHAR" as was in previous versions. Right now we are still returning bad information, extracted from the setObject() call. So if the table is TEXT and we setObject(1), we get the wrong answer.

More over, under correct use which is to call getParameterType() without calling setObject() we get a NPE which breaks generic calling code (we found this out because are "auto type mapping code" which kind of worked for sqlite because everything can be converted into strings, started throwing exceptions).

I think the original implementation was the correct one for the case.

yuvalp-k2view avatar Jun 25 '23 11:06 yuvalp-k2view

I think you're right, we would need to revert https://github.com/xerial/sqlite-jdbc/pull/882

gotson avatar Jun 25 '23 14:06 gotson

Before you revert back to the old behavior, consider there are two kinds of metadata concerning a parameterized query.

  1. Query Definition. This is what I explained earlier where the parameter metadata is needed and, I assume, is the primary intention of the API.
  2. Query Invocation. This involves argument metadata, which tends to be less useful, but still relevant. This is what you currently have implemented.

It appears other drivers handle both aspects in their implementations. So, perhaps if you merge the old varchar implementation with the new behavior, you will satisfy both conditions, albeit minimally.

rsmckinney avatar Jun 25 '23 19:06 rsmckinney

Before you revert back to the old behavior, consider there are two kinds of metadata concerning a parameterized query.

  1. Query Definition. This is what I explained earlier where the parameter metadata is needed and, I assume, is the primary intention of the API.
  2. Query Invocation. This involves argument metadata, which tends to be less useful, but still relevant. This is what you currently have implemented.

It appears other drivers handle both aspects in their implementations. So, perhaps if you merge the old varchar implementation with the new behavior, you will satisfy both conditions, albeit minimally.

For the sake of clarity, are you able to clearly pinpoint which methods cover which case ?

gotson avatar Jun 26 '23 01:06 gotson

I had a look at https://docs.oracle.com/javase/8/docs/api/java/sql/ParameterMetaData.html and i believe all of the methods relate to the above point 1, which is about introspecting what the parameter needs to be, and for which sqlite doesn't provide any way of getting.

I don't see any methods related to above point 2, which is about what has been set as parameter value (argument).

gotson avatar Jun 26 '23 02:06 gotson

Testing the postgres jdbc driver, I can confirm (unexpectedly) that it is indeed a hybrid behaviour. The following test passes:

            execute(c, "CREATE TABLE A_TABLE (a_text  text, a_number numeric)");
            try (PreparedStatement ps = c.prepareStatement("INSERT INTO A_TABLE VALUES (?,?)")) {
                ParameterMetaData pmd = ps.getParameterMetaData();
                assertEquals(2, pmd.getParameterCount());
                assertEquals(Types.VARCHAR, pmd.getParameterType(1));
                assertEquals(Types.NUMERIC, pmd.getParameterType(2));
                ps.setObject(1, 1);
                assertEquals(Types.INTEGER, pmd.getParameterType(1));
            }

Looks like it is a valid approach to return VARCHAR for uninitialized variables but the actual instance type for initialized one. If we take that approach, it is important to revert to VARCHAR in case clearParameters() is called to keep consistent behaviour for reused prepared statements (btw- postgerss doesn't do this correctly, reverting to Types.OTHER).

yuvalp-k2view avatar Jun 26 '23 06:06 yuvalp-k2view

Looks like it is a valid approach to return VARCHAR for uninitialized variables but the actual instance type for initialized one. If we take that approach, it is important to revert to VARCHAR in case clearParameters() is called to keep consistent behaviour for reused prepared statements (btw- postgerss doesn't do this correctly, reverting to Types.OTHER).

Thanks for the test and analysis, very useful.

gotson avatar Jun 28 '23 07:06 gotson

Just to be clear, postgres correctly infers the type of uninitialized parameters e.g., NUMERIC for the second parameter above. My suggestion for sqlite to merge past and present behavior was a compromise; better to return a type than to throw an exception. However, if you do merge the two behaviors, may I suggest you return JAVA_OBJECT for the uninitialized case instead of VARCHAR?

The problem with VARCHAR is that it translates to java.lang.String, which is confusing, particularly wrt generated, statically typed code. For example, if a parameter corresponds with an integer column, it confuses users of the parameter expecting to pass an integer or, worse, results in users supplying the wrong information. This issue is most relevant when the integer corresponds with an ID / foreign key. Worse yet, VARCHAR forces the user to perform explicit coercion to string from whatever correctly typed argument they would otherwise use directly; toString() can often lead unintentional parameter values.

If instead uninitialized parameters are typed as JAVA_OBJECT, the generated parameter type java.lang.Object accepts any value, which is less confusing esp. regarding ID / foreign key types. Typically argument values tend to be "ready made" and come from some other type-safe query result, or is otherwise user-supplied and validated from schema metadata type information. It's odd when the almost always correctly typed value conflicts with the otherwise VARCHAR parameter. With JAVA_OBJECT it "just works" and coercion is performed inside PreparedStatement#setObject(int, Object) where it either safely coerces the value or throws an exception to prevent bad data from entering the query.

Just my two cents here.

Update: I would even say it's better to keep the behavior you have now where you throw an exception when values are not set. At least this way I can decide what to do, like use java.lang.Object. Otherwise, if you return VARCHAR, I don't know that what you really mean is "you don't know." I'd rather you say that.

rsmckinney avatar Jul 02 '23 20:07 rsmckinney

Since sqlite is extremely flexible in converting strings to its supported types, my vote still goes to VARCHAR

yuvalp-k2view avatar Jul 03 '23 06:07 yuvalp-k2view

Right… but problems arise when the param value is not already a string.

Generally, metadata should either provide the true type to the best of its knowledge or acknowledge that the type is unknown. Reporting that the type is a String when it’s not is confusing wrt type-safe code, which is why I prefer the current behavior because it’s basically saying “I don’t know”, which is much more helpful and honest. shrug

rsmckinney avatar Jul 03 '23 06:07 rsmckinney