PyTd
PyTd copied to clipboard
DB API cursor.description information is incorrect
Python DB API standards defines here what values a DB API compliant driver can return via cursor.description.
From what I observed:
-
precisionandscaleare wrong. The actualscalevalue is stored inprecisionandprecisionvalue is stored ininternal_size. -
internal_sizeattribute on other databases returns the amount of internal storage used by the column. So for example, column ofinttype would return 4,bigint8 and so on. -
internal_sizeactually has the values that are more appropriate fordisplay_sizeattribute (which is alwaysnull) - For SQL interval types, attribute
type_codefromcursor.descriptionreturnsstrbut in reality an instance ofteradata.datatypes.Intervalis returned. - For JSON types, again
stris stored intype_code, but the actual values are instances of pythondictorlist - When using
tdrestmodule, none of the metadata exceptnameandtype_codeare returned. It would be nice to have access to the same information thattdodbcmodule provides -
tdrestmodule doesn't supportxmltype at all whereastdodbchas no problems returningxmlas pythonstr
Here is a small Python program to reproduce above observations:
import teradata
from collections import OrderedDict
td_types = OrderedDict([
( "Char5" , "cast('abc' as char(5))" ),
( "VarChar5" , "cast('abc' as varchar(5))" ),
( "Int" , "cast(123456 as integer)" ),
( "BigInt" , "cast(1234567890123 as bigint)" ),
( "SmallInt" , "cast(1234 as smallint)" ),
( "ByteInt" , "cast(123 as byteint)" ),
( "Dec52" , "cast(123.56 as decimal(5,2))" ),
( "Dec384" , "cast(123.56 as decimal(38,4))" ),
( "Date" , "current_date" ),
( "time" , "current_time" ),
( "timestamp0" , "Current_timestamp(0)" ),
( "timestamp6" , "Current_timestamp(6)" ),
( "HourToMin" , "interval '05:06' hour to minute" ),
( "DayToMin" , "interval '03 05' day to hour" ),
( "YearToMonth", "interval '05-11' year to month" ),
( "JsonObj" , "cast('{\"ABC\":[1,2]}' as json)" ),
( "JsonArray" , "cast('[1,2]' as json)" ),
( "xml" , "cast('<a/>' as xml)" )])
# with teradata.tdrest.connect(host='xxxx', system='xxxx', username='xxxx', password='xxxx') as conn:
with teradata.tdodbc.connect(system='xxxx', username='xxxx', password='xxxx') as conn:
with conn.cursor() as csr:
fmt = '{:<20}|{:<38}|{:<28}|{:>5}|{:>10}|{:>4}|{:>5}|{:^4}'
print(fmt.format('name','py type','csr type','dsize','isize','prec','scale','null'))
print(fmt.replace('{:','{:-').format(*['']*8))
for name, expr in td_types.items():
csr.execute('SELECT {} AS Col_{}'.format(expr,name))
desc = csr.description[0]
col = csr.fetchone()[0]
line = [desc[0], type(col)] + list(desc[1:])
print(fmt.format(*[str(c) for c in line]))
I should add that the integral SQL data types are better represented by python int (also long in python2) and python float to represent SQL real data types instead of current implementation that uses decimal.Decimal for all numeric data types.