databricks-sql-python

Pandas makes bad DESCRIBE query when using SQLAlchemy

Open freud14-tm opened this issue 2 years ago • 4 comments

When using a SQLAlchemy engine with pandas, pandas appears to issue a malformed DESCRIBE query in addition to the SELECT. Here is the code:

import os

import pandas as pd

from sqlalchemy import create_engine


server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME")
http_path = os.getenv("DATABRICKS_HTTP_PATH")
access_token = os.getenv("DATABRICKS_TOKEN")
engine = create_engine(
    f"databricks://token:{access_token}@{server_hostname}?http_path={http_path}&catalog=hive_metastore&schema=default",
)
with engine.connect() as connection:
    print(pd.read_sql("SELECT * FROM test", connection))

Here are the two queries resulting from that code: [image]

It does not do that when using the SQL connector directly:

import os

import pandas as pd

from databricks import sql


with sql.connect(
    server_hostname=os.getenv("DATABRICKS_SERVER_HOSTNAME"),
    http_path=os.getenv("DATABRICKS_HTTP_PATH"),
    access_token=os.getenv("DATABRICKS_TOKEN"),
) as connection:
    print(pd.read_sql("SELECT * FROM test", connection))

Here are version numbers:

In [1]: import sqlalchemy

In [2]: sqlalchemy.__version__
Out[2]: '1.4.49'

In [3]: from databricks import sql

In [4]: sql.__version__
Out[4]: '2.8.0'

freud14-tm, Aug 01 '23 19:08

Also, it would be nice if catalog and schema were optional.

freud14-tm, Aug 01 '23 19:08
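
Until catalog and schema become optional, one way to keep the URL from the report manageable is a small helper that assembles it in one place. The sketch below is only an illustration: `build_databricks_url` is a hypothetical name, the argument values are placeholders, and catalog and schema stay required because the comment above suggests the dialect needs them. It percent-encodes the token and the query values, which SQLAlchemy should decode again when it parses the URL.

```python
from urllib.parse import quote, urlencode


def build_databricks_url(access_token, server_hostname, http_path, catalog, schema):
    """Hypothetical helper: assemble the SQLAlchemy URL used in the report."""
    # Percent-encode the token in case it contains characters such as "@" or "/".
    userinfo = f"token:{quote(access_token, safe='')}"
    # urlencode escapes each query value (for example "/" becomes "%2F").
    query = urlencode({"http_path": http_path, "catalog": catalog, "schema": schema})
    return f"databricks://{userinfo}@{server_hostname}?{query}"


# Placeholder values, not real credentials or hosts.
url = build_databricks_url(
    access_token="dapi-example",
    server_hostname="example.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/abc123",
    catalog="hive_metastore",
    schema="default",
)
print(url)
```

The resulting string can be passed to `create_engine` in place of the f-string from the original snippet.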

What version of pandas do you have installed?

susodapop, Aug 01 '23 20:08

Was on 1.5.3 but just tried on 2.0.3 and get the same thing.

freud14-tm, Aug 01 '23 20:08

@susodapop I have the same issue - has this been resolved in the newer versions?

saadali-e, Jan 17 '24 15:01
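
A possible explanation and workaround, not confirmed in this thread: when `pd.read_sql` receives a SQLAlchemy connectable, pandas first asks the dialect whether the string is an existing table name, and only then runs it as a query. On the Databricks dialect that check probably surfaces as the DESCRIBE seen in the report. With a plain DBAPI connection pandas skips the check, which would match the second snippet. `pd.read_sql_query` also skips it. The sketch below uses in-memory SQLite as a stand-in for Databricks (an assumption made so it runs without a warehouse) and records every statement sent to the driver; `record_statement` is an illustrative name.

```python
import pandas as pd
from sqlalchemy import create_engine, event

# In-memory SQLite stands in for Databricks so the sketch runs anywhere.
engine = create_engine("sqlite://")
statements = []


@event.listens_for(engine, "before_cursor_execute")
def record_statement(conn, cursor, statement, parameters, context, executemany):
    # Collect every statement that actually reaches the DBAPI cursor.
    statements.append(statement)


with engine.connect() as connection:
    connection.exec_driver_sql("CREATE TABLE test (id INTEGER)")
    connection.exec_driver_sql("INSERT INTO test VALUES (1)")

    # read_sql: pandas probes for a table named like the query first.
    statements.clear()
    pd.read_sql("SELECT * FROM test", connection)
    via_read_sql = list(statements)

    # read_sql_query: the string is always treated as a query.
    statements.clear()
    pd.read_sql_query("SELECT * FROM test", connection)
    via_read_sql_query = list(statements)

print(via_read_sql)
print(via_read_sql_query)
```

On SQLite the extra statements are table-existence probes rather than DESCRIBE, but the pattern is the same: `read_sql` emits more statements than `read_sql_query` for the same SELECT. If that holds on Databricks as well, replacing `pd.read_sql` with `pd.read_sql_query` in the original snippet may avoid the extra query.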