sqlalchemy-firebird icon indicating copy to clipboard operation
sqlalchemy-firebird copied to clipboard

Pandas `read_sql()` does not preserve the correct case of column names.

Open fdcastel opened this issue 2 years ago • 4 comments

Describe the bug

Pandas read_sql() (which uses SQLAlchemy) does not preserve the correct case of column names.

To Reproduce

import sqlalchemy as sa
import pandas as pd

e = sa.create_engine(db_uri)
with e.connect() as conn:
    q = 'SELECT 1 one, 2 TWO, 3 "three", 4 "FOUR", 5 "Five" from rdb$database'
    r = conn.exec_driver_sql(q)
    print([x for x, *rest in r.cursor.description])
    # ['ONE', 'TWO', 'three', 'FOUR', 'Five']

    df = pd.read_sql(sql=q, con=db_uri, dtype_backend="pyarrow")
    print(df.columns)
    # Index(['one', 'two', 'three', 'four', 'Five'], dtype='object')

Expected behavior

    print(df.columns)
    # Index(['ONE', 'TWO', 'three', 'FOUR', 'Five'], dtype='object')

Columns should reflect the exact same name from cursor.description.

fdcastel avatar Dec 18 '23 20:12 fdcastel

Unfortunately the root cause of this problem is in SQLAlchemy itself.

Related discussion.

SQLAlchemy 2.1 will bring a driver_column_names execution option to fix this.

fdcastel avatar Dec 21 '23 19:12 fdcastel

I'm looking forward to your pull request.

pauldex avatar Dec 25 '23 18:12 pauldex

@pauldex I have a workaround for this issue which uses an obscure legacy hook from SQLAlchemy. But I'm not comfortable putting it in the wild. That's why I decided to wait for SQLAlchemy 2.1 driver_column_names.

Are you in need of a solution to fix this issue?

fdcastel avatar Dec 25 '23 23:12 fdcastel

I think we can wait for SQLAlchemy 2.1 for the fix.

pauldex avatar Jan 06 '24 00:01 pauldex