snowflake-connector-python icon indicating copy to clipboard operation
snowflake-connector-python copied to clipboard

SNOW-353837: Percent signs in query results doubled

Open tobias-feil-by opened this issue 4 years ago • 3 comments

Please answer these questions before submitting your issue. Thanks!

  1. What version of Python are you using (python --version)? -- 3.8

  2. What operating system and processor architecture are you using (python -c 'import platform; print(platform.platform())')? --Linux-5.8.0-50-generic-x86_64-with-glibc2.29

  3. What are the component versions in the environment (pip freeze)?

asn1crypto==1.4.0
attrs==20.3.0
Authlib==0.15.3
azure-common==1.1.25
azure-core==1.8.2
azure-storage-blob==12.5.0
boto3==1.15.18
botocore==1.18.18
certifi==2020.6.20
cffi==1.14.3
chardet==3.0.4
click==7.1.2
coverage==5.5
cryptography==2.9.2
fastapi==0.63.0
greenlet==1.0.0
h11==0.12.0
idna==2.10
iniconfig==1.1.1
isodate==0.6.0
jmespath==0.10.0
msrest==0.6.19
numpy==1.20.2
oauthlib==3.1.0
oscrypto==1.2.1
packaging==20.9
pandas==1.1.5
pkg-resources==0.0.0
pluggy==0.13.1
py==1.10.0
pyarrow==3.0.0
pycparser==2.20
pycryptodomex==3.9.8
pydantic==1.8.1
PyJWT==1.7.1
pyOpenSSL==19.1.0
pyparsing==2.4.7
pytest==6.2.3
python-dateutil==2.8.1
pytz==2020.1
requests==2.25.1
requests-oauthlib==1.3.0
s3transfer==0.3.3
six==1.15.0
snowflake-connector-python==2.4.2
snowflake-sqlalchemy==1.2.4
SQLAlchemy==1.4.7
starlette==0.13.6
toml==0.10.2
typing-extensions==3.7.4.3
urllib3==1.25.11
uvicorn==0.13.4

  1. What did you do? Use SQLAlchemy ORM select to retrieve an entity like this: session.execute(select(MyClass)) The entity has some string fields that, among others, contain percent characters.

  2. What did you expect to see? The exact string values the way they are stored in the db

  3. What did you see instead? Every percent sign doubled. This is probably due to the class SQLAlchemyIdentifierPreparer not overriding the method _escape_identifier in its base class IdentifierPreparer and therefore doubling percent signs.

tobias-feil-by avatar May 06 '21 12:05 tobias-feil-by

@tobias-feil-by We will take a look at this.

sfc-gh-hkapre avatar Jul 15 '21 21:07 sfc-gh-hkapre

Just bumped into this issue, a fix would be appreciated

TheSamFed avatar May 11 '22 10:05 TheSamFed

hey guys, thanks for your patience -- I'd like to give an update on the issue:

I'm able to repro the issue and this should be an issue in the snowflake-sqlalchemy. we had a PR out for better sqlalchemy compliance which should address this issue: https://github.com/snowflakedb/snowflake-sqlalchemy/pull/314

I will let you know once the new version of snowflake-sqlalchemy is released.

sfc-gh-aling avatar Jul 13 '22 22:07 sfc-gh-aling

@sfc-gh-aling The issue still exists in snowflake-sqlalchemy==1.4.5

demidov91 avatar Jan 09 '23 17:01 demidov91

hey @demidov91 , thanks for reaching out, do you mind sharing your test script with me so that I could reproduce the debug?

sfc-gh-aling avatar Jan 09 '23 17:01 sfc-gh-aling

@sfc-gh-aling Thank you for a really quick response. After some more investigation I would say it may be not a bug. Here is what I mean:

t = sql.table('T', sql.column('A'), sql.column('B %'))
compiled = t.select().where(t.c['A'] > 50).compile(dialect=SnowflakeDialect())
print(compiled.string)

Output:

SELECT "T"."A", "T"."B %%" 
FROM "T" 
WHERE "T"."A" > %(A_1)s

My initial concern was about this part: B %%. compile(compile_kwargs={"literal_binds": True}) produces wrong SQL but literal_binds is not smth to use in prod, anyway:

SELECT "T"."A", "T"."B %%" 
FROM "T" 
WHERE "T"."A" > 50

The issue can be eliminated by SnowflakeDialect(paramstyle='named'):

SELECT "T"."A", "T"."B %" 
FROM "T" 
WHERE "T"."A" > :A_1

demidov91 avatar Jan 09 '23 18:01 demidov91

thanks @demidov91 for sharing the scripts.

as far as I'm concerned, the double percents behavior is controlled here in sqlalchemy's sql compiler: https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/sql/compiler.py#L6615-L6617.

        self._double_percents = self.dialect.paramstyle in (
            "format",
            "pyformat",
        )

and then within the same file, it defines how to escape identifier: https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/sql/compiler.py#L6661-L6671

    def _escape_identifier(self, value: str) -> str:
        """Escape an identifier.

        Subclasses should override this to provide database-dependent
        escaping behavior.
        """

        value = value.replace(self.escape_quote, self.escape_to_quote)
        if self._double_percents:
            value = value.replace("%", "%%")
        return value

I didn't yet see any code in the snowflake-sqlalchemy changing the compiler behavior around escaping identifier, so I'm wondering if this is the default behavior of the sqlalchemy compiler in the pyformat which is the default format of snowflake-sqlalchemy.

sfc-gh-aling avatar Jan 10 '23 17:01 sfc-gh-aling

@sfc-gh-aling After further investigation I can confirm that the behavior is totally fine in SQLAlchemy ecosystem:

This script will work as expected (using "T"."b %"):

with engine.connect() as con:
    table = sql.table(
        quoted_name('T', True),
        sql.column(quoted_name('b %', True)),
    )

    stmt = sql.select(table.c).limit(5)
    c_stmt = stmt.compile(dialect=SnowflakeDialect(), compile_kwargs={"literal_binds": True})

    res = con.execute(c_stmt)
    for line in res:
        print(line)

even though c_stmt.string produces this:

SELECT "T"."b %%" 
FROM "T"
 LIMIT 5

Sorry for taking your time.

demidov91 avatar Jan 12 '23 14:01 demidov91

no worries @demidov91 , glad you figured it out!

within the underlying connector, we have the logic to dealing with the double percent escaping so that parameters could be correctly bound.

I'm closing this issue now, but for anyone who has questions, please feel free to leave comments or create new issue.

sfc-gh-aling avatar Jan 12 '23 18:01 sfc-gh-aling