SNOW-353837: Percent signs in query results doubled
Please answer these questions before submitting your issue. Thanks!
-
What version of Python are you using (
python --version)? -- 3.8 -
What operating system and processor architecture are you using (
python -c 'import platform; print(platform.platform())')? --Linux-5.8.0-50-generic-x86_64-with-glibc2.29 -
What are the component versions in the environment (
pip freeze)?
asn1crypto==1.4.0
attrs==20.3.0
Authlib==0.15.3
azure-common==1.1.25
azure-core==1.8.2
azure-storage-blob==12.5.0
boto3==1.15.18
botocore==1.18.18
certifi==2020.6.20
cffi==1.14.3
chardet==3.0.4
click==7.1.2
coverage==5.5
cryptography==2.9.2
fastapi==0.63.0
greenlet==1.0.0
h11==0.12.0
idna==2.10
iniconfig==1.1.1
isodate==0.6.0
jmespath==0.10.0
msrest==0.6.19
numpy==1.20.2
oauthlib==3.1.0
oscrypto==1.2.1
packaging==20.9
pandas==1.1.5
pkg-resources==0.0.0
pluggy==0.13.1
py==1.10.0
pyarrow==3.0.0
pycparser==2.20
pycryptodomex==3.9.8
pydantic==1.8.1
PyJWT==1.7.1
pyOpenSSL==19.1.0
pyparsing==2.4.7
pytest==6.2.3
python-dateutil==2.8.1
pytz==2020.1
requests==2.25.1
requests-oauthlib==1.3.0
s3transfer==0.3.3
six==1.15.0
snowflake-connector-python==2.4.2
snowflake-sqlalchemy==1.2.4
SQLAlchemy==1.4.7
starlette==0.13.6
toml==0.10.2
typing-extensions==3.7.4.3
urllib3==1.25.11
uvicorn==0.13.4
-
What did you do? Use SQLAlchemy ORM
selectto retrieve an entity like this:session.execute(select(MyClass))The entity has some string fields that, among others, contain percent characters. -
What did you expect to see? The exact string values the way they are stored in the db
-
What did you see instead? Every percent sign doubled. This is probably due to the class
SQLAlchemyIdentifierPreparernot overriding the method_escape_identifierin its base classIdentifierPreparerand therefore doubling percent signs.
@tobias-feil-by We will take a look at this.
Just bumped into this issue, a fix would be appreciated
hey guys, thanks for your patience -- I'd like to give an update on the issue:
I'm able to repro the issue and this should be an issue in the snowflake-sqlalchemy. we had a PR out for better sqlalchemy compliance which should address this issue: https://github.com/snowflakedb/snowflake-sqlalchemy/pull/314
I will let you know once the new version of snowflake-sqlalchemy is released.
@sfc-gh-aling The issue still exists in snowflake-sqlalchemy==1.4.5
hey @demidov91 , thanks for reaching out, do you mind sharing your test script with me so that I could reproduce the debug?
@sfc-gh-aling Thank you for a really quick response. After some more investigation I would say it may be not a bug. Here is what I mean:
t = sql.table('T', sql.column('A'), sql.column('B %'))
compiled = t.select().where(t.c['A'] > 50).compile(dialect=SnowflakeDialect())
print(compiled.string)
Output:
SELECT "T"."A", "T"."B %%"
FROM "T"
WHERE "T"."A" > %(A_1)s
My initial concern was about this part: B %%. compile(compile_kwargs={"literal_binds": True}) produces wrong SQL but literal_binds is not smth to use in prod, anyway:
SELECT "T"."A", "T"."B %%"
FROM "T"
WHERE "T"."A" > 50
The issue can be eliminated by SnowflakeDialect(paramstyle='named'):
SELECT "T"."A", "T"."B %"
FROM "T"
WHERE "T"."A" > :A_1
thanks @demidov91 for sharing the scripts.
as far as I'm concerned, the double percents behavior is controlled here in sqlalchemy's sql compiler: https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/sql/compiler.py#L6615-L6617.
self._double_percents = self.dialect.paramstyle in (
"format",
"pyformat",
)
and then within the same file, it defines how to escape identifier: https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/sql/compiler.py#L6661-L6671
def _escape_identifier(self, value: str) -> str:
"""Escape an identifier.
Subclasses should override this to provide database-dependent
escaping behavior.
"""
value = value.replace(self.escape_quote, self.escape_to_quote)
if self._double_percents:
value = value.replace("%", "%%")
return value
I didn't yet see any code in the snowflake-sqlalchemy changing the compiler behavior around escaping identifier, so I'm wondering if this is the default behavior of the sqlalchemy compiler in the pyformat which is the default format of snowflake-sqlalchemy.
@sfc-gh-aling After further investigation I can confirm that the behavior is totally fine in SQLAlchemy ecosystem:
This script will work as expected (using "T"."b %"):
with engine.connect() as con:
table = sql.table(
quoted_name('T', True),
sql.column(quoted_name('b %', True)),
)
stmt = sql.select(table.c).limit(5)
c_stmt = stmt.compile(dialect=SnowflakeDialect(), compile_kwargs={"literal_binds": True})
res = con.execute(c_stmt)
for line in res:
print(line)
even though c_stmt.string produces this:
SELECT "T"."b %%"
FROM "T"
LIMIT 5
Sorry for taking your time.
no worries @demidov91 , glad you figured it out!
within the underlying connector, we have the logic to dealing with the double percent escaping so that parameters could be correctly bound.
I'm closing this issue now, but for anyone who has questions, please feel free to leave comments or create new issue.