User-defined functions and table-valued functions have an extra space
It appears there is an internal list of known SQLite functions and virtual tables used to format. I rely heavily on user-defined functions and table-valued functions. These custom functions have an extra space before the opening parenthesis.
Input data
SELECT
foo('unknown'),
substr('known'),
bar('unknown')
FROM
foo_bar(foo('unknown')),
json_each()
Expected Output
SELECT
foo('unknown'),
substr('known'),
bar('unknown')
FROM
foo_bar(foo('unknown')),
json_each()
Actual Output
SELECT
foo ('unknown'), -- Extra space
substr('known'),
bar ('unknown') -- Extra space
FROM
foo_bar (foo ('unknown')), -- Two extra spaces
json_each()
Usage
=> https://sql-formatter-org.github.io/sql-formatter/ with SQLite language
I don't think any of the other formatting options matter in this case. The formatter should not need to keep a list of SQLite functions/tables but infer the formatting from the token (I have no idea how this library is actually implemented).
Yes, this is a known issue. We're planning to address this, but it's not a simple fix to make. Hard to say how long it will take.
The formatter relies on various heuristics to do it's work. We're slowly evolving towards properly parsing the SQL language, but we're still far from that.
Thanks for the quick reply. I'm sure this is a tricky problem and you likely don't want to ship like 58MB of grammars (and if so, allow tree shaking and only importing the grammar I need) https://www.npmjs.com/package/antlr4ts-sql

For my use case I wonder if I could interface to the SQLite C API to tokenize the SQL, but it looks there is no high level API available.
Hey,
Maybe any update on this one behavior? Or maybe can you suggest any workarounds?
For SQLite I can suggest a new SQL formatting library that I've been developing: prettier-plugin-sql-cst
Edit: The problem I described below is resolved in [email protected] and later with the postgresql language/dialect as mentioned by @nwalters512
Yeah this is also happening with PostgreSQL, the now() function has an extra space between the function name and the parentheses:
Input:
SELECT
users.id,
users.username
FROM
users
INNER JOIN sessions ON (
sessions.token = 'abc123'
AND users.id = sessions.user_id
AND sessions.expiry_timestamp > now()
)
Expected Output (no change):
SELECT
users.id,
users.username
FROM
users
INNER JOIN sessions ON (
sessions.token = 'abc123'
AND users.id = sessions.user_id
AND sessions.expiry_timestamp > now()
)
Actual Output (extra space between now and ()):
SELECT
users.id,
users.username
FROM
users
INNER JOIN sessions ON (
sessions.token = 'abc123'
AND users.id = sessions.user_id
AND sessions.expiry_timestamp > now ()
)
@karlhorky did you set language: 'postgresql' in your config? NOW might not necessary be recognized as a function if you haven't specified the language. For reference, the NOW function is handled here:
https://github.com/sql-formatter-org/sql-formatter/blob/51547457a8488092b2e81b536be34dfacbcf5d81/src/languages/postgresql/postgresql.functions.ts#L188
I believe I do have it configured, yes:
https://github.com/upleveled/eslint-config-upleveled/blob/0f994bc92e1c5731d830a6f738b21b0f62c4fdef/templates/next-js-postgresql/prettier.config.mjs#L15
Encountering the same issue, can we consider exhaustively listing scenarios where spaces are needed before parentheses for handling? In all other cases, remove spaces in front of parentheses.
For example, keywords like:
where
when
and
join
as
+
-
*
/
(
Also seeing the same issue with postgresql:
Input
create or replace function find_array_element(el anyelement, arr anyarray) returns integer as $$ declare i int;
begin
for i in 1..array_upper( arr, 1 ) loop
if( el = arr[i] ) then
return i;
end if;
end loop;
return 0;
end;
$$ language plpgsql;
output
create
or replace function find_array_element (el anyelement, arr anyarray) returns integer as $$
declare
i int;
begin
for i in 1..array_upper( arr, 1 ) loop
if( el = arr[i] ) then
return i;
end if;
end loop;
return 0;
end;
$$ language plpgsql;
expected
create
or replace function find_array_element(el anyelement, arr anyarray) returns integer as $$
declare
i int;
begin
for i in 1..array_upper( arr, 1 ) loop
if( el = arr[i] ) then
return i;
end if;
end loop;
return 0;
end;
$$ language plpgsql;
prettier config:
{
"trailingComma": "es5",
"proseWrap": "always",
"plugins": [
"prettier-plugin-sql"
],
"language": "postgresql",
"paramTypes": "{'named':[':']}"
}