Formatting does not support `\.` end-of-data delimiter in postgresql COPY statement
Describe the bug
-
Create a PostgreSQL dialect .sql file with a COPY statement containing the end-of-data delimiter (
\., backslash dot / period) (see documentation)copy table (column) from stdin; foo \.
Expected behavior The parser handles the statement appropriately or ignores it without crashing. It should complete parsing and formatting the contents of the file.
Actual behavior The parser crashes with the following error:
Parse error: Unexpected "\." at line 3 column 1
Please report this at [Github issues page.](https://github.com/sql-formatter-org/sql-formatter/issues)
Stack Trace:
createParseError@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:191491
tokenize@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:191277
tokenize@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:192667
parse/R<@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:225103
reset@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:197635
654/T/I.prototype.feed@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:4085
parse@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:225258
parse@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:225453
format@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:224966
rA@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:229754
tA@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:228213
format@https://sql-formatter-org.github.io/sql-formatter/index.js:48:31
Usage
- How are you calling / using the library?
- I want to use the library to normalize the formatting of pg_dump output so I can programmatically compare output for differences in a CI pipeline. The pg_dump documentation strongly recommends using
COPY FROMinstead ofINSERTstatements due to performance. Since we're also using diffs on the dump as a way to visualize changes to the database schema during code review, we also prefer theCOPY FROMsyntax since it is somewhat more readable than theINSERTsyntax generated by pg_dump.
- I want to use the library to normalize the formatting of pg_dump output so I can programmatically compare output for differences in a CI pipeline. The pg_dump documentation strongly recommends using
- What SQL language(s) does this apply to?
- postgresql
- Which SQL Formatter version are you using?
- 12.2.0
Thanks for reporting.
Unfortunately I think there isn't much of a hope at fixing this. The fundamental problem is that sql-formatter doesn't really properly parse the SQL. It only tokenizes it and then looks at some patterns in the tokens. This copy table syntax would be tricky to handle even inside a proper parser - one needs to recognize that one is copying from STDIN and then switch the parsing to the chosen file format.
I have started writing a new formatting library, but it doesn't yet support postgres, and to add such a support I will need to first start implementing a postgres parser, which I haven't yet started... as it's kind of a big undertaking...
Makes sense, thanks for your quick response and for your work on the library!
Since 15.1.0 there's a workaround to avoid the formatter crashing for this sort of code:
/* sql-formatter-disable */
copy table (column) from stdin;
foo
\.
/* sql-formatter-enable */