sql-formatter icon indicating copy to clipboard operation
sql-formatter copied to clipboard

Formatting does not support `\.` end-of-data delimiter in postgresql COPY statement

Open lediur opened this issue 2 years ago • 3 comments

Describe the bug

  • Create a PostgreSQL dialect .sql file with a COPY statement containing the end-of-data delimiter (\., backslash dot / period) (see documentation)

    copy table (column) from stdin;
    foo
    \.
    

Expected behavior The parser handles the statement appropriately or ignores it without crashing. It should complete parsing and formatting the contents of the file.

Actual behavior The parser crashes with the following error:

Parse error: Unexpected "\." at line 3 column 1

Please report this at [Github issues page.](https://github.com/sql-formatter-org/sql-formatter/issues)

Stack Trace:

createParseError@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:191491
tokenize@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:191277
tokenize@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:192667
parse/R<@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:225103
reset@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:197635
654/T/I.prototype.feed@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:4085
parse@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:225258
parse@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:225453
format@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:224966
rA@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:229754
tA@https://unpkg.com/sql-formatter@latest/dist/sql-formatter.min.js:1:228213
format@https://sql-formatter-org.github.io/sql-formatter/index.js:48:31

Usage

  • How are you calling / using the library?
    • I want to use the library to normalize the formatting of pg_dump output so I can programmatically compare output for differences in a CI pipeline. The pg_dump documentation strongly recommends using COPY FROM instead of INSERT statements due to performance. Since we're also using diffs on the dump as a way to visualize changes to the database schema during code review, we also prefer the COPY FROM syntax since it is somewhat more readable than the INSERT syntax generated by pg_dump.
  • What SQL language(s) does this apply to?
    • postgresql
  • Which SQL Formatter version are you using?
    • 12.2.0

lediur avatar Mar 22 '23 18:03 lediur

Thanks for reporting.

Unfortunately I think there isn't much of a hope at fixing this. The fundamental problem is that sql-formatter doesn't really properly parse the SQL. It only tokenizes it and then looks at some patterns in the tokens. This copy table syntax would be tricky to handle even inside a proper parser - one needs to recognize that one is copying from STDIN and then switch the parsing to the chosen file format.

I have started writing a new formatting library, but it doesn't yet support postgres, and to add such a support I will need to first start implementing a postgres parser, which I haven't yet started... as it's kind of a big undertaking...

nene avatar Mar 22 '23 20:03 nene

Makes sense, thanks for your quick response and for your work on the library!

lediur avatar Mar 22 '23 20:03 lediur

Since 15.1.0 there's a workaround to avoid the formatter crashing for this sort of code:

/* sql-formatter-disable */
copy table (column) from stdin;
foo
\.
/* sql-formatter-enable */

nene avatar Jan 22 '24 12:01 nene