
Storing query results as JSON does not output valid JSON

Open martynassateika opened this issue 7 years ago • 2 comments

When storing query results as CSV, DQL prints the column names on the first row and the entries on subsequent rows, so the file can be opened directly in any CSV editor.

With JSON, though, the output cannot be parsed as JSON. To make it valid I currently have to add a comma after each line of the output (except the last), prepend a `[`, and append a `]`.

The result of a SELECT/SCAN is a collection of rows, so it would be better to wrap the output in an array:

Meaning the output of `SCAN foo, bar FROM table SAVE out.json` could be similar to:

[
  { "foo": 1, "bar": 2 },
  { "foo": 3, "bar": 4 }
]

instead of

{ "foo": 1, "bar": 2}
{ "foo": 3, "bar": 4}
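For what it's worth, the current line-delimited output can already be converted to a valid JSON array programmatically rather than by hand-editing. A minimal Python sketch (the `lines_to_array` helper is my own illustration, not part of dql):

```python
import json

def lines_to_array(text: str) -> list:
    """Parse line-delimited JSON (one object per line) into a list,
    skipping any blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

# The current dql-style output:
nd = '{ "foo": 1, "bar": 2}\n{ "foo": 3, "bar": 4}\n'
rows = lines_to_array(nd)
print(json.dumps(rows))  # serializes as a single valid JSON array
```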

I could help implement this if you think it's worthwhile @stevearc. It would be a breaking change, though, unless we allow the behavior to be toggled with a flag.

Let me know what you think!

martynassateika avatar Mar 01 '19 21:03 martynassateika

The reason for the current JSON format is that having a single JSON object per line makes streaming data much easier. With a full JSON parser/serializer, all of the data has to be loaded into memory before any of it can be written out. Logging software frequently uses this format for the same reason.

So I don't want to get rid of the current format, but if you'd like to add a flag, or perhaps a config option, that enables "strict" JSON input/output, I'd review the PR! I would recommend making the load operation detect and handle either format.

stevearc avatar Mar 05 '19 06:03 stevearc

Thanks for the explanation, Steven. Makes sense; I figured it might be related to line-by-line processing somehow :-)

OK, I'll set up the test environment and give this a go.

martynassateika avatar Mar 05 '19 10:03 martynassateika