Can you support CSV format as input?
Hi. I want to run full-text search on data that I insert into ClickHouse, and the source files I insert are in CSV format. I chose CSV because CSV files take up less space than JSON files, insert faster, and are faster to generate. Converting the CSV files to JSON is too slow.
I would like Quickwit to support an ingest method like
cat *.csv | ./quickwit index ingest --input-format csv --index gh-archive
OR
cat *.csv | ./quickwit index ingest --index gh-archive
Looking forward to your reply, thank you
Hey, I'm also looking for CSV support. It would be awesome to allow a custom separator.
You can use VRL to feed Quickwit with CSV and transform it:
# Your source config here
# ...
input_format: plain_text
transform:
  script: |
    # csv looks like: "123;abc;def"
    parsed_csv = parse_csv!(.plain_text, ";")
    .my_field1 = to_int!(parsed_csv[0])
    .my_field2 = parsed_csv[1]
    .my_field3 = parsed_csv[2]
    .original_csv = .plain_text
    del(.plain_text)
Currently, this can be made to work with the ingest API, but it isn't very user-friendly. The ingest API is a data source that is created automatically, and there is no provided way to edit it. Manually modifying the config stored in the metastore does work.
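If editing the metastore config is not an option, the same mapping can be applied before the data reaches Quickwit. The following is only a rough sketch, not something Quickwit provides: a small Python filter that mirrors the VRL script above (same `;` separator and the same illustrative field names `my_field1`..`my_field3` and `original_csv`) and emits one JSON document per line. The function name `csv_line_to_doc` and the script name used below are hypothetical.

```python
import csv
import json
import sys


def csv_line_to_doc(line: str, delimiter: str = ";") -> dict:
    """Split one delimited line into the fields used in the VRL example."""
    # csv.reader handles quoting; we feed it a single line at a time.
    row = next(csv.reader([line], delimiter=delimiter))
    return {
        "my_field1": int(row[0]),   # counterpart of to_int! in the VRL script
        "my_field2": row[1],
        "my_field3": row[2],
        "original_csv": line,       # keep the raw line, as the VRL script does
    }


if __name__ == "__main__":
    # Stream stdin to stdout so large inputs are never held in memory.
    for raw in sys.stdin:
        line = raw.rstrip("\r\n")
        if line:
            print(json.dumps(csv_line_to_doc(line)))
```

Assuming the script is saved as csv_to_ndjson.py (a made-up name), it could sit in the pipeline from the original request: cat *.csv | python csv_to_ndjson.py | ./quickwit index ingest --index gh-archive. This still pays the CSV-to-JSON conversion cost mentioned in the first post, so it is a workaround rather than a replacement for native CSV input.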