Refactor ClickHouseStatement interface for data insertion
Goal: reduce the number of public proprietary methods exposed by ru.yandex.clickhouse.ClickHouseStatement.
Current situation: the interface exposes many data-manipulation methods that differ by:
- Input format
- Additional configuration params
Proposal: create a single method that returns a builder for data manipulation. The design is inspired by Spark's DataFrameWriter/DataFrameReader: https://spark.apache.org/docs/2.3.0/api/java/org/apache/spark/sql/DataFrameWriter.html
Example showing all the options available for configuration:
ClickHouseStatement sth;
sth
.write()
.withDbParams(Map<ClickHouseQueryParam, String> dbParams) // optional
.withExternalData(List<ClickHouseExternalData> data) // optional
.format(ClickHouseFormat.CSV)
.input(new FileInputStream("filename"))
.table("my_table")
// or specify SQL
// .sql("INSERT INTO my_table (X,Y,Z) VALUES")
.send(); // terminal operation, performs data insertion
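To make the table-versus-SQL choice concrete, here is a minimal sketch of how such a builder could resolve the statement it sends. Everything in it is hypothetical: `WriterSketch`, `SketchFormat` and `resolveSql()` are stand-ins invented for illustration, not driver classes, and appending `FORMAT` only when the SQL does not already name one is an assumption about the desired behavior.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Locale;

// Stand-in for ClickHouseFormat; only the constants used here.
enum SketchFormat { CSV, CSVWithNames, RowBinary }

// Hypothetical builder that sth.write() could return; not the driver's real class.
final class WriterSketch {
    private SketchFormat format;
    private InputStream input;
    private String table;
    private String sql;

    WriterSketch format(SketchFormat format) { this.format = format; return this; }
    WriterSketch input(InputStream input) { this.input = input; return this; }

    // table() and sql() are alternatives, so setting one clears the other.
    WriterSketch table(String table) { this.table = table; this.sql = null; return this; }
    WriterSketch sql(String sql) { this.sql = sql; this.table = null; return this; }

    // The statement that send() would submit along with the input stream.
    String resolveSql() {
        if (input == null) {
            throw new IllegalStateException("input(...) is required");
        }
        if (table == null && sql == null) {
            throw new IllegalStateException("either table(...) or sql(...) is required");
        }
        String statement = (table != null) ? "INSERT INTO " + table : sql.trim();
        boolean namesFormat = statement.toUpperCase(Locale.ROOT).contains(" FORMAT ");
        // Assumption: append FORMAT only if one is configured and the SQL names none.
        return (namesFormat || format == null)
                ? statement
                : statement + " FORMAT " + format.name();
    }
}

final class WriterSketchDemo {
    public static void main(String[] args) {
        InputStream data = new ByteArrayInputStream("1,2,3\n".getBytes());
        System.out.println(new WriterSketch()
                .format(SketchFormat.CSV)
                .input(data)
                .table("my_table")
                .resolveSql());
    }
}
```

Validating in the terminal operation rather than in each setter keeps the setters order-independent, which is the main convenience of the builder style.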
For binary formats, which require a callback:
sth.write().send("INSERT INTO my_table (x) VALUES ", new ClickHouseStreamCallback() {
@Override
public void writeTo(ClickHouseRowBinaryStream stream) throws IOException {
}
}, ClickHouseFormat.RowBinary);
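The callback body is left empty above. As a rough illustration of what it might do, the sketch below writes three Int32 values the way RowBinary encodes them (4 bytes each, little-endian). `RowStreamSketch` and `StreamCallbackSketch` are hypothetical stand-ins for ClickHouseRowBinaryStream and ClickHouseStreamCallback so that the example runs without the driver; the lambda form assumes the callback interface has a single abstract method.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Stand-in for ClickHouseRowBinaryStream; only an Int32 writer is sketched.
final class RowStreamSketch {
    private final OutputStream out;

    RowStreamSketch(OutputStream out) { this.out = out; }

    // RowBinary stores Int32 as 4 bytes, least significant byte first.
    void writeInt32(int value) throws IOException {
        out.write(value & 0xFF);
        out.write((value >>> 8) & 0xFF);
        out.write((value >>> 16) & 0xFF);
        out.write((value >>> 24) & 0xFF);
    }
}

// Stand-in for ClickHouseStreamCallback.
@FunctionalInterface
interface StreamCallbackSketch {
    void writeTo(RowStreamSketch stream) throws IOException;
}

final class CallbackSketchDemo {
    // Runs the callback against an in-memory buffer instead of a server connection.
    static byte[] render(StreamCallbackSketch callback) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            callback.writeTo(new RowStreamSketch(buffer));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // One Int32 column "x", three rows: 1, 2, 3.
        byte[] rows = render(stream -> {
            for (int x = 1; x <= 3; x++) {
                stream.writeInt32(x);
            }
        });
        System.out.println(rows.length + " bytes written");
    }
}
```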
Possible shortcuts for sending the data:
sth
.write()
.send("INSERT INTO my_table VALUES", InputStream stream, ClickHouseFormat.CSV);
sth
.write()
.sendToTable("my_table", InputStream stream, ClickHouseFormat.CSV);
@den-crane, @filimonov, you are an inexhaustible source of ideas; please share your opinions.
I like this:
.write()
.send("INSERT INTO my_table VALUES", InputStream stream, ClickHouseFormat.CSV);
It should also support the equivalent of --format_csv_delimiter=";" --query="INSERT INTO test_table_log FORMAT CSVWithNames", for example:
.write()
.withDbParams((new dbParams).add(format_csv_delimiter,";" ).add () )
.input(new FileInputStream("filename"))
.sql("INSERT INTO my_table (X,Y,Z) FORMAT CSVWithNames")
.send();
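For reference, ClickHouse's HTTP interface accepts settings such as format_csv_delimiter as URL parameters next to `query`, so one way the builder could honor `withDbParams` is to append them to the request URL. The sketch below only builds that query string; `QueryStringSketch` and `build` are hypothetical names, whether the driver would forward the parameters this way is an assumption, and `URLEncoder.encode(String, Charset)` needs Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper; not part of the driver.
final class QueryStringSketch {
    // Encodes each setting as key=value and adds the SQL as the "query" parameter.
    static String build(Map<String, String> settings, String sql) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> entry : settings.entrySet()) {
            joiner.add(encode(entry.getKey()) + "=" + encode(entry.getValue()));
        }
        joiner.add("query=" + encode(sql));
        return joiner.toString();
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the settings in insertion order.
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("format_csv_delimiter", ";");
        System.out.println(build(settings,
                "INSERT INTO test_table_log FORMAT CSVWithNames"));
    }
}
```

The file contents would then travel as the request body, which matches how the `input(...)` stream is used in the examples above.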