RClickhouse

External Data support

Open kdkavanagh opened this issue 5 years ago • 3 comments

Apologies if I missed this somewhere in the docs... Are there any plans to support Clickhouse's External Data API, perhaps accepting a data.frame as input to the query?

kdkavanagh • Aug 27 '20 17:08

Hello!

Thanks for the issue. Currently there are no concrete plans for the very near future. I would love to learn more about your use case. May I ask what advantages such an implementation would have for you over the following input method, which is currently possible in RClickhouse?

library(RClickhouse)
library(DBI)

con <- dbConnect(clickhouse(), port=9000)

dataFrame <- data.frame(
  "Col1"=c("b","b"),
  "Col2"=1:2
)
dbWriteTable(con, "dataFrameTable", dataFrame)

Yours, Tridelt

tridelt • Aug 27 '20 19:08

The workaround you suggest would work, though it pushes the management of that ephemeral table onto the user, which I suspect would be prone to mistakes.
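To make the table-management burden concrete, here is a minimal sketch of what a user would have to wrap around dbWriteTable today. The helper name with_scratch_table, the tmp_ids_ prefix, and the events table in the usage comment are all hypothetical and not part of RClickhouse; the write_table and remove_table arguments exist only so the helper can be exercised without a server.

```r
# Hypothetical helper (not part of RClickhouse): write `df` to a uniquely
# named table, run `fn(table_name)`, and drop the table again on exit,
# even if `fn` raises an error.
with_scratch_table <- function(con, df, fn,
                               write_table = DBI::dbWriteTable,
                               remove_table = DBI::dbRemoveTable) {
  # Timestamp plus random suffix to reduce the chance of name collisions.
  name <- paste0("tmp_ids_", format(Sys.time(), "%Y%m%d%H%M%S"),
                 "_", sample.int(1e6, 1))
  write_table(con, name, df)
  # Registered only after the write succeeded, so there is a table to drop.
  on.exit(remove_table(con, name), add = TRUE)
  fn(name)
}

# Usage sketch, assuming a reachable server and a table named `events`:
# con <- DBI::dbConnect(RClickhouse::clickhouse(), port = 9000)
# ids <- data.frame(id = c(101L, 102L, 103L))
# result <- with_scratch_table(con, ids, function(tbl) {
#   DBI::dbGetQuery(con, paste0(
#     "SELECT e.* FROM events AS e INNER JOIN ", tbl, " AS t ON e.id = t.id"))
# })
```

Even in this small form, the user has to think about naming, cleanup, and error handling, which is the part that external data support would take care of.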

My main use case is that I often have a set of identifiers in R that I want to join against data living in a ClickHouse table. Right now, I would need to either convert those identifiers to a (very long) WHERE id IN ({x}) string for the query, or pull all the data from ClickHouse into R and do the join/lookup/merge there, which is likely to exceed reasonable memory limits for large ClickHouse tables.
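For illustration, the first option might look roughly like the sketch below. The function build_in_query and the table and column names are hypothetical, and the helper only accepts whole-number identifiers so that no quoting or escaping is needed.

```r
# Hypothetical helper: build "SELECT * FROM <table> WHERE <id_col> IN (...)"
# for whole-number identifiers only.
build_in_query <- function(table, id_col, ids) {
  # Rejecting anything but whole numbers sidesteps quoting and escaping.
  stopifnot(is.numeric(ids), length(ids) > 0, !anyNA(ids),
            all(ids == round(ids)))
  # scientific = FALSE keeps large ids from being rendered as e.g. 1e+10.
  id_list <- paste(format(ids, scientific = FALSE, trim = TRUE),
                   collapse = ", ")
  sprintf("SELECT * FROM %s WHERE %s IN (%s)", table, id_col, id_list)
}

query <- build_in_query("events", "id", c(101, 102, 103))
print(query)
# [1] "SELECT * FROM events WHERE id IN (101, 102, 103)"
```

The query string grows linearly with the number of identifiers, which is why this approach gets unwieldy for large id sets.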

One of the big python Clickhouse drivers has implemented support for the external data API: https://clickhouse-driver.readthedocs.io/en/latest/features.html#external-data-for-query-processing

kdkavanagh • Sep 11 '20 19:09

Hi! Thanks for this nice suggestion! The external data API does seem really interesting, and there are surely plenty of use cases. However, this package is basically a wrapper around the official ClickHouse C++ client plus some dplyr gimmicks. As far as I know, the C++ client does not support this feature yet, so it would have to be added there first. We'll discuss it internally over the next few days and reach out to the C++ client fellas. Please don't expect it to happen within the next few weeks, but we'll keep this thread open and use it for updates.

inkrement • Sep 12 '20 12:09