Cannot write AggregateFunction(groupBitmap, UInt64) from Spark via JDBC

Open hotstar-xia opened this issue 8 months ago • 1 comment

Description

Hi ClickHouse team, I'm encountering an issue when writing to a column of type AggregateFunction(groupBitmap, UInt64) from Spark using the clickhouse-jdbc and clickhouse-data libraries. This worked in older versions of the driver but no longer does.

Details can be found here: https://clickhousedb.slack.com/archives/CU478UEQZ/p1747041808183739
Detailed logs and code files: https://gist.github.com/hotstar-xia/e90cd64443ac5deadaf69107978295d4

Steps to reproduce

  1. Create the target table (see the CREATE TABLE statement under ClickHouse Server).
  2. Use Spark JDBC to write a row into this table. The uv column is built from a serialized bitmap (a Roaring64NavigableMap wrapped in ClickHouseBitmap).
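
For readers unfamiliar with what "serialized bitmap" means in step 2, here is a minimal sketch of the byte layout that seems to be in play for small sets. It does not use Roaring64NavigableMap or ClickHouseBitmap; `SmallBitmapEncoder` and `encodeSmall` are hypothetical names, and the layout (a flag byte of 0, a one-byte count, then each value as a little-endian UInt64) is an assumption inferred from the byte array in the exception reported in this issue. Larger sets use a different, Roaring-based encoding that this sketch does not cover.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of clickhouse-java.
final class SmallBitmapEncoder {
    // Assumed upper bound for the "small set" layout.
    static final int SMALL_SET_MAX = 32;

    // Encodes distinct values as: flag byte 0, one-byte count, little-endian 64-bit values.
    static byte[] encodeSmall(long... values) {
        if (values.length > SMALL_SET_MAX) {
            throw new IllegalArgumentException("too many values for the small-set layout");
        }
        ByteBuffer buf = ByteBuffer.allocate(2 + 8 * values.length)
                .order(ByteOrder.LITTLE_ENDIAN);
        buf.put((byte) 0);              // flag: small set
        buf.put((byte) values.length);  // count; fits in one byte because it is at most 32
        for (long v : values) {
            buf.putLong(v);             // each value takes 8 bytes, least significant first
        }
        return buf.array();
    }

    public static void main(String[] args) {
        // Prints the same 26 bytes that appear in the exception message.
        System.out.println(Arrays.toString(encodeSmall(2L, 4L, 7L)));
    }
}
```

If this assumption holds, the row being written in the report carried the set {2, 4, 7} in its uv column.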

Error Log or Exception StackTrace

Job aborted due to stage failure: Only singleton array is allowed, but we got: [0, 3, 2, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0]
Caused by: IllegalArgumentException: Only singleton array is allowed, but we got: [0, 3, 2, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0]
    at com.clickhouse.data.ClickHouseValue.update(ClickHouseValue.java:782)
    at com.clickhouse.jdbc.internal.InputBasedPreparedStatement.setBytes(InputBasedPreparedStatement.java:286)
...
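
The 26-byte array in the message looks like the serialized bitmap itself rather than corrupted input, which suggests the payload reached setBytes intact and was rejected while being converted. A minimal sketch that decodes it, assuming the small-set layout (flag byte 0, one-byte count, little-endian UInt64 values); `SmallBitmapDecoder` and `decodeSmall` are hypothetical names, not driver APIs:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of clickhouse-java.
final class SmallBitmapDecoder {
    // Decodes a payload in the assumed small-set layout into its values.
    static long[] decodeSmall(byte[] payload) {
        ByteBuffer buf = ByteBuffer.wrap(payload).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.get() != 0) {
            throw new IllegalArgumentException("not a small-set payload");
        }
        int count = buf.get() & 0xFF;   // one-byte count (assumed to be below 128)
        if (buf.remaining() != count * 8) {
            throw new IllegalArgumentException("payload length does not match count");
        }
        long[] values = new long[count];
        for (int i = 0; i < count; i++) {
            values[i] = buf.getLong();  // UInt64 read into a signed Java long
        }
        return values;
    }

    public static void main(String[] args) {
        // Byte array copied from the exception message above.
        byte[] fromLog = {
            0, 3,
            2, 0, 0, 0, 0, 0, 0, 0,
            4, 0, 0, 0, 0, 0, 0, 0,
            7, 0, 0, 0, 0, 0, 0, 0
        };
        System.out.println(Arrays.toString(decodeSmall(fromLog))); // prints [2, 4, 7]
    }
}
```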

Expected Behaviour

Writing bitmap to the AggregateFunction(groupBitmap, UInt64) column should succeed, as it did in prior driver versions.

Code Example

// df: DataFrame whose columns match label_tmp_table; uv carries the serialized bitmap bytes
df.write
    .format("jdbc")
    .mode("append")
    .option("driver", "com.clickhouse.jdbc.ClickHouseDriver")
    .option("url", "jdbc:clickhouse://ip-10-10-240-59:8123")
    .option("user", "default")
    .option("password", "")
    .option("dbtable", "label_tmp_table")
    .option("batchsize", 10000)
    .option("isolationLevel", "NONE")
    // .option("createTableOptions", "ENGINE = AggregatingMergeTree ORDER BY (label_table, label_name, value_type, value_op)")
    .option("truncate", "true")
    .save()

Configuration

Client Configuration

  • ClickHouse version: 25.4.3.22

Environment

  • clickhouse-jdbc: 0.6.0-patch3
  • clickhouse-data: 0.8.5
  • Spark version: Databricks Runtime 12.2 LTS (Apache Spark 3.3.2, Scala 2.12)
  • Cluster type: Single-node
  • Target column: AggregateFunction(groupBitmap, UInt64)

ClickHouse Server

  • ClickHouse version: 25.4.3.22

  • CREATE TABLE statements for tables involved:

CREATE TABLE default.label_tmp_table
(
    `label_table` String,
    `label_name` String,
    `value_type` String,
    `value_float64` Float64,
    `value_int64` UInt64,
    `value_string` String,
    `value_datetime` DateTime,
    `value_boolean` Bool,
    `value_op` String,
    `uv` AggregateFunction(groupBitmap, UInt64)
)
ENGINE = AggregatingMergeTree
ORDER BY (label_table, label_name, value_type, value_op)

hotstar-xia, May 15 '25 09:05

depends on https://github.com/ClickHouse/clickhouse-java/issues/2599

mshustov, Sep 27 '25 12:09