VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE when reading an Iceberg table with a date column containing a value greater than 3000-01-01
Describe the bug
We have an Iceberg table in which one of the columns has the date type. One of the values in that column is 3022-02-01. Running a query against the table results in:
Code: 321. DB::Exception: Received from localhost:9000. DB::Exception: Received from chi-swarm-1-cluster-1-2-0-0.chi-swarm-1-cluster-1-2-0.clickhouse.svc.cluster.local:9000. DB::Exception: Input value 384296 of a column "my_column" is out of allowed Date32 range, which is [-25567, 120530]: (in file/uri <redacted>/data/as_of_date=2022-03-26/00009-197881-2a15b26e-c6d8-48a5-91bb-f010afff782f-0-00001.parquet): While executing ParquetBlockInputFormat: While executing IcebergS3(_table_function.icebergCluster)ReadStep. (VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE)
To Reproduce Steps to reproduce the behavior:
- Create a parquet file with a date value of 3022-02-01
- Run select * from s3(<path to file>)
- See the error
Expected behavior The 3022-02-01 date is shown.
Key information Provide relevant runtime details.
- Project Antalya Build Version: 25.6.5.20363.altinityantalya-alpine
- Cloud provider: AWS
- Kubernetes provider: EKS
- Object storage: AWS S3
ClickHouse maps the Iceberg DATE type to a ClickHouse Date32 column (as the error above shows). ClickHouse date types have bounded ranges: Date supports [1970-01-01, 2149-06-06], and Date32 extends this to [1900-01-01, 2299-12-31]. Values like 3022-02-01 fall outside the allowed range and cannot be represented. https://clickhouse.com/docs/sql-reference/data-types/date
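For reference, the bound check the error is enforcing can be mirrored in a short Python sketch; the numeric limits below are copied from the error message itself, and `fits_date32` is a hypothetical helper, not a ClickHouse API:

```python
import datetime

# Bounds from the error message: the allowed Date32 range is
# [-25567, 120530] days relative to the Unix epoch (1970-01-01).
DATE32_MIN_DAYS = -25567
DATE32_MAX_DAYS = 120530

def fits_date32(d: datetime.date) -> bool:
    """Return True if `d` falls inside the Date32 range from the error."""
    days = (d - datetime.date(1970, 1, 1)).days
    return DATE32_MIN_DAYS <= days <= DATE32_MAX_DAYS

print(fits_date32(datetime.date(2022, 3, 26)))  # partition date: in range
print(fits_date32(datetime.date(3022, 2, 1)))   # out of range: triggers the error
```

Note that dates before 1900-01-01 fail the same check, which is relevant to the pre-1970 data mentioned below.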
Thank you for the pointer. Is there a plan to support arbitrary dates? We also have quite a lot of dates prior to 1970 in our datasets.