[Feature][core] Implement more type conversions for IcebergConversions
Search before asking
- [X] I searched in the issues and found nothing similar.
Motivation
In #3731 we've introduced IcebergCommitCallback to create Iceberg metadata after commit. We convert Paimon objects to Iceberg byte buffers in IcebergConversions, however it currently only supports a few types and does not support types like timestamp and time.
Solution
Support timestamp and time conversions in IcebergConversions.
Anything else?
No response
Are you willing to submit a PR?
- [ ] I'm willing to submit a PR!
I'm willing to submit a PR!
Do this according to the following specifications? time | Stores microseconds from midnight in an 8-byte little-endian long timestamp | Stores microseconds from 1970-01-01 00:00:00.000000 in an 8-byte little-endian long
then,like this:
case TIME: TIMESTAMP: return ByteBuffer.allocate(8) .order(ByteOrder.LITTLE_ENDIAN) .putLong(0, (long) value);
I'm willing to submit a PR!
Thanks for your interest. Please note that there are two types of timestamp in Paimon: TimestampType and LocalZonedTimestampType. TimestampType is timestamptz in Iceberg, while LocalZonedTimestampType is timestamp in Iceberg. You also need to be careful with the precesion of timestamp, because Iceberg has timestamp and timestamp_ns.
All in all you need to write tests carefully, especially for timestamp.
I want to be simple Whether there are similar code references? @tsreaper
I don't know that Object value is string ? or Timestamp ?
mabye Timestamp ? then change to buffer?
The Paimon object for TimestampType and LocalZonedTimestampType is org.apache.paimon.data.Timestamp. It stores millisecond and nanoOfMillisecond in the object.
For TimestampType, this millisecond is the time starting from 1970-01-01 00:00:00 UTC, so it is timestamptz in Iceberg. For LocalZonedTimestampType, this millisecond is the time starting from 1970-01-01 00:00:00 local timezone, so it is timestamp in Iceberg.
Also you need to check if precision <= 6, if yes it is timestamp or timestamptz; Otherwise if precision > 6 its timestamp_ns or timestamptz_ns.
Don't forget to add complete tests for each timestamp type and different precisions.
It seems a bit complicated
icerberg does not have local-timestamp-micros ,but paimon has local-timestamp-micros,
Maybe I need to learn more about paimon
Just give me a little time. I have to fix it
"local-timestamp-micros" is not a data type in Paimon. It is a data type in avro. Iceberg's avro reader has to deal with it. However for this issue, you only need to convert between Paimon types and Iceberg types, so you don't need to consider about avro types.
I have completed it, but nanoseconds cannot be tested, because the avro format in the test case cannot support precisions greater than 6
I don't know if my code idea is correct. Please review the code
@tsreaper Can I submit PR?
@tsreaper Can I submit PR?
Sure, feel free to submit.
@tsreaper Please help to review my pr, thank you very much
@tsreaper next step? With that done, I'm ready to continue contributing