Kafka Connect: Add table to topics mapping property
The property which allows mapping Kafka topics to Iceberg tables:
An example config would look like:
iceberg.tables.topic-to-table-mapping=some_topic0:table_name0,some_topic1:table_name1
Similar approach implemented in SnowflakeSink, ClickhouseSink, Aiven JdbcSink. Probably I can call it a standard way of static routing data in sink connectors at the moment.
The reason I stated thinking of implementing this because it isn't obvious from the config (or readme) how one should map the topics to tables in the original version because there is no clear indication of where the .route-regex is applied.
This could be done for .partition-by and .id-columns configs as well, but using tables as map keys is such case.
I am making the same PR as the one in the https://github.com/tabular-io/iceberg-kafka-connect/pull/223 because I was told that it is being moved to this core repository. It seems that the code hasn't been fully migrated to this core repository yet and I am aware of there should be further tasks such as adding the rest of functionality from the above PR into IcebergSinkTask (as I see it), but would like to share the idea and get initial feedback. Thanks!
I believe when we add https://github.com/apache/iceberg/pull/11313 you should be able to accomplish mapping topics to tables. Also I think this PR isn't complete, the new config isn't being used.
I believe when we add https://github.com/apache/iceberg/pull/11313 you should be able to accomplish mapping topics to tables. Also I think this PR isn't complete, the new config isn't being used.
I believe they aren't related, since that PR covers dynamic routing
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.
@bryanck ping! 😄
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.
Ping
This solution seems best to me; most explicit and least restrictive. I'll be using this in my fork. thanks @igorvoltaic. I also agree this isn't related to #11313 which is for dynamic routing.
Thanks @yornstei, good to know you found this useful. I was wondering if you had an opinion on https://github.com/apache/iceberg/pull/11623?
Thanks @yornstei, good to know you found this useful. I was wondering if you had an opinion on #11623?
I took a look at that one too. From a configurable perspective, it seems better and the pattern aligns more with the connector's other configs of specifying the route-regex on each table. From a code perspective, there's quite a bit more going on there which I'm hesitant to merge/maintain in my fork.
Thanks, I'll take a deeper look next week.
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.
Ping
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.
This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.