iceberg icon indicating copy to clipboard operation
iceberg copied to clipboard

Kafka Connect: Add table to topics mapping property

Open igorvoltaic opened this issue 1 year ago • 2 comments

The property which allows mapping Kafka topics to Iceberg tables:

An example config would look like: iceberg.tables.topic-to-table-mapping=some_topic0:table_name0,some_topic1:table_name1

Similar approach implemented in SnowflakeSink, ClickhouseSink, Aiven JdbcSink. Probably I can call it a standard way of static routing data in sink connectors at the moment. The reason I stated thinking of implementing this because it isn't obvious from the config (or readme) how one should map the topics to tables in the original version because there is no clear indication of where the .route-regex is applied.

This could be done for .partition-by and .id-columns configs as well, but using tables as map keys is such case.

I am making the same PR as the one in the https://github.com/tabular-io/iceberg-kafka-connect/pull/223 because I was told that it is being moved to this core repository. It seems that the code hasn't been fully migrated to this core repository yet and I am aware of there should be further tasks such as adding the rest of functionality from the above PR into IcebergSinkTask (as I see it), but would like to share the idea and get initial feedback. Thanks!

igorvoltaic avatar Jun 02 '24 08:06 igorvoltaic

I believe when we add https://github.com/apache/iceberg/pull/11313 you should be able to accomplish mapping topics to tables. Also I think this PR isn't complete, the new config isn't being used.

bryanck avatar Oct 16 '24 18:10 bryanck

I believe when we add https://github.com/apache/iceberg/pull/11313 you should be able to accomplish mapping topics to tables. Also I think this PR isn't complete, the new config isn't being used.

I believe they aren't related, since that PR covers dynamic routing

igorvoltaic avatar Oct 20 '24 16:10 igorvoltaic

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Nov 26 '24 00:11 github-actions[bot]

@bryanck ping! 😄

igorvoltaic avatar Nov 26 '24 03:11 igorvoltaic

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Dec 28 '24 00:12 github-actions[bot]

Ping

igorvoltaic avatar Dec 28 '24 05:12 igorvoltaic

This solution seems best to me; most explicit and least restrictive. I'll be using this in my fork. thanks @igorvoltaic. I also agree this isn't related to #11313 which is for dynamic routing.

yornstei avatar Jan 07 '25 04:01 yornstei

Thanks @yornstei, good to know you found this useful. I was wondering if you had an opinion on https://github.com/apache/iceberg/pull/11623?

bryanck avatar Jan 07 '25 04:01 bryanck

Thanks @yornstei, good to know you found this useful. I was wondering if you had an opinion on #11623?

I took a look at that one too. From a configurable perspective, it seems better and the pattern aligns more with the connector's other configs of specifying the route-regex on each table. From a code perspective, there's quite a bit more going on there which I'm hesitant to merge/maintain in my fork.

yornstei avatar Jan 07 '25 04:01 yornstei

Thanks, I'll take a deeper look next week.

bryanck avatar Jan 07 '25 15:01 bryanck

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Feb 07 '25 00:02 github-actions[bot]

Ping

igorvoltaic avatar Feb 09 '25 05:02 igorvoltaic

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Mar 12 '25 00:03 github-actions[bot]

This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

github-actions[bot] avatar Mar 19 '25 00:03 github-actions[bot]