sling-cli

By default, a stream's `primary_key` should be used for `target_options.table_keys.primary`

Open martinburch opened this issue 9 months ago • 3 comments

It is redundant to specify the same key twice:

streams:
  schemaName.tableName:
    primary_key: Id
    target_options:
      table_keys:
        primary: [Id]

However, it seems that when the table key is not specified, the table is created as a heap, with no primary key defined.

martinburch avatar Apr 29 '25 14:04 martinburch

Hey @martinburch, yes, this was debated when implementing `table_keys`. Sling was originally designed to use the `primary_key` input only to merge/upsert in incremental mode, without creating a primary key in the target table. Specifying it in `table_keys` came later, as a way to explicitly instruct Sling to create a primary key in the DDL.

Will revisit. I remember facing an issue in some database when merging into a table with a composite primary key in the DDL, so at the time I opted not to change the behavior.

flarco avatar Apr 29 '25 14:04 flarco

If implemented, it would be good to have two additional configuration patterns:

Explicitly specify no primary key in target table

For example, the target may not support primary keys, or you might not want one.

streams:
  schemaName.tableName:
    primary_key: Id
    target_options:
      table_keys:
        primary: null  # or ~ or '', etc.

Specify single-column primary key as string

Rather than requiring a list

streams:
  schemaName.tableName:
    primary_key: Id
    target_options:
      table_keys:
        primary: Id  # rather than [Id]
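
To make the proposed rules concrete, here is a minimal sketch of how the three cases (key omitted, key explicitly empty, key given as a string) could resolve. It is written in Python purely for illustration and is not Sling's implementation (Sling itself is written in Go). The helper name `resolve_primary_table_key` and the dict-shaped stream config are assumptions made for this example.

```python
from typing import Any, Optional

# Sentinel distinguishing "key absent" from "key explicitly set to null".
_MISSING = object()


def resolve_primary_table_key(stream: dict[str, Any]) -> Optional[list[str]]:
    """Return the primary-key columns for the target table, or None for no key.

    Hypothetical helper: a sketch of the rules proposed in this thread,
    not Sling's actual behavior.
    """
    table_keys = (stream.get("target_options") or {}).get("table_keys") or {}
    primary = table_keys.get("primary", _MISSING)

    if primary is _MISSING:
        # Not specified: fall back to the stream's primary_key.
        primary = stream.get("primary_key")

    # Explicit null, ~, '' or [] (or no primary_key at all): no primary key.
    if primary is None or primary == "" or primary == []:
        return None

    # A bare string such as "Id" is treated as a single-column key.
    if isinstance(primary, str):
        return [primary]
    return list(primary)
```

With these rules, a stream that only sets `primary_key: Id` resolves to a one-column key, an explicit `primary: null` opts out, and `primary: Id` behaves the same as `primary: [Id]`.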

martinburch avatar Apr 30 '25 07:04 martinburch

@martinburch nice, thanks, that might just work 👍

flarco avatar Apr 30 '25 10:04 flarco