
Per-record basis schema registry subject name for writing to Kafka

Open brunolaporais opened this issue 5 years ago • 5 comments

Spark lets you add a column named "topic" to a DataFrame to set the destination topic on a per-record basis.

The destination topic for the records of the DataFrame can either be specified statically as an option to the DataStreamWriter or on a per-record basis as a column named “topic” in the DataFrame. (Source)
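A minimal PySpark sketch of the per-record topic behaviour quoted above. The column names "event_type" and "payload", the helper names, and the "events." topic prefix are hypothetical, chosen only for illustration; the sketch uses a batch write, but the same "topic" column convention applies to the streaming writer.

```python
def topic_for(event_type: str) -> str:
    """Hypothetical routing rule: one topic per event type."""
    return f"events.{event_type.lower()}"


def write_per_record_topic(df, bootstrap_servers: str) -> None:
    """Write df to Kafka, taking each row's destination from its "topic" column.

    Assumes df has string columns "event_type" and "payload" (hypothetical names).
    """
    # Imported here so the routing helper can be used without a Spark install.
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    to_topic = F.udf(topic_for, StringType())
    out = df.select(
        to_topic(F.col("event_type")).alias("topic"),
        F.col("payload").cast("string").alias("value"),
    )
    # No "topic" option is set on the writer, so the Kafka sink
    # reads the destination from the "topic" column of each row.
    (
        out.write.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .save()
    )
```

Because the topic is just another column, rows in the same DataFrame can land in different topics without splitting the write.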

I need to use the same strategy for the schema registry subject name. Does ABRiS support it?

Thanks!

brunolaporais, Feb 18 '21 19:02

No, currently you must specify the schema when you create the from_avro / to_avro expression. The only exception is a schema that is downloaded using the Confluent schema ID at the start of the message.

cerveada, Feb 19 '21 07:02

I am not sure I understand the use case. All the rows in the DataFrame must have the same Spark type, right? Shouldn't they have the same Avro type/schema as well?

cerveada, Feb 19 '21 07:02

You got it. The rows have the same schema. However, some records must be published to their own specific topics, so distinct subject names are used (internal reasons).

As I said, setting the topic on a per-record basis is supported by Spark (Source).

brunolaporais, Feb 19 '21 14:02

I understand, but sorry, that is not supported.

cerveada, Feb 22 '21 08:02
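Since a to_avro expression is bound to a single schema, one possible workaround (not proposed in this thread, so treat it as an assumption) is to split the data by destination topic and build one expression per subject. The plain-Python sketch below shows only the grouping step. It assumes Confluent's TopicNameStrategy, where the subject is the topic name plus a "-key" or "-value" suffix; the function names and record layout are hypothetical, and no ABRiS call is shown.

```python
from collections import defaultdict


def subject_for(topic: str, is_key: bool = False) -> str:
    """Subject name under TopicNameStrategy: "<topic>-key" or "<topic>-value"."""
    return f"{topic}-{'key' if is_key else 'value'}"


def group_by_subject(records):
    """Group records (dicts with a "topic" field) by their value subject name."""
    groups = defaultdict(list)
    for rec in records:
        groups[subject_for(rec["topic"])].append(rec)
    return dict(groups)


records = [
    {"topic": "orders", "id": 1},
    {"topic": "payments", "id": 2},
    {"topic": "orders", "id": 3},
]
groups = group_by_subject(records)
```

Each group could then be serialized with its own expression and written separately, at the cost of one write per subject.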

No problem. Indeed, this issue can be converted into a future improvement. Thanks @cerveada. :smiley:

brunolaporais, Feb 24 '21 12:02