Per-record schema registry subject name when writing to Kafka
Spark lets you choose the destination topic per record by adding a column named topic to the DataFrame.
The destination topic for the records of the DataFrame can either be specified statically as an option to the DataStreamWriter or on a per-record basis as a column named “topic” in the DataFrame. (Source)
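For readers unfamiliar with this Spark feature, here is a minimal sketch of a batch write that routes each row by its topic column. The column names eventType and payload, the "events." prefix, and the broker address localhost:9092 are illustrative assumptions, not taken from this thread. It also assumes the spark-sql-kafka-0-10 package is on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat, lit}

object PerRecordTopicExample {

  // Derives the Kafka sink columns: "topic" (per-record destination) and "value".
  // The input column names are hypothetical.
  def withTopic(events: DataFrame): DataFrame =
    events.select(
      concat(lit("events."), col("eventType")).as("topic"),
      col("payload").as("value"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("per-record-topic")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val events = Seq(("orders", "order-1"), ("payments", "payment-1"))
      .toDF("eventType", "payload")

    // No "topic" option is set, so Spark takes the topic from the column.
    withTopic(events)
      .write
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // assumed broker address
      .save()

    spark.stop()
  }
}
```

With this layout, the first row goes to events.orders and the second to events.payments.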
I need to use the same strategy for the schema registry subject name. Does ABRiS support it?
Thanks!
No, currently you must specify the schema when you create the from_avro / to_avro expression. The only exception is the schema that is downloaded using the Confluent schema ID at the start of the message.
I am not sure I understand the use case. All the rows in the DataFrame must have the same Spark type, right? Shouldn't they have the same Avro type/schema as well?
You got it. The rows all have the same schema. However, some records must be published to their own specific topics, so distinct subject names are needed (for internal reasons).
As I said, Spark supports choosing the topic on a per-record basis (Source).
I understand, but sorry, that is not supported.
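One possible workaround, which is not proposed or confirmed by anyone in this thread, is to split the DataFrame by topic and build a separate to_avro expression for each one, so that every topic gets its own subject. The sketch below assumes the ABRiS 4+ configuration API (AbrisConfig, andTopicNameStrategy), that a schema is already registered under each subject, and that the number of distinct topics is small. The registry URL, the broker address, and the object and method names are hypothetical.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, struct}
import za.co.absa.abris.avro.functions.to_avro
import za.co.absa.abris.config.{AbrisConfig, ToAvroConfig}

object PerTopicAvroWriter {
  // Assumed endpoints, replace with real ones.
  val registryUrl = "http://localhost:8081"
  val bootstrapServers = "localhost:9092"

  // Under Confluent's TopicNameStrategy the value subject is "<topic>-value".
  def subjectFor(topic: String): String = s"$topic-value"

  // One config per topic: the subject is fixed when the expression is created.
  def configFor(topic: String): ToAvroConfig =
    AbrisConfig.toConfluentAvro
      .downloadSchemaByLatestVersion
      .andTopicNameStrategy(topic)
      .usingSchemaRegistry(registryUrl)

  // Expects a "topic" column plus the payload columns to serialize.
  def writePerTopic(df: DataFrame): Unit = {
    val payload = struct(df.columns.filter(_ != "topic").map(col): _*)
    val topics = df.select("topic").distinct().collect().map(_.getString(0))

    // Each iteration rescans df, so caching it first may help.
    topics.foreach { topic =>
      df.filter(col("topic") === topic)
        .select(to_avro(payload, configFor(topic)).as("value"))
        .write
        .format("kafka")
        .option("kafka.bootstrap.servers", bootstrapServers)
        .option("topic", topic)
        .save()
    }
  }
}
```

This is a batch write. For a streaming query, the same loop could be run inside foreachBatch. The cost is one Kafka write job per topic instead of a single job.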
No problem. Indeed, this issue could be tracked as a future improvement. Thanks @cerveada. :smiley: