AVRO-2872: Fix bug in SpecificCompiler which prevents Logical Types being registered for union fields
Make sure you have checked all steps below.
Jira
- [x] My PR addresses the following Avro Jira issues and references them in the PR title.
- https://issues.apache.org/jira/browse/AVRO-2872
- In case you are adding a dependency, check if the license complies with the ASF 3rd Party License Policy.
Tests
- [x] My PR adds the following unit tests OR does not need testing for this extremely good reason:
TestSpecificCompiler#testLogicalTypeUnionProducesConversionField
Commits
- [x] My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
- Subject is separated from body by a blank line
- Subject is limited to 50 characters (not including Jira issue reference)
- Subject does not end with a period
- Subject uses the imperative mood ("add", not "adding")
- Body wraps at 72 characters
- Body explains "what" and "why", not "how"
Documentation
- [x] In case of new functionality, my PR adds documentation that describes how to use it.
- All the public functions and the classes in the PR contain Javadoc that explain what it does
What do I need to do to get eyes on this? @RyanSkraba ?
Thanks for the contribution and your patience! I asked a question on the JIRA -- what symptoms are you seeing in this bug?
I'm wondering what would happen if passed a record with a UNION field that had two logical types (something compatible like a date and a timestamp-millis). Is it currently unusable, and what would change with this fix?
I'm running into an issue which I think is related but building this PR locally and testing it hasn't fixed the issue.
I have a schema.avdl
@namespace("org.example")
protocol TestService {
record MaybeLogicalTypes {
union {null, date} dateType = null;
union {null, time_ms} timeMsType = null;
union {null, timestamp_ms} timestampMsType = null;
}
MaybeLogicalTypes getMaybeLogicalTypes();
}
I have server code which returns a MaybeLogicalTypes object with all the fields set however when the client code calls
SpecificRequestor.getClient(TestService.class, new HttpTransceiver(url)).getMaybeLogicalTypes()
I get an error when union {null, date} or union {null, time_ms} or union {null, timestamp_ms} is populated.
e.g.
org.apache.avro.AvroRuntimeException: org.apache.avro.AvroRuntimeException: Unknown datum type java.time.Instant: 2021-03-31T11:26:20.123888Z
at org.apache.avro.ipc.specific.SpecificRequestor.readError(SpecificRequestor.java:160) ~[avro-ipc-1.10.2.jar:1.10.2]
at org.apache.avro.ipc.Requestor$Response.getResponse(Requestor.java:566) ~[avro-ipc-1.10.2.jar:1.10.2]
at org.apache.avro.ipc.Requestor$TransceiverCallback.handleResult(Requestor.java:366) ~[avro-ipc-1.10.2.jar:1.10.2]
at org.apache.avro.ipc.Requestor$TransceiverCallback.handleResult(Requestor.java:330) ~[avro-ipc-1.10.2.jar:1.10.2]
at org.apache.avro.ipc.Transceiver.transceive(Transceiver.java:73) ~[avro-ipc-1.10.2.jar:1.10.2]
at org.apache.avro.ipc.Requestor.request(Requestor.java:152) ~[avro-ipc-1.10.2.jar:1.10.2]
at org.apache.avro.ipc.Requestor.request(Requestor.java:101) ~[avro-ipc-1.10.2.jar:1.10.2]
at org.apache.avro.ipc.specific.SpecificRequestor.invoke(SpecificRequestor.java:108) ~[avro-ipc-1.10.2.jar:1.10.2]
This looks related, but not exactly the same issue. This would fix the conversions array in the specific, generated MaybeLogicalTypes, but it looks like the IPC responder doesn't take it into account in any case. Can you create a new JIRA?
Raised AVRO-3102 (sorry for the noise in the PR)
Shouldn't LogicTypes be resolved recursively? Curious if it will be registered in the case of an Array of Logical Types
Shouldn't LogicTypes be resolved recursively? Curious if it will be registered in the case of an Array of Logical Types
There's some discussion about this on the Jira. We've established that this fix works for a union of exactly two elements where one of them is null, but unions with more branches is still an open question. The current conversion resolver doesn't have flexibility to provide different implementations based on the runtime time. As an aside, are nested unions allowed? Even if they are, how are they functionally different from a single union with multiple elements?
Good news: a UNION can't contain other unions, among other limitations (such as not containing two different logical types with the same underlying primitives, or even two ARRAY with different type contents.
It currently looks like an ARRAY of timestamp-millis does generate the methods with the correct type List<Instant>, but does not create the conversions[] array in the specific record.
This is ready for review following our discussions 👍
Nudge @RyanSkraba
I would like to bring this one up again, as we experience issues while using an optional string with logicalType UUID, which is represented as a union of null and the string itself:
{
"name": "id",
"type": [
"null",
{
"type": "string",
"logicalType": "uuid"
}
}
As a result the conversion array is not populated and therefore we're not able to serialize avro messages with the same exception as described in the description.
We're using version 1.11.3. For me it looks like this issue could be fixed with this PR.