Consolidate unmarshalling and parsing in a single pass
Motivation and Context
Change the way JSON payloads are unmarshalled. Before this change the process took two steps: first we parsed the JSON input into a JsonNode structure, which was then traversed to unmarshall it into an SdkPojo. After this change, a single pass builds the SdkPojo as we parse the input.
Changes
SdkPojo
The SdkPojo interface was changed to add a new method, sdkFieldNameToField(), that returns a Map<String, SdkField<?>>, which allows us to look up the field metadata for a given field name. We also changed the code generation to create this map on top of the list of fields that we were creating before.
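The idea of keeping a name-indexed map next to the existing field list can be sketched as follows. This is only an illustration: `Field`, `FIELDS`, `NAME_TO_FIELD`, and `index` are hypothetical stand-ins rather than the SDK's generated code, and keying the map by the name used in the JSON payload is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FieldLookupSketch {
    // Hypothetical stand-in for SdkField<?>: carries only a member name and a wire name.
    static final class Field {
        final String memberName;
        final String locationName;

        Field(String memberName, String locationName) {
            this.memberName = memberName;
            this.locationName = locationName;
        }
    }

    // The list of fields, analogous to what was already generated before the change.
    static final List<Field> FIELDS = Collections.unmodifiableList(Arrays.asList(
            new Field("tableName", "TableName"),
            new Field("consistentRead", "ConsistentRead")));

    // The map is built once from the list, so each lookup during parsing is a hash lookup
    // instead of a scan over the list.
    static final Map<String, Field> NAME_TO_FIELD = index(FIELDS);

    static Map<String, Field> index(List<Field> fields) {
        Map<String, Field> map = new HashMap<>();
        for (Field field : fields) {
            map.put(field.locationName, field); // assumption: keyed by the name in the payload
        }
        return Collections.unmodifiableMap(map);
    }

    public static void main(String[] args) {
        System.out.println(NAME_TO_FIELD.get("ConsistentRead").memberName); // prints consistentRead
        System.out.println(NAME_TO_FIELD.get("Unknown"));                   // prints null
    }
}
```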
JsonUnmarshallingParser
A new internal class, JsonUnmarshallingParser, was added; JsonProtocolUnmarshaller uses it to unmarshall the input. This class builds a JSON parser and builds the expected Java types as the stream of tokens is read. Unmarshalling of simple types is still done using the registered unmarshallers.
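A minimal sketch of the single-pass idea, assuming a flat object with scalar members only. Everything here is hypothetical and simplified: `Token`, `Kind`, `Item`, and `SETTERS` are illustrative names, the token stream is a pre-built list rather than a real streaming JSON parser, and nested structures, lists, and maps are not handled.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.BiConsumer;

public class SinglePassSketch {
    enum Kind { START_OBJECT, FIELD_NAME, STRING, NUMBER, END_OBJECT }

    // Hypothetical token; a real streaming parser would produce these lazily from the input.
    static final class Token {
        final Kind kind;
        final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    // Hypothetical target type standing in for a generated SdkPojo.
    static final class Item {
        String name;
        double price;
    }

    // Field name -> setter that converts the raw token text and stores it on the target.
    static final Map<String, BiConsumer<Item, String>> SETTERS = new HashMap<>();
    static {
        SETTERS.put("Name", (item, text) -> item.name = text);
        SETTERS.put("Price", (item, text) -> item.price = Double.parseDouble(text));
    }

    // Builds the Item while consuming tokens; no intermediate tree is created.
    static Item unmarshall(Iterator<Token> tokens) {
        if (tokens.next().kind != Kind.START_OBJECT) {
            throw new IllegalArgumentException("expected start of object");
        }
        Item item = new Item();
        while (tokens.hasNext()) {
            Token token = tokens.next();
            if (token.kind == Kind.END_OBJECT) {
                return item;
            }
            if (token.kind != Kind.FIELD_NAME) {
                throw new IllegalArgumentException("expected field name");
            }
            BiConsumer<Item, String> setter = SETTERS.get(token.text);
            Token value = tokens.next(); // scalar value only in this sketch
            if (setter != null) {
                setter.accept(item, value.text); // unknown members are skipped
            }
        }
        throw new IllegalArgumentException("unterminated object");
    }

    public static void main(String[] args) {
        Item item = unmarshall(Arrays.asList(
                new Token(Kind.START_OBJECT, null),
                new Token(Kind.FIELD_NAME, "Name"), new Token(Kind.STRING, "widget"),
                new Token(Kind.FIELD_NAME, "Price"), new Token(Kind.NUMBER, "9.5"),
                new Token(Kind.END_OBJECT, null)).iterator());
        System.out.println(item.name + " " + item.price); // prints widget 9.5
    }
}
```

Compared with the two-step approach, the target object is the only structure allocated per payload here; the name-to-setter map plays the role that a field-name lookup would play in the parser.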
Benchmark results
This change improves the performance of unmarshalling in general. Shapes with labeled simple types (as in members of structures) benefit the most from this change in plain JSON, whereas collections of datapoints (e.g., lists of doubles) do not seem to benefit much, mainly because parsing the values from strings dominates the time.
V2DynamoDbAttributeValue
Below are the before results of running the V2DynamoDbAttributeValue.getItem benchmark:
Benchmark (testItem) Mode Cnt Score Error Units
V2DynamoDbAttributeValue.getItem TINY avgt 5 736.556 ± 48.044 ns/op
V2DynamoDbAttributeValue.getItem SMALL avgt 5 2609.546 ± 98.809 ns/op
V2DynamoDbAttributeValue.getItem HUGE avgt 5 20363.496 ± 1149.802 ns/op
And the after results:
Benchmark (testItem) Mode Cnt Score Error Units
V2DynamoDbAttributeValue.getItem TINY avgt 5 402.055 ± 8.297 ns/op
V2DynamoDbAttributeValue.getItem SMALL avgt 5 919.009 ± 12.303 ns/op
V2DynamoDbAttributeValue.getItem HUGE avgt 5 6400.011 ± 146.718 ns/op
And the after V2 results, updated after enabling Jackson's fast float parsing. Notice how this change improves the performance, as there is no longer a need to create intermediate objects:
Benchmark (testItem) Mode Cnt Score Error Units
V2DynamoDbAttributeValue.getItem TINY avgt 5 386.915 ± 25.105 ns/op
V2DynamoDbAttributeValue.getItem SMALL avgt 5 922.134 ± 3.386 ns/op
V2DynamoDbAttributeValue.getItem HUGE avgt 5 5948.790 ± 19.249 ns/op
The improvement comparing before and after is:
Benchmark (testItem) Before After Improvement
V2DynamoDbAttributeValue.getItem TINY 736.556 402.055 1.83x
V2DynamoDbAttributeValue.getItem SMALL 2609.546 919.009 2.84x
V2DynamoDbAttributeValue.getItem HUGE 20363.496 6400.011 3.18x
Edit: the improvement comparing before and after V2 is:
Benchmark (testItem) Before After Improvement
V2DynamoDbAttributeValue.getItem TINY 736.556 386.915 1.90x
V2DynamoDbAttributeValue.getItem SMALL 2609.546 922.134 2.83x
V2DynamoDbAttributeValue.getItem HUGE 20363.496 5948.790 3.42x
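The Improvement column reads as the ratio of the before score to the after score (both in ns/op, where lower is better). A small sketch of that arithmetic, using the TINY and HUGE rows of the table above; `SpeedupSketch` and `improvement` are illustrative names, not part of the benchmark suite.

```java
import java.util.Locale;

public class SpeedupSketch {
    // Speedup factor: how many times faster the "after" run is than the "before" run.
    static String improvement(double beforeNsPerOp, double afterNsPerOp) {
        return String.format(Locale.ROOT, "%.2fx", beforeNsPerOp / afterNsPerOp);
    }

    public static void main(String[] args) {
        System.out.println(improvement(736.556, 386.915));    // TINY: prints 1.90x
        System.out.println(improvement(20363.496, 5948.790)); // HUGE: prints 3.42x
    }
}
```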
JsonMarshallerBenchmark
This change also shows a ~2x performance improvement in JsonMarshallerBenchmark.unmarshall for RPCv2, but not so much for AWS JSON. Below are the before results of the benchmark:
Benchmark (protocol) (size) Mode Cnt Score Error Units
JsonMarshallerBenchmark.unmarshall smithy-rpc-v2 small avgt 5 2053.930 ± 151.090 ns/op
JsonMarshallerBenchmark.unmarshall smithy-rpc-v2 medium avgt 5 3080.472 ± 262.710 ns/op
JsonMarshallerBenchmark.unmarshall smithy-rpc-v2 big avgt 5 13009.513 ± 1116.719 ns/op
JsonMarshallerBenchmark.unmarshall aws-json small avgt 5 5259.660 ± 68.543 ns/op
JsonMarshallerBenchmark.unmarshall aws-json medium avgt 5 10676.408 ± 113.583 ns/op
JsonMarshallerBenchmark.unmarshall aws-json big avgt 5 43582.916 ± 316.564 ns/op
And the after results:
Benchmark (protocol) (size) Mode Cnt Score Error Units
JsonMarshallerBenchmark.unmarshall smithy-rpc-v2 small avgt 5 1096.103 ± 21.391 ns/op
JsonMarshallerBenchmark.unmarshall smithy-rpc-v2 medium avgt 5 1978.710 ± 145.267 ns/op
JsonMarshallerBenchmark.unmarshall smithy-rpc-v2 big avgt 5 7171.007 ± 32.614 ns/op
JsonMarshallerBenchmark.unmarshall aws-json small avgt 5 5230.546 ± 80.274 ns/op
JsonMarshallerBenchmark.unmarshall aws-json medium avgt 5 11179.496 ± 964.445 ns/op
JsonMarshallerBenchmark.unmarshall aws-json big avgt 5 41770.648 ± 1723.839 ns/op
And the after V2 results, updated after enabling Jackson's fast float parsing. Notice how RPCv2 also improves, as there is no longer a need to create intermediate objects:
Benchmark (protocol) (size) Mode Cnt Score Error Units
JsonMarshallerBenchmark.unmarshall smithy-rpc-v2 small avgt 5 870.653 ± 17.654 ns/op
JsonMarshallerBenchmark.unmarshall smithy-rpc-v2 medium avgt 5 1361.450 ± 9.747 ns/op
JsonMarshallerBenchmark.unmarshall smithy-rpc-v2 big avgt 5 4614.679 ± 17.724 ns/op
JsonMarshallerBenchmark.unmarshall aws-json small avgt 5 2792.786 ± 12.847 ns/op
JsonMarshallerBenchmark.unmarshall aws-json medium avgt 5 5647.308 ± 56.101 ns/op
JsonMarshallerBenchmark.unmarshall aws-json big avgt 5 22556.135 ± 1178.098 ns/op
The improvement comparing before and after is:
Benchmark (protocol) (size) Before After Improvement
JsonMarshallerBenchmark.unmarshall smithy-rpc-v2 small 2053.930 1096.103 1.87x
JsonMarshallerBenchmark.unmarshall smithy-rpc-v2 medium 3080.472 1978.710 1.56x
JsonMarshallerBenchmark.unmarshall smithy-rpc-v2 big 13009.513 7171.007 1.81x
JsonMarshallerBenchmark.unmarshall aws-json small 5259.660 5230.546 1.01x
JsonMarshallerBenchmark.unmarshall aws-json medium 10676.408 11179.496 0.95x
JsonMarshallerBenchmark.unmarshall aws-json big 43582.916 41770.648 1.04x
Edit: the improvement comparing before and after V2 is:
Benchmark (protocol) (size) Before After Improvement
JsonMarshallerBenchmark.unmarshall smithy-rpc-v2 small 2053.930 870.653 2.36x
JsonMarshallerBenchmark.unmarshall smithy-rpc-v2 medium 3080.472 1361.450 2.26x
JsonMarshallerBenchmark.unmarshall smithy-rpc-v2 big 13009.513 4614.679 2.82x
JsonMarshallerBenchmark.unmarshall aws-json small 5259.660 2792.786 1.88x
JsonMarshallerBenchmark.unmarshall aws-json medium 10676.408 5647.308 1.89x
JsonMarshallerBenchmark.unmarshall aws-json big 43582.916 22556.135 1.93x
Modifications
Testing
Screenshots (if appropriate)
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
Checklist
- [ ] I have read the CONTRIBUTING document
- [ ] Local run of mvn install succeeds
- [ ] My code follows the code style of this project
- [ ] My change requires a change to the Javadoc documentation
- [ ] I have updated the Javadoc documentation accordingly
- [ ] I have added tests to cover my changes
- [ ] All new and existing tests passed
- [ ] I have added a changelog entry. Adding a new entry must be accomplished by running the scripts/new-change script and following the instructions. Commit the new file created by the script in .changes/next-release with your changes.
- [ ] My change is to implement 1.11 parity feature and I have updated LaunchChangelog
License
- [ ] I confirm that this pull request can be released under the Apache 2 license
LGTM, can we run this in the canaries?
Quality Gate failed
Failed conditions
36.9% Coverage on New Code (required ≥ 80%)
C Reliability Rating on New Code (required ≥ A)