NIFI-12854: Add S3 user metadata via flowfile attributes
Summary
NIFI-12854 The PutS3Object processor currently supports adding user metadata to objects via dynamic properties. This PR adds support for using FlowFile attributes to generate S3 user metadata.
Tracking
Please complete the following tracking steps prior to pull request creation.
Issue Tracking
- [x] Apache NiFi Jira issue created
Pull Request Tracking
- [x] Pull Request title starts with Apache NiFi Jira issue number, such as
NIFI-00000 - [x] Pull Request commit message starts with Apache NiFi Jira issue number, as such
NIFI-00000
Pull Request Formatting
- [x] Pull Request based on current revision of the
mainbranch - [x] Pull Request refers to a feature branch with one commit containing changes
Verification
Please indicate the verification steps performed prior to pull request creation.
Build
- [x] Build completed using
mvn clean install -P contrib-check - [x] JDK 21
Licensing
- [x] New dependencies are compatible with the Apache License 2.0 according to the License Policy
- [x] New dependencies are documented in applicable
LICENSEandNOTICEfiles
Documentation
- [x] Documentation formatting appears as expected in rendered files
Hey @gforeman02 thanks for putting this forward. I'm not sure that I agree that it's a good idea to add this, though. In general, we should not support multiple different ways to do the same thing. Doing so just leads to confusion. Is there a reason that the existing method does not work for you?
Hello @markap14 , thanks for posting. Yes, the existing method (dynamic properties) requires knowing what the user metadata keys will be in advance. My client has a scenario where one flow has metadata key A and another that has key B and this is only known by reading the flowfile attributes. It would be great for the processor to work in the same way it does for tags.
@markap14 one other approach could be to have a special dynamic property that the processor uses to read key/value pairs from the dynamic property value. Maybe something like a key "metadata.json" with a value of { "a": "a_val", "b": "b_val" .... }.
@markap14 i created a custom s3 nar with the following mod:
final Map<String, String> userMetadata = new HashMap<>();
for (final Entry<PropertyDescriptor, String> entry : context.getProperties().entrySet()) {
if (entry.getKey().isDynamic()) {
final String value = context.getProperty(
entry.getKey()).evaluateAttributeExpressions(ff).getValue();
if( "metadata.json".equals(entry.getKey().getName())){
ObjectReader reader = new ObjectMapper().readerFor(Map.class);
Map<String, String> map = reader.readValue(value);
userMetadata.putAll(map);
} else {
userMetadata.put(entry.getKey().getName(), value);
}
}
}
I am happy to modify this PR and resubmit it for consideration.
@exceptionfactory thanks for the idea on an approach. I guess a decision needs to be made:
Should the PutS3Object processor support dynamic metadata keys?
+1 here for supporting dynamic metadata keys since I have a scenario where this is used. @markap14 if the concern is unique to me then I have a workaround in place.
Thanks for the reply @gforeman02, if you willing to rework the PR along the lines of using dynamic properties with wildcards as a way to support dynamic metadata keys, that sounds like a good way forward, without introducing new properties.
@gforeman02 Heads up: you have merge conflicts with main that need to be resolved on your branch.
@gforeman02 Given the current conflicts and the different direction, I'm closing this pull request for now. Feel free to reopen when you are ready with a different approach, or open a new PR if you prefer. Thanks!