#patch Cloudwatch FluentD
Signed-off-by: Azanul Haque [email protected]
TL;DR
This PR fixes Log Links that do not work with CloudWatch FluentD out of the box
Type
- [X] Bug Fix
- [ ] Feature
- [ ] Plugin
Are all requirements met?
- [X] Code completed
- [ ] Smoke tested
- [ ] Unit tests added
- [ ] Code documentation added
- [ ] Any pending items have an associated Issue
Complete description
Replaced the linking cloudwatch url template with correct one.
Tracking Issue
fixes https://github.com/flyteorg/flyte/issues/2635
Follow-up issue
NA
Thank you for opening this pull request! 🙌
These tips will help get your PR across the finish line:
- Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
- Sign off your commits (Reference: DCO Guide).
@EngHabu, is this the required change, or is there something else that needs to be done?
Codecov Report
Merging #293 (8583950) into master (932e97b) will not change coverage. The diff coverage is
100.00%.
@@ Coverage Diff @@
## master #293 +/- ##
=======================================
Coverage 63.32% 63.32%
=======================================
Files 145 145
Lines 9311 9311
=======================================
Hits 5896 5896
Misses 2872 2872
Partials 543 543
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 62.74% <100.00%> (ø) |
Flags with carried forward coverage won't be shown. Click here to find out more.
| Impacted Files | Coverage Δ | |
|---|---|---|
| go/tasks/logs/logging_utils.go | 89.23% <100.00%> (ø) |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
Thank you for your help! I went through this exercise recently and what I see is missing at the moment is the hostname. By default, these logstreams include the name of the host they were connected from. There are no template values (e.g. {{.host}}) to use. The fix will involve adding that template like the other ones, populate it from the Pod spec and then use it in the log link.
@EngHabu By host do you mean nodeName ? are you referring to:
Fluent Bit optimized configuration sends logs to kubernetes-nodeName-application.var.log.containers.kubernetes-podName_kubernetes-namespace_kubernetes-container-name-kubernetes-containerID
Hostname should be provided by the https://github.com/flyteorg/flyteplugins/blob/b0684d97a1cf240f1a44f310f4a79cc21844caa9/go/tasks/pluginmachinery/tasklog/plugin.go#L7-L16, but from what I can see this value is not available or not propagated correctly
That's correct, the hostname is not provided... that's the work needed
From what I see the pods which Flyte create have local hostname, but what container insights log path expects is the node name, which is not available on the Flyte pod. I imagine that there should be another template parameter which will include node name or --hostname-override could be passed to the Flyte pods on creation to set it to the node hostname instead.
There seems to be a pod.Spec.NodeName field: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#scheduling
Is that not enough to populate the URL?
And yes, we will need an additional template parameter that resolves to this value.
@EngHabu @andrusha is there someone working on adding the node name? we use fluent-bit for logging as well, and this is breaking our cloudwatch log link.
I used this aws-for-fluent-bit chart recently which uses a default fluentbit- log stream prefix and doesn't contain hostname. The change to use logSteamNameFilter in this PR may be the safest bet.
@ashrielbrian I don't plan working on this myself, locally we use Datadog, so the issue was solved for us by adding custom logging URL towards it, which didn't require a hostname