flyteplugins icon indicating copy to clipboard operation
flyteplugins copied to clipboard

#patch Cloudwatch FluentD

Open Azanul opened this issue 3 years ago • 12 comments

Signed-off-by: Azanul Haque [email protected]

TL;DR

This PR fixes Log Links that do not work with CloudWatch FluentD out of the box

Type

  • [X] Bug Fix
  • [ ] Feature
  • [ ] Plugin

Are all requirements met?

  • [X] Code completed
  • [ ] Smoke tested
  • [ ] Unit tests added
  • [ ] Code documentation added
  • [ ] Any pending items have an associated Issue

Complete description

Replaced the linking cloudwatch url template with correct one.

Tracking Issue

fixes https://github.com/flyteorg/flyte/issues/2635

Follow-up issue

NA

Azanul avatar Oct 23 '22 06:10 Azanul

Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).

welcome[bot] avatar Oct 23 '22 06:10 welcome[bot]

@EngHabu, is this the required change, or is there something else that needs to be done?

samhita-alla avatar Oct 24 '22 09:10 samhita-alla

Codecov Report

Merging #293 (8583950) into master (932e97b) will not change coverage. The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #293   +/-   ##
=======================================
  Coverage   63.32%   63.32%           
=======================================
  Files         145      145           
  Lines        9311     9311           
=======================================
  Hits         5896     5896           
  Misses       2872     2872           
  Partials      543      543           
Flag Coverage Δ
unittests 62.74% <100.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
go/tasks/logs/logging_utils.go 89.23% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

codecov[bot] avatar Nov 17 '22 22:11 codecov[bot]

Thank you for your help! I went through this exercise recently and what I see is missing at the moment is the hostname. By default, these logstreams include the name of the host they were connected from. There are no template values (e.g. {{.host}}) to use. The fix will involve adding that template like the other ones, populate it from the Pod spec and then use it in the log link.

EngHabu avatar Nov 18 '22 02:11 EngHabu

@EngHabu By host do you mean nodeName ? are you referring to:

Fluent Bit optimized configuration sends logs to kubernetes-nodeName-application.var.log.containers.kubernetes-podName_kubernetes-namespace_kubernetes-container-name-kubernetes-containerID

Azanul avatar Nov 18 '22 18:11 Azanul

Hostname should be provided by the https://github.com/flyteorg/flyteplugins/blob/b0684d97a1cf240f1a44f310f4a79cc21844caa9/go/tasks/pluginmachinery/tasklog/plugin.go#L7-L16, but from what I can see this value is not available or not propagated correctly

andrusha avatar Nov 23 '22 12:11 andrusha

That's correct, the hostname is not provided... that's the work needed

EngHabu avatar Nov 23 '22 14:11 EngHabu

From what I see the pods which Flyte create have local hostname, but what container insights log path expects is the node name, which is not available on the Flyte pod. I imagine that there should be another template parameter which will include node name or --hostname-override could be passed to the Flyte pods on creation to set it to the node hostname instead.

andrusha avatar Nov 25 '22 15:11 andrusha

There seems to be a pod.Spec.NodeName field: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#scheduling Is that not enough to populate the URL? And yes, we will need an additional template parameter that resolves to this value.

EngHabu avatar Nov 26 '22 15:11 EngHabu

@EngHabu @andrusha is there someone working on adding the node name? we use fluent-bit for logging as well, and this is breaking our cloudwatch log link.

ashrielbrian avatar Apr 27 '23 04:04 ashrielbrian

I used this aws-for-fluent-bit chart recently which uses a default fluentbit- log stream prefix and doesn't contain hostname. The change to use logSteamNameFilter in this PR may be the safest bet.

andrewwdye avatar Apr 27 '23 13:04 andrewwdye

@ashrielbrian I don't plan working on this myself, locally we use Datadog, so the issue was solved for us by adding custom logging URL towards it, which didn't require a hostname

andrusha avatar May 01 '23 10:05 andrusha