
Emit Spinner results to Stackdriver Monitoring

Open johnsonj opened this issue 8 years ago • 2 comments

The spinner emits a log entry that describes the outcome of its log loss test. The user should be able to emit a custom metric with this result to Stackdriver Monitoring.

possible metrics:

  • stackdriver-spinner/logs.sent - cumulative total of log messages sent to loggregator
  • stackdriver-spinner/logs.received - cumulative total of logs received by the probe
  • stackdriver-spinner/logs.lost - cumulative total of logs never received

labels:

  • director - corresponds to the same director value for Stackdriver Nozzle. Used when multiple PCF instances are logging to a single Stackdriver project. Make an ENV variable for the app?
  • index - index of the Cloud Foundry app (in case the user is running multiple copies)
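A minimal sketch of how one spinner result could be turned into the three metrics and two labels listed above. It builds the data points only and does not call the Monitoring API. The `Result` and `Point` types and the `SPINNER_DIRECTOR` variable are hypothetical names used for illustration; `CF_INSTANCE_INDEX` is the variable Cloud Foundry sets for the app instance index, and `custom.googleapis.com/` is the prefix Stackdriver uses for custom metric types.

```go
package main

import (
	"fmt"
	"os"
)

// Custom Stackdriver metric types live under the custom.googleapis.com/ prefix.
const metricPrefix = "custom.googleapis.com/stackdriver-spinner/"

// Result is a hypothetical summary of one spinner run.
type Result struct {
	Sent     int64
	Received int64
}

// Point is an illustrative stand-in for one time series data point.
type Point struct {
	MetricType string
	Labels     map[string]string
	Value      int64
}

// buildPoints derives logs.sent, logs.received and logs.lost from a result
// and attaches the director and index labels to each point.
func buildPoints(r Result, director, index string) []Point {
	lost := r.Sent - r.Received
	if lost < 0 {
		lost = 0 // guard against a received count larger than the sent count
	}
	values := []struct {
		name  string
		value int64
	}{
		{"logs.sent", r.Sent},
		{"logs.received", r.Received},
		{"logs.lost", lost},
	}
	points := make([]Point, 0, len(values))
	for _, v := range values {
		points = append(points, Point{
			MetricType: metricPrefix + v.name,
			Labels:     map[string]string{"director": director, "index": index},
			Value:      v.value,
		})
	}
	return points
}

func main() {
	director := os.Getenv("SPINNER_DIRECTOR") // hypothetical variable name
	index := os.Getenv("CF_INSTANCE_INDEX")   // set by Cloud Foundry per instance
	for _, p := range buildPoints(Result{Sent: 1000, Received: 999}, director, index) {
		fmt.Printf("%s %v %d\n", p.MetricType, p.Labels, p.Value)
	}
}
```

Since the proposed metrics are cumulative, a real implementation would presumably add each run's values to running totals before writing them; the sketch converts a single result.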

/cc cloud-ops for implementation/collab @hustons @sahilm @garimasharma (+tom) /cc cre for metrics guidance/awareness @fluffle @knyar

johnsonj, Dec 13 '17 18:12

nit: "foundation" not "director" because that's what the Pivotal folks decided to standardise on. But otherwise emitting custom metrics is a good plan, because graphs are much easier for people to consume than logs :-)

fluffle, Dec 19 '17 15:12

It seems odd to have both logs.received and logs.lost, but I can see how the lack of arithmetic operations in SD makes that necessary.

It might also be useful to break down logs.received by time period using a metric label (a la Prometheus histograms). For example, if we see 90% of log messages delivered within 5 seconds and 99.9% within 60 seconds, we'll get counters like this:

logs.sent 1000
logs.received{within_seconds="5"} 900
logs.received{within_seconds="10"} 950
logs.received{within_seconds="30"} 970
logs.received{within_seconds="60"} 999
logs.lost 1
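A rough sketch of how such cumulative buckets could be computed from per-message delivery latencies. The thresholds are the ones from the example above; the function names and the sample latencies are made up for illustration, and treating anything not received within the largest threshold as lost is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// Bucket upper bounds, taken from the example counters above.
var thresholds = []time.Duration{
	5 * time.Second, 10 * time.Second, 30 * time.Second, 60 * time.Second,
}

// bucketCounts returns, for each threshold, how many messages arrived within
// it (cumulative, like Prometheus histogram buckets), plus the number of
// messages not received within the largest threshold.
func bucketCounts(sent int, latencies []time.Duration) ([]int, int) {
	counts := make([]int, len(thresholds))
	for _, l := range latencies {
		for i, t := range thresholds {
			if l <= t {
				counts[i]++
			}
		}
	}
	lost := sent - counts[len(counts)-1]
	return counts, lost
}

// repeat builds n copies of the same latency (sample data helper).
func repeat(d time.Duration, n int) []time.Duration {
	out := make([]time.Duration, n)
	for i := range out {
		out[i] = d
	}
	return out
}

// sampleLatencies is invented data: 999 of 1000 messages arrive, one never does.
func sampleLatencies() []time.Duration {
	var ls []time.Duration
	ls = append(ls, repeat(1*time.Second, 900)...)
	ls = append(ls, repeat(7*time.Second, 50)...)
	ls = append(ls, repeat(20*time.Second, 20)...)
	ls = append(ls, repeat(45*time.Second, 29)...)
	return ls
}

func main() {
	counts, lost := bucketCounts(1000, sampleLatencies())
	fmt.Println("logs.sent 1000")
	for i, t := range thresholds {
		fmt.Printf("logs.received{within_seconds=\"%d\"} %d\n", int(t.Seconds()), counts[i])
	}
	fmt.Printf("logs.lost %d\n", lost)
}
```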

knyar, Dec 21 '17 11:12