exporter/stackdriver: not auto detecting GKE labels
What version of OpenCensus are you using?
"@opencensus/core": "0.0.19",
"@opencensus/exporter-stackdriver": "0.0.19",
What version of Node are you using?
12.14.1
What did you do?
We have a simple setInterval that records the Node.js memory usage (basically a hello-world example).
It is a hapi.js plugin:
const { globalStats, MeasureUnit, AggregationType } = require('@opencensus/core');
const { StackdriverStatsExporter } = require('@opencensus/exporter-stackdriver');

const EXPORT_INTERVAL = process.env.EXPORT_INTERVAL || 60;

const MEMORY_RSS = globalStats.createMeasureInt64(
  'memory_rss',
  MeasureUnit.BYTE,
  'Total memory used'
);

globalStats.registerView(globalStats.createView(
  'nodejs_memory_rss',
  MEMORY_RSS,
  AggregationType.LAST_VALUE,
  [],
  'Total memory used by this process',
));

exports.register = async () => {
  const projectId = 'hardcoded-project-name-here';
  const exporter = new StackdriverStatsExporter({
    projectId,
    period: EXPORT_INTERVAL * 1000,
  });

  // Pass the created exporter to Stats
  globalStats.registerExporter(exporter);

  setInterval(() => {
    const memoryStats = process.memoryUsage();
    globalStats.record([{
      measure: MEMORY_RSS,
      value: memoryStats.rss,
    }]);
  }, 10000);
};

exports.name = 'monitor';
What did you expect to see?
The stats should automatically have kubernetes (pod, container, etc) labels when deployed in a GKE container.
What did you see instead?
It makes no difference whether the code runs locally or on GKE. In both cases the stats end up in the "global" resource without any additional labels (just project_id).
Additional context
- The container has the env variable KUBERNETES_SERVICE_HOST=10.111.0.1
- We do not use GOOGLE_APPLICATION_CREDENTIALS, as it works without that on GKE
Any help is welcome. We would like to debug this, but don't see an easy way. Is there some verbose logging mode in this exporter?
I think I've found the problem.
The container needs the NAMESPACE and CONTAINER_NAME environment variables. Otherwise this line will reset the resource type and all labels to some default stuff.
I don't get why this line is there. It seems to make everything worse for me. The required env variables should at least be documented.
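To make the behaviour described above easier to picture, here is a rough sketch of the fallback. It is a hypothetical illustration, not the exporter's actual source: the function name `pickMonitoredResource`, the use of `KUBERNETES_SERVICE_HOST` as the detection signal, and the `k8s_container` type with its label keys are all assumptions. Only the `NAMESPACE` and `CONTAINER_NAME` variables and the fallback to "global" with just project_id come from this thread.

```javascript
// Hypothetical sketch of the fallback described in this thread.
// NOT the exporter's real code; names and label keys are illustrative assumptions.
function pickMonitoredResource(env, projectId) {
  // Assumption: the presence of KUBERNETES_SERVICE_HOST signals a Kubernetes pod.
  const onKubernetes = Boolean(env.KUBERNETES_SERVICE_HOST);
  const namespace = env.NAMESPACE;
  const containerName = env.CONTAINER_NAME;

  if (onKubernetes && namespace && containerName) {
    return {
      type: 'k8s_container', // assumed resource type
      labels: {
        project_id: projectId,
        namespace_name: namespace,
        container_name: containerName,
      },
    };
  }

  // Any missing value drops everything back to the default resource,
  // which is what was observed: "global" with only project_id.
  return { type: 'global', labels: { project_id: projectId } };
}
```

Under these assumptions, a pod that has `KUBERNETES_SERVICE_HOST` set but lacks `NAMESPACE` or `CONTAINER_NAME` is indistinguishable from a local run, which matches what was reported.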
Thanks for reporting this!
AFAIK the Stackdriver exporter requires us to pass every label with a value (in the case of GKE, these labels). If any of the expected labels is missing, the exporter will be unhappy and throw an exception.
Something like this:
One or more TimeSeries could not be written: The set of resource labels is incomplete.
Missing labels: (<label>).: timeSeries[<number>]
This line is added as a preemptive measure to handle the missing labels case.
Ok, that makes sense now. Thanks for the response :)
My opinion:
I would prefer it if the namespace and container name would be set to "N/A" just to see that the detection is working. (better than nothing)
Also: this should be documented.
Stumbled over this issue and had the same problem. Linking a related issue that has more details: https://github.com/census-instrumentation/opencensus-python/issues/796
tl;dr: At some point GKE stopped populating containers with NAMESPACE and CONTAINER_NAME, so you now have to add them yourself to the Deployment env, like:
- name: NAMESPACE
  valueFrom:
    fieldRef:
      fieldPath: metadata.namespace
- name: CONTAINER_NAME
  value: my-awesome-service
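Since a missing variable fails silently, a small startup check in the service can make the problem visible. This is a minimal sketch; the helper name `missingGkeEnv` is made up for illustration and is not part of OpenCensus, and treating `KUBERNETES_SERVICE_HOST` as the "running on Kubernetes" signal is an assumption.

```javascript
// Hypothetical helper (not part of OpenCensus): returns the names of the
// variables from this thread that are unset or empty.
function missingGkeEnv(env = process.env) {
  return ['NAMESPACE', 'CONTAINER_NAME'].filter((name) => !env[name]);
}

// Assumption: KUBERNETES_SERVICE_HOST being set means we run inside a pod.
const missing = missingGkeEnv();
if (process.env.KUBERNETES_SERVICE_HOST && missing.length > 0) {
  console.warn(
    `Missing env: ${missing.join(', ')}. ` +
    'Metrics will likely be reported under the "global" resource.'
  );
}
```

Running this before registering the exporter gives a log line to look for when the Kubernetes labels do not show up.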