Unexpected metric names when cluster name contains dot
Description:
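When a cluster name contains a dot, the cluster-specific metrics get unexpected names: part of the cluster name ends up in the metric name instead of in the envoy_cluster_name tag.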
Repro steps:
Create a cluster with dots in its name, e.g. foo.example.com, and then look for cluster metrics like envoy.cluster.upstream_rq_xx
Expected:
The metric should be named envoy.cluster.upstream_rq_xx with tag envoy_cluster_name=foo.example.com
Actual:
The metric is named envoy.cluster.example.com.upstream_rq_xx with tag envoy_cluster_name=foo
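For illustration, the behavior above is what you would get from a tag extractor that treats everything between "cluster." and the next dot as the cluster name. The sketch below is a minimal model of that idea, not Envoy's actual implementation: the regex, the function name extract_cluster_tag, and the sample stat names are assumptions, and the envoy. prefix from the report is omitted for brevity.

```python
import re

# Hypothetical pattern (an assumption, for illustration only): capture the
# shortest run of characters after "cluster." up to and including the next dot.
CLUSTER_TAG = re.compile(r"^cluster\.((.*?)\.)")


def extract_cluster_tag(stat_name):
    """Return (stat name with the tag removed, tag value or None)."""
    match = CLUSTER_TAG.match(stat_name)
    if match is None:
        return stat_name, None
    # Group 1 is the cluster name plus its trailing dot; cut it out of the name.
    start, end = match.span(1)
    return stat_name[:start] + stat_name[end:], match.group(2)


# A dotted cluster name is split at its first dot.
print(extract_cluster_tag("cluster.foo.example.com.upstream_rq_2xx"))
# ('cluster.example.com.upstream_rq_2xx', 'foo')

# A name without dots is extracted whole.
print(extract_cluster_tag("cluster.foo_example_com.upstream_rq_2xx"))
# ('cluster.upstream_rq_2xx', 'foo_example_com')
```

Because the dot is both the stat-name separator and a legal character in the cluster name, a pattern of this shape cannot tell where the cluster name ends.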
cc @jmarantz
Any chance we could just make it illegal to have dots in cluster names?
From my perspective that's fine, but any time we try to do anything like this, someone complains. Might be worth it to try to gauge some additional opinions on this.
> Any chance we could just make it illegal to have dots in cluster names?
I wouldn't want to have dots either, until I had to work around #5238
@yuan-stripe #5238 would be pretty trivial to fix/add. Do you want to do that?
Sure, I will take it.
@yuan-stripe please let me know when you have a PR for #5238 ready. I’ll try to review it. Thanks!
@mattklein123 @dio PR #5275 is submitted to add support for #5238
@yuan-stripe got it. Thanks! I'll take a look.
This seems like a dupe of #4357
Confirmed, still the case with ver. 1.29.2.
E.g. for the routing cluster name www.foo1.com, only the part before the first dot, i.e. www, is reported as envoy_cluster_name, whereas the part after the first dot becomes part of the metric name, e.g. envoy_cluster_foo1_com_upstream_rq_total. For another cluster, www.foo2.com, again only www is reported as envoy_cluster_name, which is quite misleading, because two different clusters now report the same name, www.
A workaround for us was to replace dots with underscores, e.g. www_foo1_com. Then the metric name is always the same, e.g. envoy_cluster_upstream_rq_total, and envoy_cluster_name carries the full cluster name, e.g. www_foo1_com.
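If the cluster names are generated by a script, the workaround can be applied there. This is only a sketch under that assumption: sanitize_cluster_name and find_collisions are hypothetical helpers, not part of Envoy, and the collision check is included because two distinct names (say a.b and a_b) map to the same sanitized name.

```python
def sanitize_cluster_name(name):
    """Replace dots with underscores so the name contains no stat separator."""
    return name.replace(".", "_")


def find_collisions(names):
    """Return pairs of distinct original names that sanitize to the same value."""
    seen = {}
    collisions = []
    for name in names:
        safe = sanitize_cluster_name(name)
        if safe in seen and seen[safe] != name:
            collisions.append((seen[safe], name))
        else:
            seen[safe] = name
    return collisions


clusters = ["www.foo1.com", "www.foo2.com"]
print([sanitize_cluster_name(c) for c in clusters])
# ['www_foo1_com', 'www_foo2_com']
print(find_collisions(clusters))
# []
```

The sanitized names stay distinct here, so each cluster keeps its own envoy_cluster_name value; running the collision check before deploying guards against the rarer case where a dotted and an underscored name coexist.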