Unable to drop distinct matching metrics?
I expected this to work:
- match: "airflow_local(.+)"
  match_type: regex
  action: drop
  name: "dropped"
- match: "(.+)_task_mem_usage_(.*)"
  match_type: regex
  action: drop
  name: "dropped"
- match: "(.+)_task_cpu_usage_(.*)"
  match_type: regex
  action: drop
  name: "dropped"
But only this works:
- match: "."
  match_type: regex
  action: drop
  name: "dropped"
Is dropping on a single-match basis not possible? I've only been able to get the catch-all to work.
I wrote a small unit test in exporter_test.go and it passes, so dropping individual matches should be working.
Did you configure a catch-all for metrics to be kept at the end of the config?
func TestDropSpecificMetric(t *testing.T) {
	config := `
mappings:
- match: metric(.*)drop
  action: drop
  match_type: regex
  name: "dropped_metric"
- match: metric.to.keep
  name: "kept_metric"
`
	testMapper := &mapper.MetricMapper{}
	err := testMapper.InitFromYAMLString(config)
	if err != nil {
		t.Fatalf("Config load error: %s %s", config, err)
	}
	events := make(chan event.Events)
	defer close(events)
	// It's important to use a new registry for this test to avoid interference from other tests.
	reg := prometheus.NewRegistry()
	exporterEventsActions := prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "statsd_exporter_events_actions_total",
			Help: "The total number of StatsD events by action.",
		},
		[]string{"action"},
	)
	reg.MustRegister(exporterEventsActions)
	go func() {
		ex := NewExporter(reg, testMapper, promslog.NewNopLogger(), exporterEventsActions, eventsUnmapped, errorEventStats, eventStats, conflictingEventStats, metricsCount)
		ex.Listen(events)
	}()
	ev := event.Events{
		&event.CounterEvent{
			CMetricName: "metric.to.drop",
			CValue:      1,
		},
		&event.CounterEvent{
			CMetricName: "metric.to.keep",
			CValue:      1,
		},
	}
	events <- ev
	events <- event.Events{} // Send empty event to ensure prior events are processed
	metrics, err := reg.Gather()
	if err != nil {
		t.Fatalf("Cannot gather from registry: %v", err)
	}
	droppedMetricValue := getFloat64(metrics, "dropped_metric", prometheus.Labels{})
	if droppedMetricValue != nil {
		t.Errorf("Metric 'dropped_metric' was found, but it should have been dropped. Value: %f", *droppedMetricValue)
	}
	keptMetricValue := getFloat64(metrics, "kept_metric", prometheus.Labels{})
	if keptMetricValue == nil {
		t.Errorf("Metric 'kept_metric' was not found, but it should have been present.")
	} else if *keptMetricValue != 1 {
		t.Errorf("Metric 'kept_metric' has value %f, expected 1", *keptMetricValue)
	}
	// Check that the drop action was recorded
	var dropActionMetric dto.Metric
	err = exporterEventsActions.WithLabelValues("drop").Write(&dropActionMetric)
	if err != nil {
		t.Fatalf("Error writing drop action metric: %v", err)
	}
	if dropActionMetric.Counter.GetValue() != 1 {
		t.Errorf("Expected drop action count to be 1, got %f", dropActionMetric.Counter.GetValue())
	}
}
Strange.
These are the metrics I was trying to catch:
airflow_local_task_job_task_exit_2563685_hello_world_print_hello_1
airflow_task_mem_usage_hello_world_print_hello_100
airflow_task_cpu_usage_hello_world_print_hello_100
and this was my initial remap that didn't work.
statsd:
  mappingConfig: |-
    mappings:
      {re-mappings}
      ...
      - match: "airflow_local(.+)"
        match_type: regex
        action: drop
        name: "dropped"
      - match: "(.+)_task_mem_usage_(.*)"
        match_type: regex
        action: drop
        name: "dropped"
      - match: "(.+)_task_cpu_usage_(.*)"
        match_type: regex
        action: drop
        name: "dropped"
did you configure a catchall for metrics to be kept at the end of the config?
Not at first, no. I currently have only a catch-all in place, because we're generating so many metrics that our collector is throwing "body too large" errors.
Alternatively (though off topic for this issue), I would have liked to remap only those metrics, so that
airflow_local_task_job_task_exit_2563685_hello_world_print_hello_1 would become task_exit{dag_id: hello_world, task_id: print_hello} 1. But because every part of the name is separated by underscores, it's difficult to ensure I'm mapping it correctly, so I thought it best to drop it.
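The ambiguity described above can be reproduced with a minimal sketch using Go's standard regexp package. It is an illustration only and does not use the exporter's mapper. The pattern and the splitTaskExit helper are hypothetical, and the sketch assumes the name is laid out as a numeric field, dag_id, task_id, and a trailing number. With underscores as the only separator, the regex cannot know where dag_id ends and task_id begins.

```go
package main

import (
	"fmt"
	"regexp"
)

// taskExitRe is a hypothetical pattern for the underscored name:
// a numeric field, dag_id, task_id, then a trailing number.
var taskExitRe = regexp.MustCompile(`^airflow_local_task_job_task_exit_(\d+)_(.+)_(.+)_(\d+)$`)

// splitTaskExit returns the dag_id and task_id captured by taskExitRe.
func splitTaskExit(name string) (dagID, taskID string, ok bool) {
	m := taskExitRe.FindStringSubmatch(name)
	if m == nil {
		return "", "", false
	}
	return m[2], m[3], true
}

func main() {
	dagID, taskID, ok := splitTaskExit("airflow_local_task_job_task_exit_2563685_hello_world_print_hello_1")
	// Intended split: dag_id=hello_world, task_id=print_hello.
	// The first (.+) is greedy, so it takes as many underscore-separated
	// parts as it can and leaves only the last part for task_id.
	fmt.Printf("ok=%v dag_id=%q task_id=%q\n", ok, dagID, taskID)
}
```

For the sample name, the greedy match yields dag_id "hello_world_print" and task_id "hello" rather than the intended split, which is why a pattern written against the underscored name cannot be trusted here.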
and this was my initial remap that didn't work.
When you say it didn't work, what exactly happened? What was the expected behavior versus what you are seeing?
You can send some samples using:
echo -n "airflow_task_mem_usage_hello_world_print_hello_100:1|c|#test:true" | nc -u -w1 localhost 8115
But I think I know what is happening here. The match in the configuration is applied to the metric name as it arrives from StatsD, and those Airflow metrics use . as the separator, not the _ you see in the final metric name. You have to find out what the original metric name is. Something we do in our case is tag the original metric name as a label using a catch-all remap like this:
- match: ".+"
  match_type: regex
  observer_type: histogram
  histogram_options:
    native_histogram_bucket_factor: 1.1
    native_histogram_max_buckets: 256
  name: "$0"
  labels:
    original_metric_name: "$0"
  honor_labels: true
For instance, for the metric airflow_task_mem_usage_hello_world_print_hello_100, I am quite sure the original name will be something like:
airflow.task.mem_usage.hello_world.print_hello_100 (give or take).
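To check this hypothesis offline, here is a minimal sketch that uses Go's standard regexp package (not the exporter's mapper) to test the three drop patterns from this thread against both the final underscored name and the guessed dotted name. The dotted name and the dot-aware pattern are assumptions. The real original name still has to be confirmed, for example through the original_metric_name label above.

```go
package main

import (
	"fmt"
	"regexp"
)

// dropPatterns are the three drop regexes from the mapping config in this thread.
var dropPatterns = []string{
	`airflow_local(.+)`,
	`(.+)_task_mem_usage_(.*)`,
	`(.+)_task_cpu_usage_(.*)`,
}

// matchesAny reports whether any pattern matches somewhere in name.
func matchesAny(patterns []string, name string) bool {
	for _, p := range patterns {
		if regexp.MustCompile(p).MatchString(name) {
			return true
		}
	}
	return false
}

func main() {
	// Name as it appears in the final Prometheus output.
	finalName := "airflow_task_mem_usage_hello_world_print_hello_100"
	// ASSUMPTION: guessed original StatsD name, using dots as separators.
	guessedName := "airflow.task.mem_usage.hello_world.print_hello_100"
	// ASSUMPTION: hypothetical dot-aware replacement for the mem_usage pattern.
	dotAware := []string{`(.+)\.task\.mem_usage\.(.*)`}

	fmt.Println(matchesAny(dropPatterns, finalName))   // underscored patterns vs. final name
	fmt.Println(matchesAny(dropPatterns, guessedName)) // underscored patterns vs. dotted name
	fmt.Println(matchesAny(dotAware, guessedName))     // dot-aware pattern vs. dotted name
}
```

If the guess about the dotted name is right, the underscored patterns match only the final name, which the mapper never sees, while the dot-aware pattern matches the name that actually arrives.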