
Unable to drop distinct matching metrics?

Open RJNY opened this issue 11 months ago • 3 comments

I expected this to work:

      - match: "airflow_local(.+)"
        match_type: regex
        action: drop
        name: "dropped"

      - match: "(.+)_task_mem_usage_(.*)"
        match_type: regex
        action: drop
        name: "dropped"

      - match: "(.+)_task_cpu_usage_(.*)"
        match_type: regex
        action: drop
        name: "dropped"

But only this works:

      - match: "."
        match_type: regex
        action: drop
        name: "dropped"

Is dropping on a single-match basis not possible? I've only been able to get the catch-all to work.

RJNY avatar May 08 '25 18:05 RJNY

I wrote a small unit test in exporter_test.go and it passes, so dropping a specific metric should work.

Did you configure a catch-all for metrics to be kept at the end of the config?

func TestDropSpecificMetric(t *testing.T) {
	config := `
mappings:
- match: metric(.*)drop
  action: drop
  match_type: regex
  name: "dropped_metric"

- match: metric.to.keep
  name: "kept_metric"
`
	testMapper := &mapper.MetricMapper{}
	err := testMapper.InitFromYAMLString(config)
	if err != nil {
		t.Fatalf("Config load error: %s %s", config, err)
	}

	events := make(chan event.Events)
	defer close(events)

	// It's important to use a new registry for this test to avoid interference from other tests.
	reg := prometheus.NewRegistry()
	exporterEventsActions := prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "statsd_exporter_events_actions_total",
			Help: "The total number of StatsD events by action.",
		},
		[]string{"action"},
	)
	reg.MustRegister(exporterEventsActions)

	go func() {
		ex := NewExporter(reg, testMapper, promslog.NewNopLogger(), exporterEventsActions, eventsUnmapped, errorEventStats, eventStats, conflictingEventStats, metricsCount)
		ex.Listen(events)
	}()

	ev := event.Events{
		&event.CounterEvent{
			CMetricName: "metric.to.drop",
			CValue:      1,
		},
		&event.CounterEvent{
			CMetricName: "metric.to.keep",
			CValue:      1,
		},
	}

	events <- ev
	events <- event.Events{} // Send empty event to ensure prior events are processed

	metrics, err := reg.Gather()
	if err != nil {
		t.Fatalf("Cannot gather from registry: %v", err)
	}

	droppedMetricValue := getFloat64(metrics, "dropped_metric", prometheus.Labels{})
	if droppedMetricValue != nil {
		t.Errorf("Metric 'dropped_metric' was found, but it should have been dropped. Value: %f", *droppedMetricValue)
	}

	keptMetricValue := getFloat64(metrics, "kept_metric", prometheus.Labels{})
	if keptMetricValue == nil {
		t.Errorf("Metric 'kept_metric' was not found, but it should have been present.")
	} else if *keptMetricValue != 1 {
		t.Errorf("Metric 'kept_metric' has value %f, expected 1", *keptMetricValue)
	}

	// Check that the drop action was recorded
	var dropActionMetric dto.Metric
	err = exporterEventsActions.WithLabelValues("drop").Write(&dropActionMetric)
	if err != nil {
		t.Fatalf("Error writing drop action metric: %v", err)
	}
	if dropActionMetric.Counter.GetValue() != 1 {
		t.Errorf("Expected drop action count to be 1, got %f", dropActionMetric.Counter.GetValue())
	}
}

pedro-stanaka avatar May 08 '25 20:05 pedro-stanaka

Strange.

These are the metrics I was trying to catch:

airflow_local_task_job_task_exit_2563685_hello_world_print_hello_1
airflow_task_mem_usage_hello_world_print_hello_100
airflow_task_cpu_usage_hello_world_print_hello_100

And this was my initial remap that didn't work:

statsd:
  mappingConfig: |-
    mappings:
      {re-mappings}
      ...
      
      - match: "airflow_local(.+)"
        match_type: regex
        action: drop
        name: "dropped"

      - match: "(.+)_task_mem_usage_(.*)"
        match_type: regex
        action: drop
        name: "dropped"

      - match: "(.+)_task_cpu_usage_(.*)"
        match_type: regex
        action: drop
        name: "dropped"

Did you configure a catch-all for metrics to be kept at the end of the config?

Not at first, no. Right now I only have a catch-all in place, because we're generating so many metrics that our collector is throwing "body too large" errors.

Alternatively (and off topic for this issue), I would have liked to remap only those metrics, so that airflow_local_task_job_task_exit_2563685_hello_world_print_hello_1 becomes task_exit{dag_id: hello_world, task_id: print_hello} 1. But because everything is underscore-separated, it's difficult to be sure I'm mapping it correctly, so I thought it best to drop them.

RJNY avatar May 09 '25 14:05 RJNY

And this was my initial remap that didn't work.

When you say it didn't work, what exactly happened? What did you expect, and what are you seeing instead?

You can send some samples with:

echo -n "airflow_task_mem_usage_hello_world_print_hello_100:1|c|#test:true" | nc -u -w1 localhost 8115

But I think I know what is happening here. The match in the mapping configuration has to match the metric name as StatsD sends it, and those Airflow metrics use . as the separator, not the _ you see in the final metric name. You have to find out what the original metric name is. Something we do in our case is attach the original metric name as a label, using a catch-all mapping like this:

      - match: ".+"
        match_type: regex
        observer_type: histogram
        histogram_options:
          native_histogram_bucket_factor: 1.1
          native_histogram_max_buckets: 256
        name: "$0"
        labels:
          original_metric_name: "$0"
        honor_labels: true

For this metric, for instance, airflow_task_mem_usage_hello_world_print_hello_100, I am fairly sure the original name is something like:

airflow.task.mem_usage.hello_world.print_hello_100 (give or take).
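To make the mismatch concrete, here is a minimal sketch using Go's standard regexp package. It is not the exporter's own matching code, only an illustration of the regex behaviour. The dotted name is the guess from above, and dottedPattern is a hypothetical replacement pattern; check the real name via the original_metric_name label before relying on either.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in name, using Go's
// standard regexp package. This only illustrates the regex behaviour; it is
// not how statsd_exporter itself is implemented.
func matches(pattern, name string) bool {
	return regexp.MustCompile(pattern).MatchString(name)
}

func main() {
	// Name as it shows up on the Prometheus side (taken from the thread).
	exposed := "airflow_task_mem_usage_hello_world_print_hello_100"
	// Guessed original StatsD name (an assumption, not confirmed).
	original := "airflow.task.mem_usage.hello_world.print_hello_100"

	// Pattern from the mapping that did not drop anything.
	underscorePattern := `(.+)_task_mem_usage_(.*)`
	// Hypothetical pattern written against the dotted name instead.
	dottedPattern := `(.+)\.task\.mem_usage\.(.*)`

	fmt.Println("underscore pattern vs exposed name: ", matches(underscorePattern, exposed))
	fmt.Println("underscore pattern vs original name:", matches(underscorePattern, original))
	fmt.Println("dotted pattern vs original name:    ", matches(dottedPattern, original))
}
```

If the original name really is dotted, the underscore pattern only matches the name that appears after mapping, which would explain why the drop rule never fired while the catch-all did.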

pedro-stanaka avatar May 12 '25 09:05 pedro-stanaka