Sampling override with http.response.status_code doesn't work
Expected behavior
If you add a sampling override that filters out all requests with a specific HTTP response status, those requests shouldn't be shown in Application Insights.
Actual behavior
HTTP requests with the specified status code are shown in Application Insights.
To Reproduce
- Create a simple Spring Boot application with the health actuator endpoint enabled
- Create an applicationinsights.json that includes the sampling setting below:
  "sampling": {
    "percentage": 100,
    "overrides": [
      {
        "telemetryType": "request",
        "attributes": [
          {
            "key": "http.response.status_code",
            "value": 200,
            "matchType": "strict"
          }
        ],
        "percentage": 0
      }
    ]
  },
- Do a GET request to the http://localhost:8080/actuator/health endpoint
  - It should return a 200 response code with the following payload: { "status": "UP" }
  - This request shouldn't be shown in Application Insights
- Do a GET request to the http://localhost:8080/actuator/invalid endpoint
  - This request should be shown in Application Insights because you get a 404 error
System information
Please provide the following information:
- SDK Version 3.5.1 (Telemetry SDK Version: 1.35.0)
- OS type and version: Windows 11
- Application Server type and version (if applicable): Tomcat
- Using spring-boot? Yes
- Additional relevant libraries (with version, if applicable): n/a
Logs
2024-04-22 10:28:52.795+02:00 DEBUG c.m.a.a.i.exporter.AgentSpanExporter - exporting span: SpanData{spanContext=ImmutableSpanContext{traceId=0dad507a37d7da13c3a81e9139723846, spanId=4eadcbc1fe5c4e8b, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=true}, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, resource=Resource{schemaUrl=null, attributes={service.name="appinsights", telemetry.sdk.language="java", telemetry.sdk.name="opentelemetry", telemetry.sdk.version="1.35.0"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=io.opentelemetry.tomcat-10.0, version=2.1.0-alpha, schemaUrl=null, attributes={}}, name=GET /actuator/health, kind=SERVER, startEpochNanos=1713774532712869100, endEpochNanos=1713774532774249100, attributes=AttributesMap{data={thread.id=65, http.request.method=GET, http.route=/actuator/health, http.response.status_code=200, network.peer.address=127.0.0.1, server.address=localhost, client.address=127.0.0.1, url.path=/actuator/health, server.port=8080, network.protocol.version=1.1, user_agent.original=Apache-HttpClient/4.5.14 (Java/17.0.10), network.peer.port=60098, url.scheme=http, thread.name=http-nio-8080-exec-4, applicationinsights.internal.is_pre_aggregated=true}, capacity=128, totalAddedValues=15}, totalAttributeCount=15, events=[], totalRecordedEvents=0, links=[], totalRecordedLinks=0, status=ImmutableStatusData{statusCode=UNSET, description=}, hasEnded=true}
2024-04-22 10:28:57.251+02:00 DEBUG c.a.m.o.e.i.p.TelemetryItemExporter - sending telemetry to ingestion service:
{"ver":1,"name":"Metric","time":"2024-04-22T08:28:57.251Z","iKey":"ec7d4b96-3d1e-405a-8d5f-0d90258b5785","tags":{"ai.internal.sdkVersion":"java:3.5.1","ai.cloud.roleInstance":"...","ai.cloud.role":"appinsights"},"data":{"baseType":"MetricData","baseData":{"ver":2,"metrics":[{"name":"_OTELRESOURCE_","value":0.0}],"properties":{"telemetry.sdk.language":"java","service.name":"appinsights","service.instance.id":"...","telemetry.sdk.version":"1.35.0","telemetry.sdk.name":"opentelemetry"}}}}
{"ver":1,"name":"Request","time":"2024-04-22T08:28:52.712Z","iKey":"ec7d4b96-3d1e-405a-8d5f-0d90258b5785","tags":{"ai.internal.sdkVersion":"java:3.5.1","ai.operation.id":"0dad507a37d7da13c3a81e9139723846","ai.cloud.roleInstance":"...","ai.operation.name":"GET /actuator/health","ai.location.ip":"127.0.0.1","ai.cloud.role":"appinsights","ai.user.userAgent":"Apache-HttpClient/4.5.14 (Java/17.0.10)"},"data":{"baseType":"RequestData","baseData":{"ver":2,"id":"4eadcbc1fe5c4e8b","name":"GET /actuator/health","duration":"00:00:00.061380","success":true,"responseCode":"200","url":"http://localhost:8080/actuator/health","properties":{"_MS.ProcessedByMetricExtractors":"True"}}}}
@zwilling79, you can use an OpenTelemetry extension to filter telemetry based on http.response.status_code. Here is an example of how to filter out telemetry based on duration. You can do something similar.
Hm, this may work. Nonetheless, I would prefer to have this as part of the configuration file so that it can be easily adjusted, especially if it is specific to certain environments. For instance, today I just want to filter out the health check and Prometheus endpoint requests that have a response code of 200. Tomorrow I may want to filter out some additional business application endpoints that have a response code of 200. Compiling, packaging, and distributing the OTel extension JAR for such changes looks like overkill. Furthermore, if you want to use different configurations for different environments, you have to maintain different OTel extension JARs or add more complexity to read and evaluate further configuration files.
I think the problem in the code is that the values of the sampling override attributes are always treated as strings, but the actual attribute is of type integer. So it is perhaps similar to #3378.
Only attributes set at the start of the span are available for sampling, so attributes such as http.response.status_code or request duration won't work for sampling.
Alternatively, you could try a data collection rule (DCR) transformation. A tutorial: https://learn.microsoft.com/en-us/azure/azure-monitor/logs/tutorial-workspace-transformations-portal
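To illustrate why an override on http.response.status_code never matches, here is a minimal, self-contained sketch of a sampling decision made at span start. It is not the agent's actual code: the class, the record, and the simplified percentage handling are all hypothetical, and it assumes url.path is among the attributes already set when the span starts.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the Application Insights agent's implementation.
final class HeadSamplingSketch {

    /** Hypothetical stand-in for one sampling override with a single strict-match attribute. */
    record SamplingOverride(String key, String value, double percentage) {}

    /**
     * Decides at span start whether to keep the span.
     * Only attributes that are already set at that moment are visible here.
     */
    static boolean shouldSample(Map<String, Object> startAttributes, List<SamplingOverride> overrides) {
        for (SamplingOverride override : overrides) {
            Object actual = startAttributes.get(override.key());
            // An attribute that is only set when the span ends is absent here, so it can never match.
            if (actual != null && String.valueOf(actual).equals(override.value())) {
                // Simplification: 0 drops the span, any other percentage keeps it.
                return override.percentage() > 0;
            }
        }
        return true; // no override matched; default percentage of 100 keeps the span
    }

    public static void main(String[] args) {
        // Assumption: these attributes are known when the request span starts.
        Map<String, Object> startAttributes = new HashMap<>();
        startAttributes.put("http.request.method", "GET");
        startAttributes.put("url.path", "/actuator/health");
        // http.response.status_code is deliberately missing: the response does not exist yet.

        List<SamplingOverride> byStatus =
                List.of(new SamplingOverride("http.response.status_code", "200", 0));
        List<SamplingOverride> byPath =
                List.of(new SamplingOverride("url.path", "/actuator/health", 0));

        System.out.println("status-code override keeps span: " + shouldSample(startAttributes, byStatus));
        System.out.println("url.path override keeps span: " + shouldSample(startAttributes, byPath));
    }
}
```

In this sketch the status-code override is silently ignored because the attribute is missing at decision time, while the path-based override drops the health check request.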
It has been very confusing since 3.5.0 which attributes are available for sampling. The docs point you to the "exporting span" line, but that line is basically useless: it includes http.status_code, and it does not print, for example, url.full, which I am able to use even though it is not included in the "exporting span" line. While the next line warns you that only attributes set at the start of the span are available for sampling, it would be great to see which attributes are available when I set my log level to debug.
I have done exactly the same. I enabled debug logging to see which fields I can filter on. And because I saw http.response.status_code=200 in the attributes list, I thought I could filter on it.
We are thinking of adding a warning during startup if sampling overrides use attributes that are known not to be available at span start, such as http.response.status_code.
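A rough sketch of what such a startup check could look like is below. Everything in it is hypothetical (class name, method name, warning text), and the set of end-only attributes contains only the one attribute named in this thread; a real check would need a maintained list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a startup validation, not the agent's actual behavior.
final class OverrideStartupCheckSketch {

    // Assumption: illustrative list only; this thread names http.response.status_code.
    static final Set<String> NOT_AVAILABLE_AT_SPAN_START = Set.of("http.response.status_code");

    /** Returns one warning message per configured override key that cannot match at span start. */
    static List<String> warningsFor(List<String> overrideAttributeKeys) {
        List<String> warnings = new ArrayList<>();
        for (String key : overrideAttributeKeys) {
            if (NOT_AVAILABLE_AT_SPAN_START.contains(key)) {
                warnings.add("Sampling override attribute '" + key
                        + "' is not available at span start and will never match");
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        // Keys as they would appear under "sampling.overrides[].attributes[].key".
        for (String warning : warningsFor(List.of("http.response.status_code", "url.path"))) {
            System.out.println("WARN " + warning);
        }
    }
}
```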