[Feature] Add virtual MQ analysis for native traces.
Search before asking
- [X] I had searched in the issues and found no similar feature requirement.
Description
As we have the virtual database and cache analysis in 9.3.0, let's bring the last missing core analysis, virtual MQ, into the backend analysis.
As we designed years ago, the span layer includes the following options:
public enum SpanLayer {
    DB(1), RPC_FRAMEWORK(2), HTTP(3), MQ(4), CACHE(5);
    // remaining fields and constructor omitted
}
The MQ span layer represents a queue server with consumer and producer sides, such as Kafka, RocketMQ, and Pulsar. The SkyWalking Java agent has also had plugins for the Java clients of these typical, widely used MQs for years.
Now, let's analyze their access load (producing/consuming load), such as X messages per second, and other typical metrics. One special point concerns time: an MQ can consume and produce messages in bulk mode, and some queue clients use pulling mode (typically Kafka), so their spans may include only the pulling time, while spans from some pushing-mode clients may include the time of all further processing.
Also, endpoint_mq_consume_count and endpoint_mq_consume_latency are currently analyzed purely through OAL, even using a tag match expression. We should consider merging them in a consistent way, like the existing VirtualServiceProcessor implementations.
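To make the access-load idea concrete, here is a minimal Java sketch of a "messages per second" calculation in which a bulk span contributes its message count rather than 1. Everything in it is hypothetical: `MqSpanSample`, `MqLoad`, and the `messageCount` field are illustration-only names, not SkyWalking classes or span fields.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sample of one MQ span; NOT a SkyWalking class. */
final class MqSpanSample {
    final String topic;
    final boolean consume;     // true = consumer side, false = producer side
    final int messageCount;    // assumed field: > 1 when the client works in bulk mode
    final long durationMillis; // span duration; for pull-mode clients it may cover only the pull

    MqSpanSample(String topic, boolean consume, int messageCount, long durationMillis) {
        this.topic = topic;
        this.consume = consume;
        this.messageCount = messageCount;
        this.durationMillis = durationMillis;
    }
}

/** Hypothetical helper that turns span samples into an access-load rate. */
final class MqLoad {
    /** Messages per second over a window; each span adds its messageCount, not 1. */
    static double messagesPerSecond(List<MqSpanSample> spans, boolean consume, long windowSeconds) {
        if (windowSeconds <= 0) {
            throw new IllegalArgumentException("windowSeconds must be positive");
        }
        long total = 0;
        for (MqSpanSample s : spans) {
            if (s.consume == consume) {
                total += s.messageCount;
            }
        }
        return (double) total / windowSeconds;
    }

    public static void main(String[] args) {
        List<MqSpanSample> spans = Arrays.asList(
            new MqSpanSample("orders", true, 50, 12),
            new MqSpanSample("orders", true, 70, 15),
            new MqSpanSample("orders", false, 30, 3));
        System.out.println("consume msg/s: " + messagesPerSecond(spans, true, 60));
        System.out.println("produce msg/s: " + messagesPerSecond(spans, false, 60));
    }
}
```

The span duration is deliberately left out of the rate: as noted above, it means different things for pulling and pushing clients, so any latency metric would need to be treated separately.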
Use case
No response
Related issues
No response
Are you willing to submit a PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Please assign to me
Doesn't OAL support labeled AVG metrics?
I'm analyzing MQ metrics, and I think it's better to provide transmission latency metrics labeled by queue or topic, as follows:
mq_transmission_latency = from(MessageQueueAccess.transmissionLatency).filter(operation == MessageQueueOperation.Consume).avgLabeled(MessageQueueAccess.queue);
But AvgLabeledFunction is in the meter package and is annotated with @MeterFunction.
MetricsHolder only initializes classes annotated with @MetricsFunction:
https://github.com/apache/skywalking/blob/5abe6ceb1ff0938be917f8b440c5eb3590cf31db/oap-server/oal-rt/src/main/java/org/apache/skywalking/oal/rt/parser/MetricsHolder.java#L41-L45
Could I implement labeled AVG metrics in OAL if they are not supported?
If the answer to the above question is yes, should I add a new labeled AVG metrics class to the metrics package, instead of using the AvgLabeledFunction class in the meter package?
The existing function is in the meter system because of how its implementation is designed. A MAL function (meter function) is designed for aggregating and merging statistics. Meanwhile, an OAL function targets raw data, typically from agent traces and mesh ALS, where there is one value per request (e.g. latency per request).
When you dig deeper, you will notice that a meter function can't be applied in the OAL scenario.
Finally, of course, you could implement a new label-based avg OAL function, just in a different way.
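As a rough illustration of what a label-based average over raw per-request values could look like, here is a minimal Java sketch. It is an assumption-laden toy: `LabeledAvg` and its methods are hypothetical names, it is not the AvgLabeledFunction from the meter package, and it does not use any real OAL or @MetricsFunction API.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical label-based average; NOT a SkyWalking class or API. */
final class LabeledAvg {
    // label -> {sum, count}
    private final Map<String, long[]> state = new TreeMap<>();

    /** Accept one raw value per request (the OAL-style input), e.g. one latency. */
    void accept(String label, long value) {
        long[] sumCount = state.computeIfAbsent(label, k -> new long[2]);
        sumCount[0] += value;
        sumCount[1] += 1;
    }

    /** Merge the partial state of another instance, e.g. from another time slice. */
    void combine(LabeledAvg other) {
        for (Map.Entry<String, long[]> e : other.state.entrySet()) {
            long[] sumCount = state.computeIfAbsent(e.getKey(), k -> new long[2]);
            sumCount[0] += e.getValue()[0];
            sumCount[1] += e.getValue()[1];
        }
    }

    /** Per-label average using integer division; every stored label has count >= 1. */
    Map<String, Long> calculate() {
        Map<String, Long> result = new TreeMap<>();
        for (Map.Entry<String, long[]> e : state.entrySet()) {
            result.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        LabeledAvg avg = new LabeledAvg();
        avg.accept("topic-a", 10);
        avg.accept("topic-a", 30);
        avg.accept("topic-b", 7);
        System.out.println(avg.calculate());
    }
}
```

The sketch keeps sum and count per label rather than a running average, so partial results can be merged without losing precision.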
I have submitted a draft PR for discussion about labeled metrics: https://github.com/apache/skywalking/pull/9813
mq_transmission_latency = from(MessageQueueAccess.transmissionLatency).filter(operation == MessageQueueOperation.Consume).avgLabeled(MessageQueueAccess.queue);
Besides that technical discussion, I want to go back to the original proposal. Could you explain why labeled metrics are meaningful for OAL? In the Virtual MQ case, the queue or topic name should be an endpoint in today's concept. Then what else would be a label?
OAL didn't cover a label system at the beginning, because the entity is usually already assigned outside of labels (service name is a field of the source), and the various metric types (a common use of labels in the meter system) actually map to various sources. Over the years, it has been hard to find more dimensions that would need labels in OAL, so it ended up as what you see today.
Hmm, I want to provide metrics such as consumer count, producer count, and latency, labeled by topic or queue. And I don't plan to provide an endpoint, I think.
Why not provide an endpoint? The endpoint is actually designed for this. A queue's topic is the same as an HTTP URI.
Hmm, just to reduce the user's operations and the number of UI pages in the browser.
That is not the point of using a label system. A label system (even in Prometheus) exists because its users don't want the entity concept and want to merge the values into one metric for visualization. We have the entity concept, and the UI supports setting multiple metrics for one graph. These are just two solutions to one problem, but neither of them is trying to reduce browser operations. That is a UI/UX design matter.
Also, you could list the key metrics (if there are only a few) directly in the endpoint table, so users don't have to jump into the dashboard. And we could try to disable the jump if necessary.
Let's focus on the right direction, rather than mixing two purposes, which could make the project hard to maintain and confusing to others.
OK, I will provide an endpoint.