Documentation improvements
I want to make a 'super issue' to track the progress.
I guess we should cover several topics:
- The anatomy of the otel4s library (See #117)
- How a library author can instrument a library. We can provide some guidelines (See #118)
- How an end user can manually instrument a service (actually, it intersects with point 2)
- How to export OpenTelemetry metrics and traces to third-party backends: Honeycomb, Grafana, Jaeger, etc
- How to customize OpenTelemetry: histogram customization, JVM metrics exporters, etc
Currently, I am thinking of the following structure:
- Design notes:
  - [x] Modules structure and dependencies (describe why we have 10+ modules)
  - [x] Tracing context propagation
- Instrumentation
  - [x] Tracing
  - [x] Metrics
  - [ ] Cross-service propagation
  - [x] Interop with JVM instrumented libraries
  - [x] JVM runtime metrics
- Customization
  - [x] Histogram custom buckets
  - [x] ~Histogram custom bucket (file-based configuration) #308~
- Visualization (or exporting?)
  - [x] Jaeger - collecting traces
  - [x] Honeycomb - metrics and traces
  - [ ] Grafana Cloud - metrics and traces
  - [x] Grafana + Prometheus - metrics and traces
What do you think @rossabaker @zmccoy @armanbilge?
Just came across this issue. I had some spare time and made an example project using Grafana for metrics and traces here. Not sure if that's what you were expecting though, as it still uses Jaeger to store the traces. I could adapt the example by replacing Jaeger with Tempo, or even just storing traces in Prometheus if that's better. Let me know what you think!
It would be nice to cover OpenTelemetry + Prometheus too, because a lot of systems use Prometheus to collect metrics from a Kubernetes cluster, for example.
For the simplest case, I was using https://grafana.com/docs/grafana-cloud/data-configuration/otlp/send-data-otlp/.
Just took a look at the link you sent me, and it seems pretty specific to Grafana Cloud. They have a free plan though, so I'll try to find some time to update my code and adapt it to this use case :+1:
@keuhdall As you mentioned, the approach I sent above is suitable only for Grafana Cloud.
I believe we shouldn't change your code. A lot of people are using on-premise Grafana and Prometheus, and your example fully covers this scenario.
I added a "Grafana + Prometheus - metrics and traces" item to the list, so we can keep your example and add further documentation there.
I can probably still get rid of Jaeger in order to push traces directly to Prometheus though. Should I make a PR to update the microsite with the example?
It would be nice!
By the way, in the current example, the counter is recreated on every request https://github.com/keuhdall/otel4s-grafana-example/blob/79223a24c5b121ff2ef39726f2ef69e2fac5e644/src/main/scala/ExampleService.scala#L43-L45.
I assume it should be:
    def apply[F[_]: Async: Tracer: Meter: Random](
        minLatency: Int,
        maxLatency: Int,
        bananaPercentage: Int
    ): F[ExampleService[F]] =
      metricsProvider
        .counter("RemoteApi.fruit.count")
        .withDescription("Number of fruits returned by the API.")
        .create
        .map { counter =>
          new ExampleService[F] {
            ...
          }
        }
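To make the difference concrete without pulling in otel4s, here is a small library-free sketch. Everything in it is hypothetical: `Registry` and `Counter` are toy stand-ins (not otel4s types), and the registry only counts how many times an instrument is built. It assumes nothing about how a real OpenTelemetry SDK caches instruments; it only shows the shape of "build once, then close over the counter" versus "build inside the request handler".

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for an instrument: a thread-safe counter.
final class Counter(val name: String) {
  private val value = new AtomicLong(0)
  def inc(): Unit = { value.incrementAndGet(); () }
  def get: Long = value.get()
}

// Hypothetical stand-in for a meter: tracks how many counters were built.
final class Registry {
  val instrumentsCreated = new AtomicLong(0)
  def counter(name: String): Counter = {
    instrumentsCreated.incrementAndGet()
    new Counter(name)
  }
}

// Shape of the current example: the counter is built on every request.
final class PerRequestService(registry: Registry) {
  def handle(): Unit = registry.counter("RemoteApi.fruit.count").inc()
}

// Suggested shape: the counter is built once, at construction time,
// and every request reuses the same instance.
final class SharedCounterService(val counter: Counter) {
  def handle(): Unit = counter.inc()
}

object SharedCounterService {
  def apply(registry: Registry): SharedCounterService =
    new SharedCounterService(registry.counter("RemoteApi.fruit.count"))
}
```

With three calls to `handle()`, the per-request variant asks the toy registry for three counters, while the shared variant asks for one and increments it three times.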
Oh nice catch! I'll update my code. Also, I tried finding a way to export traces to Prometheus, but it doesn't seem to be possible. At best, spanmetrics can be pushed into Prometheus, but not the traces themselves. I can either keep Jaeger or try to replace it with Tempo, since it's also developed by Grafana Labs. Let me know what you think is best.
I'm keen to see a Tempo example, but I'm not sure how popular it is.
I just added a Tempo example on this branch. I'm not sure it's particularly useful to document this example in detail, as it is pretty much the same as using Jaeger when it comes to displaying traces in Grafana (it's just a matter of choosing Tempo as the datasource instead of Jaeger). I'll try making a PR during the coming week to update the documentation with the initial Grafana / Prometheus / Jaeger example, if that sounds good to you.
For the JVM metrics, would we consider just linking to https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation/runtime-telemetry? That works perfectly by just registering new instruments using the library, and does not require any agent on the side.
For Java 17: https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation/runtime-telemetry/runtime-telemetry-java17/library
Sounds good