cortex icon indicating copy to clipboard operation
cortex copied to clipboard

OTLP native support

Open songjiayang opened this issue 3 years ago • 6 comments

Is your feature request related to a problem? Please describe.

Currently, Cortex only supports Prometheus remote write protocol, with OTel is becoming more and more popular, more apps use OTel instrument SDK to export metrics but we can't push OTLP metrics to Cortex directly.

Describe the solution you'd like

Adding OTLPHttp exporter native support.

Describe alternatives you've considered

Although we can use OTel Collector to transform OTLP to Prometheus remote wirte protocol, which means we need extra maintenance an collector service.

songjiayang avatar Nov 22 '22 03:11 songjiayang

I think this makes lots of sense...

I just dont know if the OTLP data model can be translated to prometheus data model with no lost (histograms?)..

alanprot avatar Nov 22 '22 18:11 alanprot

I just don't know if the OTLP data model can be translated to prometheus data model with no lost (histograms?)..

we can try to follow https://opentelemetry.io/docs/reference/specification/metrics/data-model/#otlp-metric-points-to-prometheus

There are Histograms and ExponentialHistograms. The latter is apparently a no go for prometheus.

But I think for experimental support we just do what's easy (counters, gauges, etc). Nobody is expecting full compatibility in our initial support.

friedrichg avatar Nov 23 '22 20:11 friedrichg

Based on Prometheus dev summit notes https://docs.google.com/document/d/11LC3wJcVk00l8w5P3oLQ-m3Y37iom6INAMEu2ZAGIIE/edit, Prometheus will support this natively soon and we can just reuse the handler. I recommend us to wait for upstream to not reinvent weels ourselves.

Goutham: Reconsider OTLP Ingest
The OpenTelemetry format is gaining adoption and I think we can make a lot of things simpler by adding an ingest layer similar to remote_write ingest to Prometheus.
Safeguards discussion
G: Behind flag not feature flag long term
Chris: gRPC is needed?
G: Yes, but not at first

CONSENSUS: We will implement this behind a feature flag according to the upstream OpenTelemetry Spec on OTLP → Prometheus datamodel mapping. We will have all appropriate safeguards in place before making it a first-class flag.

yeya24 avatar Nov 23 '22 20:11 yeya24

FYI for the upstream OTLP PR https://github.com/prometheus/prometheus/pull/11965.

yeya24 avatar Feb 10 '23 21:02 yeya24

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar May 21 '23 18:05 stale[bot]

https://github.com/prometheus/prometheus/pull/12571 Got merged and we can start integrating it.

I think Cortex cannot just reuse the handler as it is. We need to add some other code to ingest/convert otlp metrics to Cortex request format

yeya24 avatar Jul 28 '23 20:07 yeya24

Hi @friedrichg, just wondering if you're still working on this. I'd like to help out if possible.

CharlieTLe avatar Mar 14 '24 00:03 CharlieTLe

hey @CharlieTLe , yes I am working on it. The function that I was using is about to change https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/31385

Gotta make sure the change is in line with what we need.

friedrichg avatar Mar 14 '24 18:03 friedrichg