Update expfmt/text_parse.go to support the new UTF-8 syntax

Open ywwg opened this issue 2 years ago • 13 comments

text_parse is a Go implementation of a parser for the plain Prometheus text format.

ywwg avatar Dec 14 '23 16:12 ywwg

I spoke with @ywwg and I would be willing to help with this one over the holidays. Could I get it assigned to me, please?

jmichalek132 avatar Dec 15 '23 11:12 jmichalek132

Notes from the chat:

the basic changes are:

  • current code assumes there must be a metric name before "{". Now, it is ok if a line begins with "{"
  • currently, terms inside {} must have an operator, like = or =~. Now, if there is a quoted term without an operator, that is the metric name. There can only be one term without an operator.
  • the left hand side of operators inside {} can be quoted
  • for HELP and TYPE and UNIT lines, the metric name may be in quotes.
  • best way to attack this is probably creating the test cases and then fixing the code so they all pass
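
As a rough illustration of the first three rules, here is a minimal Go sketch. It is not the expfmt code: `sampleHead`, `isQuoted` and `parseSampleHead` are hypothetical names, only the `=` operator is handled, and escapes, commas, or braces inside quoted strings are deliberately ignored to keep it short.

```go
package main

import (
	"errors"
	"strings"
)

// sampleHead holds the metric name and labels found on one sample line.
type sampleHead struct {
	Name   string
	Labels map[string]string
}

// isQuoted reports whether s is wrapped in double quotes.
func isQuoted(s string) bool {
	return len(s) >= 2 && s[0] == '"' && s[len(s)-1] == '"'
}

// parseSampleHead is a hypothetical helper, not part of expfmt. It reads the
// optional bare name and the {...} block of a sample line. Assumption: no
// escapes, commas, or braces appear inside quoted strings.
func parseSampleHead(line string) (sampleHead, error) {
	out := sampleHead{Labels: map[string]string{}}
	open := strings.IndexByte(line, '{')
	end := strings.IndexByte(line, '}')
	if open < 0 || end < open {
		return out, errors.New("no complete {...} block")
	}
	// A name before "{" is now optional, so an empty prefix is fine.
	out.Name = strings.TrimSpace(line[:open])
	haveName := out.Name != ""
	for _, term := range strings.Split(line[open+1:end], ",") {
		term = strings.TrimSpace(term)
		if term == "" {
			continue
		}
		key, val, hasOp := strings.Cut(term, "=")
		if !hasOp {
			// A quoted term without an operator is the metric name,
			// and there may be only one name per line.
			if !isQuoted(term) {
				return out, errors.New("unquoted term without operator: " + term)
			}
			if haveName {
				return out, errors.New("metric name given more than once")
			}
			out.Name, haveName = term[1:len(term)-1], true
			continue
		}
		key, val = strings.TrimSpace(key), strings.TrimSpace(val)
		if isQuoted(key) {
			key = key[1 : len(key)-1] // the left-hand side may be quoted
		}
		if !isQuoted(val) {
			return out, errors.New("label value must be quoted: " + term)
		}
		out.Labels[key] = val[1 : len(val)-1]
	}
	if !haveName {
		return out, errors.New("missing metric name")
	}
	return out, nil
}
```

Calling `parseSampleHead` on `{"my.noncompliant.metric", label="value"} 1` yields the name `my.noncompliant.metric` and one label, while `{"a", "b"} 1` is rejected because it has two terms without an operator. Cases like these could seed the test table suggested in the last bullet.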

jmichalek132 avatar Dec 15 '23 11:12 jmichalek132

👍🏽 Thanks!

bwplotka avatar Dec 15 '23 13:12 bwplotka

@jmichalek132 checking in, any progress on this?

ywwg avatar Jan 11 '24 18:01 ywwg

> @jmichalek132 checking in, any progress on this?

Sorry, unfortunately not. I was sick over the holidays, but I'll take a stab at this over the weekend.

jmichalek132 avatar Jan 11 '24 19:01 jmichalek132

So, finally looking into this today. The first thing I am unclear on, based on the proposal, is what the behaviour should be for the metric name in the Prometheus metadata. I assume that if the metric name is valid, it should keep working as is. If it has UTF-8 characters, it has to be quoted in the metadata. Is this assumption correct, @ywwg?

Examples: This should be ok:

# HELP "my.noncompliant.metric" help text
# TYPE "my.noncompliant.metric" counter
{"my.noncompliant.metric", label="value"} 1

This is not ok, because the proposal says single quotes are not available in the exposition format:

Escape syntax if the metric has a quote: sum(rate({"my \"quoted\" metric", region="east"}[5m])) or use single quotes in PromQL (not available in the exposition format): sum(rate({'my "quoted" metric', region="east"}[5m]))

# HELP 'my.noncompliant.metric' help text
# TYPE 'my.noncompliant.metric' counter
{'my.noncompliant.metric', label="value"} 1

This is not ok:

# HELP my.noncompliant.metric help text
# TYPE my.noncompliant.metric counter
{"my.noncompliant.metric", label="value"} 1

This is not ok either:

# HELP my.noncompliant.metric help text
# TYPE "my.noncompliant.metric" counter
{"my.noncompliant.metric", label="value"} 1
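
The four cases above boil down to one rule for the name field of a HELP or TYPE line, which can be sketched in Go as follows. `metadataNameOK` and `legacyNameRE` are hypothetical names, not expfmt API; the regular expression is the traditional Prometheus metric name character set, and the check only looks at the quoting, not at escapes inside the quotes.

```go
package main

import "regexp"

// legacyNameRE matches the traditional Prometheus metric name character set.
var legacyNameRE = regexp.MustCompile(`^[a-zA-Z_:][a-zA-Z0-9_:]*$`)

// metadataNameOK is a hypothetical check. token is the name field exactly as
// written on a HELP or TYPE line, including any surrounding quotes.
func metadataNameOK(token string) bool {
	if legacyNameRE.MatchString(token) {
		return true // legacy names keep working unquoted
	}
	// Anything else must be wrapped in double quotes.
	// Single quotes are not accepted in the exposition format.
	return len(token) >= 2 && token[0] == '"' && token[len(token)-1] == '"'
}
```

Under this sketch, `"my.noncompliant.metric"` and `http_requests_total` pass, while `'my.noncompliant.metric'` and the unquoted `my.noncompliant.metric` fail, which matches the examples above.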

Also, when implementing all these changes, should I do it in a way that lets UTF-8 support be turned on with a flag, or just implement the changes directly?

jmichalek132 avatar Jan 15 '24 21:01 jmichalek132

Also, when the metric name is surrounded by double quotes, I assume we need to validate that any double quote inside the metric name is properly escaped, right? Otherwise the parsing could break. E.g. {"my "quoted\" metric", label="value"} 1

jmichalek132 avatar Jan 15 '24 23:01 jmichalek132

In the other parsers, I changed them so they can just read the new format without a flag. It was not practical to create switching logic without duplicating all the parsers. Let's start by not using a flag.

> I assume if the metric name is valid it should keep working as is. If it has utf-8 characters it has to be quoted around in the metadata.

Yes that's correct.

The current parser supports escaping already, so it should be fine to use that escaping inside double quotes for the TYPE and HELP lines:

# HELP "my.\"noncompliant\".metric" help text
# TYPE "my.\"noncompliant\".metric" counter
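
To make the escaping concrete, here is a minimal Go sketch of reading one double-quoted name and unescaping it. `readQuotedName` is a hypothetical helper, not the existing parser code, and it assumes the escapes are the same ones the text format uses for label values (`\\`, `\"` and `\n`).

```go
package main

import (
	"errors"
	"strings"
)

// readQuotedName is a hypothetical helper. s must start with a double quote.
// It returns the unescaped name and whatever follows the closing quote.
// Assumption: the allowed escapes are \\, \" and \n, as for label values.
func readQuotedName(s string) (name, rest string, err error) {
	if len(s) == 0 || s[0] != '"' {
		return "", "", errors.New("expected opening double quote")
	}
	var b strings.Builder
	// Byte-wise scanning is safe for UTF-8 input because '"' and '\\' are
	// ASCII and never occur inside a multi-byte sequence.
	for i := 1; i < len(s); i++ {
		switch c := s[i]; c {
		case '"':
			// Unescaped quote: the name ends here.
			return b.String(), s[i+1:], nil
		case '\\':
			i++
			if i >= len(s) {
				return "", "", errors.New("dangling backslash")
			}
			switch s[i] {
			case '\\', '"':
				b.WriteByte(s[i])
			case 'n':
				b.WriteByte('\n')
			default:
				return "", "", errors.New("invalid escape sequence")
			}
		default:
			b.WriteByte(c)
		}
	}
	return "", "", errors.New("unterminated quoted name")
}
```

Given `"my.\"noncompliant\".metric" help text`, this returns the name `my."noncompliant".metric` and the remainder ` help text`. An unescaped inner quote, as in `"my "quoted\" metric"`, ends the name early after `my `, which is why the input needs proper escaping.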

ywwg avatar Jan 17 '24 14:01 ywwg

Checking in on this again, is there a PR attached?

ywwg avatar Mar 06 '24 21:03 ywwg

Sorry for the delays. I didn't have a chance to make much progress on this. If the delays are an issue, please do re-assign it to someone else. Otherwise I should be able to get time to work on it and publish a draft next weekend.

jmichalek132 avatar Mar 10 '24 18:03 jmichalek132

Thanks for letting me know! Yeah, I think we need to move ahead on this more quickly, so I'll be reassigning it. I appreciate your contribution regardless.

ywwg avatar Mar 11 '24 16:03 ywwg

@fedetorres93

ywwg avatar Mar 11 '24 17:03 ywwg

I'll take this issue 👍

fedetorres93 avatar Mar 12 '24 12:03 fedetorres93

done!

ywwg avatar Aug 21 '24 14:08 ywwg