Provide guidelines for how to structure a processor config
Feature Request
A common task in a processor is to pick a tag, field or measurement, modify the key or the value, and save it over the old value or to a new field/tag. Now that we are beginning to build a library of utility processors of this style, I have noticed that the naming conventions and configuration style are not very uniform between them.
To come up with some guidelines and best practices, I did a survey of the plugins to see if we can take steps to unify their feel. Once finalized, this information should be included in the contributing documentation or on the wiki.
I primarily considered the following processors. The remaining processors cover other tasks, and I don't think they need to be unified to the same degree:
- rename (to be released 1.8)
- enum (to be released 1.8)
- regex (released in 1.7)
- converter (released in 1.7)
- strings (unmerged)
- replace (unmerged)
- math (unmerged)
I identified two main styles for structuring the plugin configuration. Below I refer to them as the tag/field table style and the operation table style.
Tag/Field table configs have a subtable named tag/tags or field/fields:
[[processors.rename]]
[[processors.rename.tag]]
from = "hostname"
to = "host"
Operation tables have a table named after the operation to perform:
[[processors.strings]]
[[processors.strings.lowercase]]
tag = "method"
The current breakdown between the styles is as follows:
- rename (tags/fields)
- enum (tags/fields)
- regex (tags/fields)
- converter (tags/fields)
- strings (operation)
- replace (no subtables)
- drop_strings (no options)
- math (combo)
Converting between the two formats is simple. Here is the rename processor in both styles:
[[processors.rename]]
[[processors.rename.tag]]
from = "hostname"
to = "host"
[[processors.rename]]
[[processors.rename.replace]]
tag = "hostname"
dest = "host"
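To illustrate how mechanical the conversion is, here is a minimal Go sketch. The type and function names (`FromTo`, `TagFieldConfig`, `Replace`, `toOperationStyle`) are hypothetical and are not taken from any existing plugin; the structs are filled in by hand rather than decoded from TOML.

```go
package main

import "fmt"

// FromTo is one entry of a tag/field-style subtable (hypothetical type).
type FromTo struct {
	From string
	To   string
}

// TagFieldConfig mirrors the tag/field table style: one list per kind of key.
type TagFieldConfig struct {
	Tag   []FromTo
	Field []FromTo
}

// Replace is one entry of the operation style. Exactly one of Tag or Field
// is expected to be set; Dest is the new name.
type Replace struct {
	Tag   string
	Field string
	Dest  string
}

// toOperationStyle turns every tag/field entry into one operation entry,
// keeping tags first and fields second.
func toOperationStyle(c TagFieldConfig) []Replace {
	ops := make([]Replace, 0, len(c.Tag)+len(c.Field))
	for _, t := range c.Tag {
		ops = append(ops, Replace{Tag: t.From, Dest: t.To})
	}
	for _, f := range c.Field {
		ops = append(ops, Replace{Field: f.From, Dest: f.To})
	}
	return ops
}

func main() {
	old := TagFieldConfig{Tag: []FromTo{{From: "hostname", To: "host"}}}
	for _, op := range toOperationStyle(old) {
		fmt.Printf("%+v\n", op)
	}
}
```

The kind of key moves from the table name into the selector option, and `to` becomes `dest`; nothing else changes.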
While today we have more plugins using tags/fields, I think the operation style config has a couple of advantages:
- No potential conflict with the tags option, which is not currently available on processors; I think this is only an oversight, and it would be nice to add. There would be a similar conflict with a fields option if/when this is added.
- Operation-based tables make it easier to have operations with different argument sets, and the arguments are checked for errors automatically. This is because you can map each table to a different type in the plugin struct.
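As a rough sketch of the second point, the plugin struct could hold one slice per operation table, each with its own argument type. The struct layout, the `toml` tags, and the `Validate` and `checkSelector` helpers below are assumptions for illustration, not the actual strings processor code. Whether an unknown key (say, `cutset` under `lowercase`) is rejected automatically depends on the TOML decoder in use, which is not shown here.

```go
package main

import (
	"errors"
	"fmt"
)

// Lowercase holds the arguments of one [[processors.strings.lowercase]] table.
type Lowercase struct {
	Tag   string `toml:"tag"`
	Field string `toml:"field"`
	Dest  string `toml:"dest"`
}

// TrimLeft holds the arguments of one [[processors.strings.trimleft]] table.
// It accepts a cutset argument that Lowercase does not.
type TrimLeft struct {
	Tag    string `toml:"tag"`
	Field  string `toml:"field"`
	Cutset string `toml:"cutset"`
	Dest   string `toml:"dest"`
}

// Strings is a hypothetical plugin struct with one slice per operation table.
type Strings struct {
	Lowercase []Lowercase `toml:"lowercase"`
	TrimLeft  []TrimLeft  `toml:"trimleft"`
}

// checkSelector is a hypothetical helper: exactly one of tag or field must be set.
func checkSelector(tag, field string) error {
	if (tag == "") == (field == "") {
		return errors.New("exactly one of tag or field must be set")
	}
	return nil
}

// Validate applies the per-operation checks to every configured table.
func (s *Strings) Validate() error {
	for i, op := range s.Lowercase {
		if err := checkSelector(op.Tag, op.Field); err != nil {
			return fmt.Errorf("lowercase[%d]: %w", i, err)
		}
	}
	for i, op := range s.TrimLeft {
		if err := checkSelector(op.Tag, op.Field); err != nil {
			return fmt.Errorf("trimleft[%d]: %w", i, err)
		}
		if op.Cutset == "" {
			return fmt.Errorf("trimleft[%d]: cutset is required", i)
		}
	}
	return nil
}

func main() {
	// Built by hand here; in a real plugin a TOML decoder would fill this in.
	cfg := Strings{
		Lowercase: []Lowercase{{Tag: "method"}},
		TrimLeft:  []TrimLeft{{Field: "cs-host", Cutset: "cs-", Dest: "trimmed"}},
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("config ok")
}
```

Because each operation has its own type, an argument that only makes sense for one operation never appears on the others.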
Proposal:
Use operation style subtables instead of tag/field/measurement subtables:
Refactor to operation style:
[[processors.rename]]
[[processors.rename.tag]]
from = "hostname"
to = "host"
Perfect:
[[processors.rename]]
[[processors.rename.replace]]
tag = "hostname"
dest = "host"
Naming fields
Several conventions have been suggested:
- from / to
- old / new
- source / dest
- key / result_key
- tag|field|name / result_key
This is basically a matter of taste, so I'm just going to go with what looks best to me:
Select the tag, field, or measurement using:
- tag
- tags (list)
- field
- fields (list)
- measurement
To overwrite the old value, omit the destination option. To write the result to a new field or tag, use dest.
Examples
Remember that we aren't going to break compatibility on already released plugins, so regex and converter are just examples of how I would do it if writing them today.
Rename and enum are already merged for 1.8; I may update these before the release, time permitting.
## already released; no change planned
[[processors.regex]]
[[processors.regex.repl]]
tag = "resp_code"
pattern = "^(\\d)\\d\\d$"
replacement = "${1}xx"
## already released; no change planned
[[processors.converter]]
[[processors.converter.convert]]
field = "resp_code"
type = "integer"
[[processors.rename]]
[[processors.rename.replace]]
measurement = "network_interface_throughput"
dest = "throughput"
[[processors.rename.replace]]
tag = "hostname"
dest = "host"
[[processors.rename.replace]]
field = "lower"
dest = "min"
[[processors.enum]]
[[processors.enum.map]]
field = "name"
dest = "mapped"
default = 0
[processors.enum.map.value_mappings]
value1 = 1
value2 = 2
[[processors.math]]
[[processors.math.unary]]
function = "abs"
fields = ["io_time", "read_time", "write_time"]
[[processors.strings]]
[[processors.strings.lowercase]]
tag = "method"
[[processors.strings.uppercase]]
field = "cs-host"
[[processors.strings.trimleft]]
field = "cs-host"
cutset = "cs-"
dest = "trimmed"
[[processors.strings.replace]]
measurement = "*"
old = ":"
new = "_"
We should consider how a processor could support operations on tag keys and field keys, instead of only the values. It would be nice to be able to use strings replace on a field key, for example: https://github.com/influxdata/telegraf/issues/5173
We would like to put some documentation around how to structure a processor config. If anyone has anything they would like highlighted please share.