Option to set default fields for generic parsers
Let's say I have a generic CSV parser with the following parameters:
"parameters": {
    "columns": [
        "source.url",
        "source.fqdn",
        "source.ip",
        "time.source",
        "__IGNORE__",
        "__IGNORE__"
    ],
    "delimiter": ",",
    "skip_header": true,
    "type": "phishing"
}
I would like to be able to set any other fields (including extra.somekey) to a fixed value. For instance:
protocol.transport = tcp
protocol.application = http
source.port = 80
Setting a fixed value for a field that is also assigned in the columns array could serve as a default value when that column is empty in a particular row.
This is currently not possible with generic parsers and a custom parser would have to be implemented. The only fixed value that can be set is the type parameter, which provides the classification. I would like to request this option as a feature.
Proposed realization:
- add default_fields as subkey of parameters
- rename type to classification.type and move to default_fields for consistency
- allow any other field defined in Harmonization Fields (with the exception of raw, time.source and time.observation fields) to be set to a fixed value (serving also as a default value) in the default_fields of bot configuration
- allow fields starting with extra. to be set to a fixed value in the default_fields of bot configuration
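To illustrate the rename of type to classification.type, here is a minimal sketch of how an existing configuration could be migrated. The helper name migrate_parameters is hypothetical and not part of any existing code; it only works on a plain dict and assumes that an explicit classification.type in default_fields should win over the legacy key.

```python
import copy


def migrate_parameters(parameters):
    """Return a copy of `parameters` with the legacy `type` key moved
    into `default_fields` as `classification.type`.

    Hypothetical helper for illustration only, not existing parser code.
    """
    migrated = copy.deepcopy(parameters)
    defaults = migrated.setdefault("default_fields", {})
    if "type" in migrated:
        legacy_type = migrated.pop("type")
        # Assumption: an explicit classification.type already present in
        # default_fields takes precedence over the legacy `type` key.
        defaults.setdefault("classification.type", legacy_type)
    return migrated


old = {"delimiter": ",", "skip_header": True, "type": "phishing"}
new = migrate_parameters(old)
```

The input dict is left untouched, so old configurations could still be read as before while the new layout is built alongside.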
Example of proposed realization:
"parameters": {
    "columns": [
        "source.url",
        "source.fqdn",
        "source.ip",
        "time.source",
        "__IGNORE__",
        "__IGNORE__"
    ],
    "delimiter": ",",
    "skip_header": true,
    "default_fields": {
        "classification.type": "phishing",
        "protocol.transport": "tcp",
        "protocol.application": "http",
        "source.port": 80,
        "extra.different_system_tag": "web-phishing"
    }
}
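As a rough sketch of the intended behaviour (not the actual parser code), the following standalone function shows how default_fields could be applied to each row: every event starts from the fixed values, and a non-empty column overrides them. It uses only the standard csv module and plain dicts; the function name parse_rows, the sample data and the source.fqdn default are made up for illustration.

```python
import csv
import io

IGNORE = "__IGNORE__"


def parse_rows(text, parameters):
    """Turn CSV text into a list of event dicts, applying default_fields.

    Illustrative sketch only: events are plain dicts, not real event objects.
    """
    defaults = parameters.get("default_fields", {})
    reader = csv.reader(io.StringIO(text),
                        delimiter=parameters.get("delimiter", ","))
    if parameters.get("skip_header", False):
        next(reader, None)  # drop the header row, if there is one
    events = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        event = dict(defaults)  # start from the fixed values
        for key, value in zip(parameters["columns"], row):
            if key == IGNORE:
                continue
            value = value.strip()
            if value:  # only a non-empty column overrides the default
                event[key] = value
        events.append(event)
    return events


PARAMETERS = {
    "columns": ["source.url", "source.fqdn", "source.ip",
                "time.source", "__IGNORE__", "__IGNORE__"],
    "delimiter": ",",
    "skip_header": True,
    "default_fields": {
        "classification.type": "phishing",
        "protocol.transport": "tcp",
        "source.port": 80,
        "source.fqdn": "unknown.invalid",  # made-up default for the demo
    },
}

SAMPLE = (
    "url,fqdn,ip,time,a,b\n"
    "http://example.com/login,example.com,192.0.2.1,2015-01-01T00:00:00+00:00,x,y\n"
    "http://192.0.2.2/login,,192.0.2.2,2015-01-02T00:00:00+00:00,x,y\n"
)

events = parse_rows(SAMPLE, PARAMETERS)
```

In the second sample row the fqdn column is empty, so the event keeps the value from default_fields, while the first row overrides it with the value from the feed.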
I can work on this feature, but I would like to know first if it would be accepted.
Thanks for proposing the idea and describing it so accurately! I like it very much and am looking forward to your PR =)
> This is currently not possible with generic parsers and a custom parser would have to be implemented.
Or an additional expert (modify, sieve) is used.
> allow any other field defined in Harmonization Fields (with the exception of raw, time.source and time.observation fields)
I wouldn't handle them differently. There could be a use case for setting them, and implementing the restriction would only limit what the user can do. (Also, the list of disallowed fields would be kind of arbitrary.)
Please assign this issue to me, I will work on it when I have time.