
Option to set default fields for generic parsers

Open gethvi opened this issue 4 years ago • 2 comments

Let's say I have a generic CSV parser with the following parameters:

"parameters": {
  "columns": [
    "source.url",
    "source.fqdn",
    "source.ip",
    "time.source",
    "__IGNORE__",
    "__IGNORE__"
  ],
  "delimiter": ",",
  "skip_header": true,
  "type": "phishing"
}

I would like to be able to set any other fields (including extra.somekey) to a fixed value. For instance:

protocol.transport = tcp
protocol.application = http
source.port = 80

Setting a fixed value for a field that is also assigned in the columns array could serve as a default value when that column is empty in a given row.

This is currently not possible with generic parsers and a custom parser would have to be implemented. Only the type parameter can be set, which determines the classification. I would like to request this option as a feature.

Proposed realization:

  • add default_fields as a subkey of parameters
  • rename type to classification.type and move to default_fields for consistency
  • allow any other field defined in Harmonization Fields (with the exception of raw, time.source and time.observation fields) to be set to a fixed value (serving also as a default value) in the default_fields of bot configuration
  • allow fields starting with extra. to be set to a fixed value in the default_fields of bot configuration
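The rename of type to classification.type could be handled in a backward-compatible way. The following is only a minimal sketch under that assumption: normalize_parameters is a hypothetical helper, not part of IntelMQ, and it works on a plain dict instead of the real bot configuration objects.

```python
def normalize_parameters(parameters: dict) -> dict:
    """Return a copy of the parameters with a legacy `type` key moved
    into `default_fields` as `classification.type` (hypothetical helper)."""
    normalized = dict(parameters)
    defaults = dict(normalized.get("default_fields", {}))

    legacy_type = normalized.pop("type", None)
    if legacy_type is not None:
        # An explicit default_fields entry takes precedence over the legacy key.
        defaults.setdefault("classification.type", legacy_type)

    normalized["default_fields"] = defaults
    return normalized


old_style = {"delimiter": ",", "skip_header": True, "type": "phishing"}
new_style = normalize_parameters(old_style)
print(new_style)
```

Letting an explicit default_fields entry win over the legacy key is a design choice of this sketch; the proposal itself does not say how a conflict should be resolved.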

Example of proposed realization:

"parameters": {
  "columns": [
    "source.url",
    "source.fqdn",
    "source.ip",
    "time.source",
    "__IGNORE__",
    "__IGNORE__"
  ],
  "delimiter": ",",
  "skip_header": true,
  "default_fields": {
    "classification.type": "phishing",
    "protocol.transport": "tcp",
    "protocol.application": "http",
    "source.port": 80,
    "extra.different_system_tag": "web-phishing"
  }
}
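To illustrate how such a configuration might behave, here is a rough sketch of parsing one CSV row and then filling in the defaults. It is not the actual generic CSV parser: parse_line and the sample row are made up for illustration, and the event is a plain dict, not an IntelMQ event object.

```python
import csv
import io

IGNORE = "__IGNORE__"


def parse_line(line, columns, default_fields, delimiter=","):
    """Map one CSV row to field names, then apply the defaults (hypothetical helper)."""
    values = next(csv.reader(io.StringIO(line), delimiter=delimiter))
    event = {}
    for key, value in zip(columns, values):
        # Skip ignored columns and empty cells so a default can fill them in.
        if key == IGNORE or value == "":
            continue
        event[key] = value
    for key, value in default_fields.items():
        # A default applies only when the row did not provide a value.
        event.setdefault(key, value)
    return event


columns = ["source.url", "source.fqdn", "source.ip", "time.source", IGNORE, IGNORE]
default_fields = {
    "classification.type": "phishing",
    "protocol.transport": "tcp",
    "protocol.application": "http",
    "source.port": 80,
    "extra.different_system_tag": "web-phishing",
}
# Made-up sample row with an empty source.ip column.
line = "http://example.com/login,example.com,,2021-03-01T18:03:00+00:00,foo,bar"
event = parse_line(line, columns, default_fields)
print(event)
```

In this sketch the empty source.ip column stays unset because default_fields has no entry for it; adding one would make it act as the per-row fallback described above.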

I can work on this feature, but I would like to know first if it would be accepted.

gethvi • Mar 01 '21 18:03

Thanks for proposing and accurately describing the idea! I very much like it and am looking forward to your PR =)

This is currently not possible with generic parsers and a custom parser would have to be implemented.

Or an additional expert (modify, sieve) can be used.

allow any other field defined in Harmonization Fields (with the exception of raw, time.source and time.observation fields)

I wouldn't handle them differently. There could be a use case for setting them, and implementing the restriction would only limit the possibilities for the user. (Also, the list of disallowed fields would be somewhat arbitrary.)

ghost • Mar 02 '21 07:03

Please assign this issue to me, I will work on it when I have time.

gethvi • Oct 26 '21 11:10