Add function to identify field data types
Description
Adds a pipeline function that takes a field name and returns a string representation of the value type for the given field.
Explicit return values are:
boolean
string
double
long
list
map
value_node
array_node
object_node
For non-existent fields the function will return null. Values that cannot be explicitly mapped to one of the types above default to their simple Java class name in lowercase, e.g. an unmapped object of type java.lang.Integer would return integer.
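To make the mapping and the fallback concrete, here is a minimal Java sketch of the lookup described above. It is not the code in this PR: the class name `FieldValueTypeSketch` and the method `typeName` are hypothetical, and the three JSON node types are left out so the example runs on the JDK alone.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; names and structure are assumptions,
// not the implementation added by this PR.
final class FieldValueTypeSketch {
    private FieldValueTypeSketch() {}

    // Returns the type name for a field value, following the description above.
    static String typeName(Object value) {
        if (value == null) {
            return null; // non-existent field, as the description states
        }
        if (value instanceof Boolean) return "boolean";
        if (value instanceof String) return "string";
        if (value instanceof Double) return "double";
        if (value instanceof Long) return "long";
        if (value instanceof List) return "list";
        if (value instanceof Map) return "map";
        // Fallback: simple Java class name in lowercase,
        // so java.lang.Integer becomes "integer".
        return value.getClass().getSimpleName().toLowerCase(Locale.ROOT);
    }
}
```

In this sketch the explicit checks run before the fallback, so only values without an explicit mapping ever reach the class-name branch.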
An example rule:
rule "field_value_type"
when
true
then
set_field("boolean", true);
set_field("boolean_type", field_value_type("boolean"));
set_field("string", "hello");
set_field("string_type", field_value_type("string"));
set_field("double", 1.0);
set_field("double_type", field_value_type("double"));
set_field("long", to_long("1L"));
set_field("long_type", field_value_type("long"));
set_field("list", ["one", "two", "three"]);
set_field("list_type", field_value_type("list"));
set_field("map", to_map({foo:"bar"}));
set_field("map_type", field_value_type("map"));
set_field("value_node", parse_json("[\"Ford\", \"BMW\", \"Fiat\"]")[0]);
set_field("value_node_type", field_value_type("value_node"));
set_field("array_node", parse_json("[\"Ford\", \"BMW\", \"Fiat\"]"));
set_field("array_node_type", field_value_type("array_node"));
set_field("object_node", parse_json("{\"foo\": \"bar\"}"));
set_field("object_node_type", field_value_type("object_node"));
end
The rule above would produce the following message fields:
array_node: ["Ford","BMW","Fiat"]
array_node_type: array_node
boolean: true
boolean_type: boolean
double: 1
double_type: double
field: {"hello":"world"}
list: ["one","two","three"]
list_type: list
long: 0
long_type: long
map: {"foo":"bar"}
map_type: map
object_node: {"foo":"bar"}
object_node_type: object_node
string: hello
string_type: string
value_node: Ford
value_node_type: value_node
Motivation and Context
Closes: https://github.com/Graylog2/graylog-plugin-enterprise/issues/7132
How Has This Been Tested?
Locally in dev and unit tests.
Screenshots (if appropriate):
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Refactoring (non-breaking change)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
Checklist:
- [X] My code follows the code style of this project.
- [ ] My change requires a change to the documentation.
- [ ] I have updated the documentation accordingly.
- [X] I have read the CONTRIBUTING document.
- [X] I have added tests to cover my changes.
@miwent how does this look for you? Specifically, am I missing any types, being overly specific with any, or do any need renaming?
It's mostly just a catch-all for the is_* functions, with some JSON type checking as well (seemed like that might come in handy, but is there a use case for those?).
For non-existent fields the function will return an actual null object, but would it be better to return the String "null" or something else?
For unmapped types the function tries to infer the Java type, but would it be better to return something like "unknown"?
This looks good! The issue with returning a null is that it will generate an error message in the server log when we use it in the when block of a pipeline rule, at least it used to. Returning the string "null" value would prevent this. For the second question the inferred type works.
Ok great, will update to return the String "null" then.
Ok updated to return the string "null" for missing fields, and use the cleaner enum structure.
Waiting on resolution in https://github.com/Graylog2/graylog-plugin-enterprise/issues/7132 to finalize and merge this.