Validate user data depth to prevent Elasticsearch issues
Description
If the JSON data in a User.data field has a depth > 20, Elasticsearch will not index it. This causes issues during a reindex and results in high CPU usage. We should reject data with a depth > 20 to prevent this. There may be additional improvements to make in the reindex operation (I suspect it's continuously retrying the failed operation).
The max depth limit is similar to the max number of fields limit in Elasticsearch. See also:
- https://github.com/FusionAuth/fusionauth-issues/issues/2457
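A minimal sketch of what such a depth check could look like, assuming the user data is already deserialized into nested `Map` and `Collection` values. The class name `UserDataDepth` and its methods are hypothetical and not part of FusionAuth. The counting rule (each nested object adds one level, arrays add none) is an assumption based on how Elasticsearch documents its mapping depth limit, so the boundary should be verified against a real index.

```java
import java.util.Collection;
import java.util.Map;

// Hypothetical helper, not an existing FusionAuth class.
final class UserDataDepth {
  // Assumed to match the default of Elasticsearch's index.mapping.depth.limit setting.
  static final int DEFAULT_MAX_DEPTH = 20;

  private UserDataDepth() {
  }

  /** Returns true when JSON objects in value are nested more than maxDepth levels deep. */
  static boolean exceedsDepth(Object value, int maxDepth) {
    return exceeds(value, 0, maxDepth);
  }

  // enclosing is the number of JSON objects that already wrap value.
  private static boolean exceeds(Object value, int enclosing, int maxDepth) {
    if (value instanceof Map) {
      int depth = enclosing + 1;
      if (depth > maxDepth) {
        return true; // stop early, so object recursion never goes past maxDepth + 1
      }
      for (Object child : ((Map<?, ?>) value).values()) {
        if (exceeds(child, depth, maxDepth)) {
          return true;
        }
      }
      return false;
    }
    if (value instanceof Collection) {
      // Arrays are assumed not to add a mapping level; only their elements are checked.
      // Note that arrays nested directly inside arrays are not bounded by this check.
      for (Object element : (Collection<?>) value) {
        if (exceeds(element, enclosing, maxDepth)) {
          return true;
        }
      }
    }
    return false; // scalars and null never add depth
  }
}
```

A caller could run `UserDataDepth.exceedsDepth(user.data, UserDataDepth.DEFAULT_MAX_DEPTH)` before persisting and reject the request with a validation error, so the document never reaches the indexer.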
Observed versions
Observed in 1.46.0
Affects versions
Steps to reproduce
Expected behavior
Platform
Linux and Elasticsearch 7.6.1
Related
- https://github.com/FusionAuth/fusionauth-issues/issues/1640
- https://github.com/FusionAuth/fusionauth-issues/issues/2457
Community guidelines
All issues filed in this repository must abide by the FusionAuth community guidelines.
Additional context
Error from the search logs for context:
java.lang.IllegalArgumentException: Limit of total fields [1000] in index [fusionauth_user] has been exceeded
at org.elasticsearch.index.mapper.MapperService.checkTotalFieldsLimit(MapperService.java:614) ~[elasticsearch-7.6.1.jar:7.6.1]
Will a check for a depth of 20 properly mitigate this? I'm not sure the problem is restricted to the data column, so we may need to validate the whole user object to ensure that it fits. We will need to mind the performance of any such approach, though, so a depth check of 20 might be a reasonable approximation.
The log for the depth of 20 being exceeded was:
java.lang.IllegalArgumentException: Limit of mapping depth [20] in index [fusionauth_user] has been exceeded due to object field [redacted]
Elasticsearch threw an IllegalArgumentException for two different (but related) violations.
Gotcha. In that case we may want to examine the set of restrictions at https://www.elastic.co/guide/en/elasticsearch/reference/7.17/mapping-settings-limit.html
I'm not sure if there are equivalent settings in OpenSearch. The closest I could find was https://opensearch.org/docs/latest/install-and-configure/install-opensearch/index/
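For reference, the two errors above appear to correspond to the Elasticsearch settings `index.mapping.total_fields.limit` (default 1000) and `index.mapping.depth.limit` (default 20). As a rough sketch of validating against the first one, the distinct field paths a single document would contribute to the mapping could be collected as below. The class `UserDataFields` is hypothetical. The result is only a lower bound: the limit applies to the whole index mapping across all users, and multi-fields created by dynamic mapping are not counted here.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not an existing FusionAuth class.
final class UserDataFields {
  private UserDataFields() {
  }

  /** Collects the dotted path of every object and leaf field found in data. */
  static Set<String> fieldPaths(Map<String, ?> data) {
    Set<String> paths = new TreeSet<>();
    collect("", data, paths);
    return paths;
  }

  private static void collect(String prefix, Object value, Set<String> paths) {
    if (value instanceof Map) {
      for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
        String path = prefix.isEmpty() ? String.valueOf(entry.getKey()) : prefix + "." + entry.getKey();
        paths.add(path); // object fields count toward the limit as well as leaf fields
        collect(path, entry.getValue(), paths);
      }
    } else if (value instanceof Collection) {
      // Array elements share the path of the field that holds the array.
      for (Object element : (Collection<?>) value) {
        collect(prefix, element, paths);
      }
    }
  }
}
```

Comparing `fieldPaths(user.data).size()` against a budget well below 1000 would catch a single oversized document, but it cannot detect the case where many users each add a few new keys, so it complements rather than replaces the index-level limit.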