[BUG] Unpredictable format of _explain endpoint
What is the bug?
/_plugins/_sql/_explain returns data in two different format depending sometimes only on the sql query sent.
- Verbose format
- Just request format
How can one reproduce the bug?
-
curl http://opensearch:9200/_plugins/_sql/_explain -H 'Content-Type: application/json' -d '{"query":"SELECT coralogix.timestamp, dateAndTime.value, coralogix.metadata.applicationName, appPoolName FROM `logs`"}' -
And it returns
{
"root": {
"name": "ProjectOperator",
"description": {
"fields": "[coralogix.timestamp, dateAndTime.value, coralogix.metadata.applicationName, appPoolName]"
},
"children": [
{
"name": "OpenSearchIndexScan",
"description": {
"request": "OpenSearchQueryRequest(indexName=logs, sourceBuilder={\"from\":0,\"size\":200,\"timeout\":\"1m\",\"_source\":{\"includes\":[\"appPoolName\",\"coralogix.metadata.applicationName\",\"coralogix.timestamp\",\"dateAndTime.value\"],\"excludes\":[]}}, searchDone=false)"
},
"children": []
}
]
}
}
-
curl http://opensearch:9200/_plugins/_sql/_explain -H 'Content-Type: application/json' -d '{"query":"SELECT coralogix.timestamp, dateAndTime.value, coralogix.metadata.applicationName, appPoolName FROM logs WHERE coralogix.timestamp BETWEEN '\''2022-09-04T09:50:54.251000000'\'' AND '\''2022-09-04T09:55:54.251000000'\''"}' -
Returns
{"from":0,"size":200,"query":{"bool":{"filter":[{"bool":{"must":[{"range":{"coralogix.timestamp":{"from":"2022-09-04T09:50:54.251000000","to":"2022-09-04T09:55:54.251000000","include_lower":true,"include_upper":true,"boost":1.0}}}],"adjust_pure_negative":true,"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}},"_source":{"includes":["coralogix.timestamp","dateAndTime.value","coralogix.metadata.applicationName","appPoolName"],"excludes":[]}}
- It returns also only ES request when one sends request with
fetch_sizelike {"query": "...", "fetch_size": 10}`
What is the expected behavior? Always return results in single format.
What is your host/environment?
"distribution" : "opensearch",
"number" : "2.1.0",
"build_type" : "tar",
"build_hash" : "388c80ad94529b1d9aad0a735c4740dce2932a32",
"build_date" : "2022-06-30T21:32:59.335435368Z",
"build_snapshot" : false,
"lucene_version" : "9.2.0",
"minimum_wire_compatibility_version" : "7.10.0",
"minimum_index_compatibility_version" : "7.0.0"
https://github.com/opensearch-project/sql/issues/218#issuecomment-949986586
@Yury-Fridlyand I'm reporting that when I change the sql add BETWEEN operator to the query it is returned in the old format. Is it fixed in newer version? I would expect to get the new format always.
It looks we use the newest version {"name":"opensearch-sql","version":"2.1.0.0","opensearch_version":"2.1.0","java_version":"11","description":"OpenSearch SQL","classname":"org.opensearch.sql.plugin.SQLPlugin","custom_foldername":"","extended_plugins":[],"has_native_controller":false}
Or is there some parameter to force it in the new format?
@fokot,
There are two SQL engines into the plugin: old one (legacy) aka V1 and new one aka V2. Some features of V1 are not implemented in V2 yet. We are working on it, and once we finish this, V1 will be deactivated.
V2's _explain has less verbose format, because it does query optimization and post-processing.
When you send a query to the plugin, it tries to execute it on V2, and if it fails, falls back to V1. BETWEEN clause it not implemented in V2 yet, this describes why _explain for a query with BETWEEN looks different - it is executed on V1.
No, there is no parameter to switch between _explain formats. Perhaps, we need to create one.
Thanks for the explanation @Yury-Fridlyand 👍🙂
@fokot Thanks for reporting the issue! We're actively working on the deprecate of legacy engine to avoid such confusion.
Support for BETWEEN in v2 engine is done in https://github.com/opensearch-project/sql/pull/1187. Closing this and will continue tracking this in legacy engine deprecate issue. Thanks!