[BUG] SQL query doesn't honor date format in OpenSearch index mapping
What is the bug? It seems SQL query engine doesn't honor what's configured in OpenSearch index mapping for date field. This causes problems in different queries with datetime field involved. See examples below.
How can one reproduce the bug?
As documented, "strict_date_optional_time||epoch_millis" is the default format if not specified in index mapping. The issue happens when custom date format is configured as below. Note that this is mostly due to the gaps between engine v2 and the legacy (which may not have these issues at all).
Issue 1: Datetime literal parsing problem
With epoch_millis format removed in mapping, the previous work query throws exception now. From the error message, it seems caused by epoch timestamp used in DSL translated rather than the only strict_date_optional_time configured. Note that OpenSearch doesn't complain this in any syntax/semantic check (probably due to missing semantic check), but throw exception at execution time instead.
PUT my-index-000002
{
"mappings": {
"properties": {
"date": {
"type": "date",
"format": "strict_date_optional_time"
}
}
}
}
PUT my-index-000002/_doc/3
{ "date": "2015-01-01T12:10:30Z" }
POST _plugins/_sql
{
"query": "SELECT * FROM my-index-000002 WHERE `date` < '2022-08-20 23:59:59.999' "
}
{
"error": {
"type": "SearchPhaseExecutionException",
"reason": "Error occurred in OpenSearch engine: all shards failed",
"details": "Shard[0]: OpenSearchParseException[failed to parse date field [1661039999999] with format [strict_date_optional_time]: [failed to parse date field [1661039999999] with format [strict_date_optional_time]]]; nested: IllegalArgumentException[failed to parse date field [1661039999999] with format [strict_date_optional_time]]; nested: NotSerializableExceptionWrapper[date_time_parse_exception: Text '1661039999999' could not be parsed at index 0];\n\nFor more details, please send request for Json format to see the raw response from OpenSearch engine."
},
"status": 503
}
Issue 2: Datetime value parsing problem
Related: https://github.com/opensearch-project/sql/issues/126, https://github.com/opendistro-for-elasticsearch/sql/issues/1062
No matter what date format configured, OpenSearchExprValueFactory always uses the hardcoding formatter in https://github.com/opensearch-project/sql/blob/b0ef5e0299ccbd48cf5b4bc5f68401d2116aef50/opensearch/src/main/java/org/opensearch/sql/opensearch/data/value/OpenSearchExprValueFactory.java#L86. This causes date value parsing error or wrong timezone.
What is the expected behavior? OpenSearch SQL/PPL should honor the date format in index mapping and parse date value from OpenSearch or date literals in query accordingly.
Currently only data type is returned and associated with field. One approach to improve this is reading datetime format from OpenSearch along with basic field type info. Code: https://github.com/opensearch-project/sql/blob/b0ef5e0299ccbd48cf5b4bc5f68401d2116aef50/opensearch/src/main/java/org/opensearch/sql/opensearch/storage/OpenSearchIndex.java#L60
What is your host/environment?
- OpenSearch 2.2
- Plugins: SQL
Do you have any screenshots? N/A
Do you have any additional context? Similar issues may apply to PPL as well due to single core engine shared across languages.
Isn't a duplicate/related to #126?
Isn't a duplicate/related to #126?
I added it to Issue 2. Issue 1 is slightly different. I think both has same root cause.
Mapping:
{
"mappings" : {
"properties" : {
"key" : {
"type" : "keyword"
},
"val" : {
"type" : "date",
"format": "time_no_millis"
}
}
}
}
Data
{ "index" : { "_id" : "1" } }
{"key": "null", "val": null}
{ "index" : { "_id" : "2" } }
{"key": "001: 00:00:00", "val": "08:00:00Z"}
{ "index" : { "_id" : "3" } }
{"key": "002: 00:00:01", "val": "08:00:01Z"}
{ "index" : { "_id" : "4" } }
{"key": "003: 01:00:00", "val": "09:00:00Z"}
Exception stack:
Error happened during query handling
java.lang.IllegalStateException: Construct ExprTimestampValue from "08:00:00Z" failed, unsupported date format.
at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.constructTimestamp(OpenSearchExprValueFactory.java:185) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.parseTimestamp(OpenSearchExprValueFactory.java:195) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.parse(OpenSearchExprValueFactory.java:154) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.lambda$parseStruct$16(OpenSearchExprValueFactory.java:204) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
at java.util.Iterator.forEachRemaining(Iterator.java:133) ~[?:?]
at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.parseStruct(OpenSearchExprValueFactory.java:203) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.parse(OpenSearchExprValueFactory.java:149) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.construct(OpenSearchExprValueFactory.java:122) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.sql.opensearch.response.OpenSearchResponse.lambda$iterator$2(OpenSearchResponse.java:97) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195) ~[?:?]
at java.util.Spliterators$ArraySpliterator.tryAdvance(Spliterators.java:958) ~[?:?]
at java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294) ~[?:?]
at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206) ~[?:?]
at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:169) ~[?:?]
at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300) ~[?:?]
at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681) ~[?:?]
at org.opensearch.sql.opensearch.storage.OpenSearchIndexScan.hasNext(OpenSearchIndexScan.java:90) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.sql.opensearch.executor.protector.ResourceMonitorPlan.hasNext(ResourceMonitorPlan.java:74) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.sql.planner.physical.ProjectOperator.hasNext(ProjectOperator.java:51) ~[core-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.sql.opensearch.executor.OpenSearchExecutionEngine.lambda$execute$0(OpenSearchExecutionEngine.java:39) [opensearch-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.sql.opensearch.client.OpenSearchNodeClient.schedule(OpenSearchNodeClient.java:157) [opensearch-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.sql.opensearch.executor.OpenSearchExecutionEngine.execute(OpenSearchExecutionEngine.java:33) [opensearch-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.sql.sql.SQLService.execute(SQLService.java:66) [sql-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.sql.legacy.plugin.RestSQLQueryAction.lambda$prepareRequest$1(RestSQLQueryAction.java:123) [legacy-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.sql.legacy.plugin.RestSqlAction.lambda$prepareRequest$1(RestSqlAction.java:162) [legacy-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.sql.opensearch.executor.Scheduler.lambda$withCurrentContext$0(Scheduler.java:30) [opensearch-2.4.0.0-SNAPSHOT.jar:?]
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:747) [opensearch-2.4.0-SNAPSHOT.jar:2.4.0-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: java.time.format.DateTimeParseException: Text '08:00:00Z' could not be parsed, unparsed text found at index 8
at java.time.format.DateTimeFormatter.parseResolved0(DateTimeFormatter.java:2049) ~[?:?]
at java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1874) ~[?:?]
at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.constructTimestamp(OpenSearchExprValueFactory.java:182) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
... 29 more
Related issue -- https://github.com/opensearch-project/sql/issues/924. See this comment for root cause.
Reported on forum: https://forum.opensearch.org/t/sql-select-fails-on-date-fields-format-epoch-second/11521/
@Yury-Fridlyand @MaxKsyunz Here are some examples with which users had problem previously:
Index mapping:
PUT /_ingest/pipeline/test-pipeline
{
"description" : "test-pipeline",
"processors" : [
{
"date" : {
"output_format" : "strict_date_optional_time",
"ignore_failure" : false,
"field" : "transactionDateTime",
"target_field" : "transactionDateTime",
"formats" : [
"yyyy-MM-dd HH:mm:ss.SSS Z"
]
}
}
]
}
PUT /_index_template/test
{
"index_patterns" : [
"test-*"
],
"template": {
"aliases": {
"test":{}
},
"mappings": {
"numeric_detection": true,
"dynamic_date_formats": [
"strict_date_optional_time",
"yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z||yyyy-MM-dd HH:mm:ss.SSS Z||strict_date_optional_time"
],
"properties": {
"transactionDateTime" : {
"type" : "date"
}
}
}
}
}
Data:
POST /test-123/_doc
{
"transactionDateTime": "2022-11-01 03:00:52.000 +0000"
}
Query:
SELECT * FROM test* where transactionDateTime = '2022-06-21 03:00:52.000'
@Yury-Fridlyand Are we planning to support this in 2.6.0 release once the PoC # 180 done?
@dai-chen, yes
I am having the same issue as the author after updating to 2.9. Can't do any query with dates