query using distinct and upper() function gets incorrect result
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
CREATE TABLE `t0ae88432` (
`col_80` float NOT NULL DEFAULT '7201.529',
`col_81` datetime NOT NULL DEFAULT '2009-10-29 00:00:00',
PRIMARY KEY (`col_81`) /*T![clustered_index] CLUSTERED */,
UNIQUE KEY `idx_18` (`col_80`,`col_81`),
KEY `idx_19` (`col_80`,`col_81`),
KEY `idx_20` (`col_80`,`col_81`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
INSERT INTO `t0ae88432` VALUES(3475.224,'1975-01-15 00:00:00'),(409.6905,'1976-09-23 00:00:00'),(6751.6357,'1977-02-10 00:00:00'),(3431.0586,'1977-11-09 00:00:00'),(3464.251,'1985-03-05 00:00:00'),(2169.5051,'1985-05-13 00:00:00'),(6177.5654,'1986-04-27 00:00:00'),(1943.7211,'1986-08-25 00:00:00'),(2561.1821,'1987-01-20 00:00:00'),(6293.43,'1987-05-12 00:00:00'),(3431.0586,'1990-01-12 00:00:00'),(62.27955,'1991-08-03 00:00:00'),(9028.825,'1994-03-07 00:00:00'),(867.9517,'1994-04-06 00:00:00'),(1233.6375,'1994-04-13 00:00:00'),(9860.563,'1994-04-17 00:00:00'),(6670.144,'1996-03-18 00:00:00'),(8065.4614,'1997-01-05 00:00:00'),(3632.3801,'1997-10-08 00:00:00'),(1687.8965,'1997-10-12 00:00:00'),(9607.235,'1998-08-20 00:00:00'),(3016.6719,'1998-12-10 00:00:00'),(5192.988,'2000-08-24 00:00:00'),(702.51416,'2001-02-28 00:00:00'),(9208.057,'2003-10-26 00:00:00'),(7094.3926,'2003-12-20 00:00:00'),(3431.0586,'2004-05-18 00:00:00'),(1341.5714,'2005-12-14 00:00:00'),(4650.515,'2008-06-01 00:00:00'),(4594.913,'2009-06-13 00:00:00'),(8106.5312,'2009-08-25 00:00:00'),(3716.5352,'2011-11-04 00:00:00'),(3608.7253,'2012-02-18 00:00:00'),(4877.622,'2012-05-02 00:00:00'),(7177.8843,'2012-08-16 00:00:00'),(5032.0723,'2012-09-05 00:00:00'),(8022.795,'2013-08-27 00:00:00'),(2012.5459,'2013-12-01 00:00:00'),(1181.7646,'2014-03-06 00:00:00'),(1987.5009,'2014-03-07 00:00:00'),(7246.529,'2014-04-15 00:00:00'),(6651.0664,'2014-12-30 00:00:00'),(3636.4275,'2015-11-10 00:00:00'),(8069.625,'2016-04-29 00:00:00'),(9827.414,'2016-08-31 00:00:00'),(1784.0686,'2017-02-06 00:00:00'),(2042.9806,'2017-06-18 00:00:00'),(8133.43,'2018-05-28 00:00:00'),(1454.2546,'2018-10-28 00:00:00'),(9165.09,'2019-04-05 00:00:00'),(1571.7965,'2022-10-04 00:00:00'),(2886.9546,'2023-05-19 00:00:00'),(9009.965,'2023-08-14 00:00:00'),(8252.349,'2024-03-10 00:00:00'),(3431.0586,'2026-05-04 00:00:00'),(7585.7324,'2026-12-11 00:00:00'),(3622.543,'2027-04-16 00:00:00'),(7034.2085,'2030-04-10 00:00:00'),(7214.183,'2033-01-10 00:00:00'),(8007.038,'2033-02-26 00:00:00'),(7033.7773,'2034-01-04 00:00:00'),(7038.205,'2035-07-02 00:00:00'),(2873.2979,'2036-03-27 00:00:00'),(8535.684,'2036-11-04 00:00:00'),(1486.3173,'2036-11-16 00:00:00');
(SELECT DISTINCT UPPER(`t0ae88432`.`col_80`) AS `r0` FROM `t0ae88432`) except (SELECT UPPER(`t0ae88432`.`col_80`) AS `r0` FROM `t0ae88432`);
2. What did you expect to see? (Required)
empty
3. What did you see instead (Required)
mysql> (SELECT DISTINCT UPPER(`t0ae88432`.`col_80`) AS `r0` FROM `t0ae88432`) except (SELECT UPPER(`t0ae88432`.`col_80`) AS `r0` FROM `t0ae88432`);
+-----------+
| r0 |
+-----------+
| 8106.5313 |
+-----------+
1 row in set (0.01 sec)
The original data is 8106.5312, which is different from 8106.5313.
4. What is your TiDB version? (Required)
4e416993cc65b4da6786a7ce859b2ab99e5bba15
(SELECT UPPER(t0ae88432.col_80) AS r0FROMt0ae88432) except (SELECT UPPER(t0ae88432.col_80) AS r0FROMt0ae88432`);
+-----------+
| r0 |
+-----------+
| 8106.5313 |
+-----------+
1 row in set
(SELECT (t0ae88432.col_80) AS r0 FROM t0ae88432) except (SELECT (t0ae88432.col_80) AS r0 FROM t0ae88432);
+----+
| r0 |
+----+
+----+
`
It is the same as #51109, cast(float as char) get wrong result in TiKV. The case can be simplified to
mysql> desc aaa;
+-------+-------+------+------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------+------+------+---------+-------+
| col1 | float | YES | | NULL | |
+-------+-------+------+------+---------+-------+
1 row in set (0.00 sec)
mysql> select * from aaa;
+-----------+
| col1 |
+-----------+
| 8106.5312 |
+-----------+
1 row in set (0.01 sec)
mysql> explain select /*+ agg_to_cop() */ distinct cast(col1 as char(10)) from aaa;
+---------------------------+---------+-----------+---------------+-------------------------------------------------------+
| id | estRows | task | access object | operator info |
+---------------------------+---------+-----------+---------------+-------------------------------------------------------+
| HashAgg_7 | 1.00 | root | | group by:Column#5, funcs:firstrow(Column#5)->Column#3 |
| └─TableReader_8 | 1.00 | root | | data:HashAgg_4 |
| └─HashAgg_4 | 1.00 | cop[tikv] | | group by:cast(test.aaa.col1, var_string(10)), |
| └─TableFullScan_6 | 1.00 | cop[tikv] | table:aaa | keep order:false |
+---------------------------+---------+-----------+---------------+-------------------------------------------------------+
4 rows in set (0.00 sec)
mysql> select /*+ agg_to_cop() */ distinct cast(col1 as char(10)) from aaa;
+------------------------+
| cast(col1 as char(10)) |
+------------------------+
| 8106.5313 |
+------------------------+
1 row in set (0.00 sec)
mysql> explain select distinct cast(col1 as char(10)) from aaa;
+----------------------------+---------+-----------+---------------+----------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+----------------------------+---------+-----------+---------------+----------------------------------------------------------------------------------------------+
| HashAgg_6 | 1.00 | root | | group by:Column#8, funcs:firstrow(Column#7)->Column#3 |
| └─Projection_12 | 1.00 | root | | cast(test.aaa.col1, var_string(10))->Column#7, cast(test.aaa.col1, var_string(10))->Column#8 |
| └─TableReader_11 | 1.00 | root | | data:TableFullScan_10 |
| └─TableFullScan_10 | 1.00 | cop[tikv] | table:aaa | keep order:false |
+----------------------------+---------+-----------+---------------+----------------------------------------------------------------------------------------------+
4 rows in set (0.00 sec)
mysql> select distinct cast(col1 as char(10)) from aaa;
+------------------------+
| cast(col1 as char(10)) |
+------------------------+
| 8106.5312 |
+------------------------+
1 row in set (0.00 sec)
Can we lower the severity to major @jebter
Duplicated with https://github.com/pingcap/tidb/issues/51109, close it
/close
@yibin87: Closing this issue.
In response to this:
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.