
[Bug]: The result of sum(sin(column)), sum(tan(column)) and sum(cos(column)) is incorrect.

Open Ariznawlll opened this issue 1 year ago • 3 comments

Is there an existing issue for the same bug?

  • [X] I have checked the existing issues.

Branch Name

main

Commit ID

ead69b441080825781ba62c0bc886f4f4b9f9ac5

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

select sum(sin(col9)) from big_data_test.table_basic_for_load_100M; image

select sum(cos(col9)) from big_data_test.table_basic_for_load_100M; image

select sum(tan(col9)) from big_data_test.table_basic_for_load_100M; image

Expected Behavior

No response

Steps to Reproduce

1. create table:
create table if not exists big_data_test.table_basic_for_load_100M(
col1 tinyint,
col2 smallint,
col3 int,
col4 bigint,
col5 tinyint unsigned,
col6 smallint unsigned,
col7 int unsigned,
col8 bigint unsigned,
col9 float,
col10 double,
col11 varchar(255),
col12 Date,
col13 DateTime,
col14 timestamp,
col15 bool,
col16 decimal(16,6),
col17 text,
col18 json,
col19 blob,
col20 binary(255),
col21 varbinary(255),
col22 vecf32(3),
col23 vecf32(3),
col24 vecf64(3),
col25 vecf64(3)
);
2. load data into table:
load data url s3option {'endpoint'='http://cos.ap-guangzhou.myqcloud.com','access_key_id'='***','secret_access_key'='***','bucket'='mo-load-guangzhou-1308875761', 'filepath'='mo-big-data/100000000_20_columns_load_data_new.csv'} into table big_data_test.table_basic_for_load_100M fields terminated by '|' lines terminated by '\n' parallel 'true';
3. execute sql:
select sum(tan(col9)) from big_data_test.table_basic_for_load_100M;
select sum(sin(col9)) from big_data_test.table_basic_for_load_100M;
select sum(cos(col9)) from big_data_test.table_basic_for_load_100M;

Additional information

No response

Ariznawlll avatar Feb 26 '24 03:02 Ariznawlll

The result of sum(log(column)) is also incorrect. Doris result:

select sum(log(col3)) from big_data_test.table_basic_for_load_1B where col3 > 0
"sum(log(cast(2.71828 as DOUBLE), cast(col3 as DOUBLE)))"
10243454620.778414
 select sum(log(col9)) from big_data_test.table_basic_for_load_1B where col9 > 0
"sum(log(cast(2.71828 as DOUBLE), cast(col9 as DOUBLE)))"
8710096011.216784
 select sum(log(col10)) from big_data_test.table_basic_for_load_1B where col10 > 0
"sum(log(cast(2.71828 as DOUBLE), col10))"
8710517813.01789
 select sum(log(col16)) from big_data_test.table_basic_for_load_1B where col16 > 0
"sum(log(cast(2.71828 as DOUBLE), cast(col16 as DOUBLE)))"
8709651145.299595

mo result: image

Data volume of 1 billion rows:
create table if not exists big_data_test.table_basic_for_load_1B(
col1 tinyint,
col2 smallint,
col3 int,
col4 bigint,
col5 tinyint unsigned,
col6 smallint unsigned,
col7 int unsigned,
col8 bigint unsigned,
col9 float,
col10 double,
col11 varchar(255),
col12 Date,
col13 DateTime,
col14 timestamp,
col15 bool,
col16 decimal(16,6),
col17 text,
col18 json,
col19 blob,
col20 binary(255),
col21 varbinary(255),
col22 vecf32(3),
col23 vecf32(3),
col24 vecf64(3),
col25 vecf64(3)
);

load data url s3option {'endpoint'='http://cos.ap-guangzhou.myqcloud.com/','access_key_id'='','secret_access_key'='','bucket'='mo-load-guangzhou-1308875761', 'filepath'='mo-big-data/1000000000_20_columns_load_data_new.csv'} into table big_data_test.table_basic_for_load_1B fields terminated by '|' lines terminated by '\n' parallel 'true';

Ariznawlll avatar Feb 26 '24 09:02 Ariznawlll

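https://github.com/matrixorigin/matrixone/issues/14675#issuecomment-1965998151 Golang's rounding of float values leaves a difference of up to 5.0 between the stored value and the original input, so the result of a trigonometric function can land anywhere within its range.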
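
The effect described above can be sketched in Go. This is only an illustration under assumptions: the input value 123456789 is hypothetical and not taken from the test data, and roundTripError is a helper made up for this example, not a MatrixOne function. It shows that storing a large value as float32 shifts it by several units, which is more than enough to move sin to an unrelated point on its curve.

```go
package main

import (
	"fmt"
	"math"
)

// roundTripError reports how far x moves when it is stored as a float32
// and read back as a float64. Hypothetical helper for illustration only.
func roundTripError(x float64) float64 {
	return math.Abs(float64(float32(x)) - x)
}

func main() {
	// Hypothetical input; not a value from the issue's data set.
	x := 123456789.0

	// float32 has a 24-bit significand, so between 2^26 and 2^27
	// representable values are 8 apart and x cannot be stored exactly.
	stored := float64(float32(x))

	fmt.Println("original value:  ", x)
	fmt.Println("stored as float: ", stored)
	fmt.Println("absolute shift:  ", roundTripError(x))

	// A shift of a few units is on the order of one period of sin (2*pi),
	// so the two results below are effectively unrelated.
	fmt.Println("sin(original):   ", math.Sin(x))
	fmt.Println("sin(stored):     ", math.Sin(stored))
}
```

Small values such as 1.5 survive the round trip unchanged, which is why the discrepancy only shows up for large magnitudes in a float column.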

zengyan1 avatar Feb 29 '24 10:02 zengyan1

This issue is caused by insufficient float32 precision and can be closed.

zengyan1 avatar Mar 05 '24 10:03 zengyan1