datafusion
datafusion copied to clipboard
cannot sum an unnest result
Describe the bug
I cannot sum an unnest result, it reports following error message.
> select unnest(generate_series(1,10));
+---------------------------------------------+
| unnest(generate_series(Int64(1),Int64(10))) |
+---------------------------------------------+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
+---------------------------------------------+
10 row(s) fetched.
Elapsed 0.003 seconds.
> select sum(unnest(generate_series(1,10)));
type_coercion
caused by
Internal error: Unnest should be rewritten to LogicalPlan::Unnest before type coercion.
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
while if use a CTE to save the unnest result, it works
> with t(a) as (select unnest(generate_series(1,10)))select * from t;
+----+
| a |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
+----+
10 row(s) fetched.
Elapsed 0.005 seconds.
> with t(a) as (select unnest(generate_series(1,10)))select sum(a) from t;
+----------+
| SUM(t.a) |
+----------+
| 55 |
+----------+
1 row(s) fetched.
Elapsed 0.014 seconds.
To Reproduce
select sum(unnest(generate_series(1,10)));
Expected behavior
it return a value of 55 as the CTE vesion does
Additional context
No response
@l1t1 I think it is expected result like Duckdb does.
D select sum(unnest([1,2,3]));
Error: Binder Error: UNNEST not supported here
LINE 1: select sum(unnest([1,2,3]));
^
D select unnest([1,2,3]);
┌──────────────────────────────────┐
│ unnest(main.list_value(1, 2, 3)) │
│ int32 │
├──────────────────────────────────┤
│ 1 │
│ 2 │
│ 3 │
└──────────────────────────────────┘
D select sum(a) from (select unnest([1,2,3]) as a);
┌────────┐
│ sum(a) │
│ int128 │
├────────┤
│ 6 │
└────────┘
unless there is other well-known db that support the first case, I think we can keep it as is now.
Another viable replacement query is
DataFusion CLI v37.0.0
❯ select sum(a) from unnest(generate_series(1,10)) as t(a);
+----------+
| SUM(t.a) |
+----------+
| 55 |
+----------+
unless there is other well-known db that support the first case, I think we can keep it as is now.
Maybe we can update the error message with "not supported error" ratehr than internal error