[Feature] Support some array function
Search before asking
- [X] I had searched in the issues and found no similar issues.
Description
Now Doris has supported the window_funnel function. The window_funnel function can calculate the depth of the ordered window funnel. If only the depth of the funnel is used, it is still difficult to calculate the conversion rate of each layer of funnel. If the following three functions can be supported, the calculation of the conversion rate of each layer of funnel It will be easy to support, and the calculation of retention analysis can also be implemented.
array_with_constant(length, param)
array_enumerate(arr)
array_join(arr)
Use case
array_with_constant(length, param)
description
Syntax
ARRAY<T> array_with_constant(BIGINT length, T param)
Returns an array of the specified length. length: The length of the array. param: filled content.
example
mysql> select k1, k2, array_with_constant(k1, k2) from array_with_constant_demo_table;
+-----------------+-----------------+-----------------------------+
| k1 | k2 | array_with_constant(k1, k2) |
+-----------------+-----------------+-----------------------------+
| 4 | 1 | [1,1,1,1] |
+-----------------+-----------------+-----------------------------+
array_enumerate(arr)
description
Syntax
ARRAY<T> array_enumerate(ARRAY<T> arr)
Returns an array of subscript values of the target array.
example
mysql> select k1, array_enumerate(k1) from array_enumerate_demo_table;
+-----------------+---------------------+
| k1 | array_enumerate(k1) |
+-----------------+---------------------+
| [1, 1, 1, 1] | [1,2,3,4] |
+-----------------+---------------------+
array_join(arr)
description
Syntax
T array_join(ARRAY<T> arr)
Returns the expanded result of an array into multiple lines.
example
mysql> select k1, k2, array_join(k2) from array_join_demo_table;
+-----------------+---------------------+----------------+
| k1 | k2 | array_join(k2) |
+-----------------+---------------------+----------------+
| a1 | [1,2,3,4] | 1 |
| a1 | [1,2,3,4] | 2 |
| a1 | [1,2,3,4] | 3 |
| a1 | [1,2,3,4] | 4 |
+-----------------+---------------------+----------------+
Related issues
No response
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Hi, did you use these array functions before? If yes, what database did you use these functions on before? Hope to be provided more use cases, thanks.
Hi, did you use these array functions before? If yes, what database did you use these functions on before? Hope to be provided more use cases, thanks.
I use the above three functions on ClickHouse.
Below is an example of how we used ClickHouse to calculate conversion rates at each level of an ordered window funnel.
1. The funnel process
login_in -> $pageview -> $end -> $startup -> $end -> $startup
2. Time range 2021-07-01 ~ 2021-07-09
3. Conversion cycle 1 day
CREATE TABLE IF NOT EXISTS test.`event`(
`distinct_id` Nullable(Int64),
`xwhen` Nullable(Int64),
`xwhat` Nullable(String)
) engine = TinyLog;
SELECT level_index,count(1) FROM
(
SELECT distinct_id,
arrayWithConstant(level, 1) levels,
arrayEnumerate( levels ) b, -- for_test_col
arrayJoin(arrayEnumerate( levels )) level_index
FROM (
SELECT
distinct_id,
windowFunnel(86400)(
time,
xwhat = 'login_in',
xwhat = '$pageview',
xwhat = '$end',
xwhat = '$startup',
xwhat = '$end',
xwhat = '$startup'
) AS level
FROM (
SELECT
distinct_id
, xwhat
, toDateTime(toUInt64(round(xwhen/1000))) as time
FROM test.event
WHERE (toDate(time) >= '2021-07-01' AND toDate(time) <= '2021-07-09' AND xwhat = 'login_in')
OR (toDate(time) >= '2021-07-01' AND toDate(time) <= '2021-07-10' AND xwhat IN ('$pageview','$end','$startup'))
)
GROUP BY distinct_id
)
)
GROUP BY level_index
ORDER BY level_index;
@xy720 Hi, what do you think of the above features? Can they be implemented in future versions? Looking forward to your reply, thanks.
@emerkfu
Sorry, we haven't put these functions on our first stage agenda.
Because these functions have some conflicts with the functions we are developing now.
For example: The array_join function is different as our array_join, but more like our explode function.
(See help explode in mysql client)
We will consider support array_with_constant and array_enumerate in the future version.
all this function have support
- array_with_constant function : https://github.com/apache/doris/pull/14115
- array_enumerate function : https://github.com/apache/doris/pull/13612
- array_join function :https://github.com/apache/doris/pull/8766
mysql> select * from arrays_test; +---------+-----------+ | s | arr | +---------+-----------+ | Goodbye | [] | | Hello | [1, 2] | | World | [3, 4, 5] | +---------+-----------+ 3 rows in set (0.01 sec)
select s,explode_array from arrays_test lateral view explode_outer(arr) temp as explode_array; +---------+---------------+ | s | explode_array | +---------+---------------+ | Goodbye | NULL | | Hello | 1 | | Hello | 2 | | World | 3 | | World | 4 | | World | 5 | +---------+---------------+ 6 rows in set (0.01 sec)