Add syntax to be able to reduce over multiple windows
I wonder if it's possible to have some additional syntax to make working with many windows possible?
For example, with ClickHouse SQL it's possible to work across many windows like this:
WITH sales AS (
-- Your dataset source here
)
SELECT
date,
salesperson_id,
region,
amount,
product_id,
ROW_NUMBER() OVER w_region_amount AS row_number_region,
RANK() OVER w_salesperson_amount AS rank_salesperson,
DENSE_RANK() OVER w_product_amount AS dense_rank_product,
SUM(amount) OVER w_region AS sum_sales_region,
AVG(amount) OVER w_region AS avg_sales_region,
MAX(amount) OVER w_salesperson AS max_sales_salesperson,
MIN(amount) OVER w_product AS min_sales_product,
LEAD(amount, 1) OVER w_salesperson_date AS lead_amount,
LAG(amount, 1) OVER w_salesperson_date AS lag_amount,
NTILE(10) OVER w_global_amount AS decile_rank_by_amount,
FIRST_VALUE(salesperson_id) OVER w_region_amount AS top_salesperson_region,
LAST_VALUE(salesperson_id) OVER w_region_amount_rows AS last_salesperson_region,
COUNT(*) OVER w_region AS count_sales_region,
PERCENT_RANK() OVER w_region_amount AS percent_rank_region,
CUME_DIST() OVER w_region_amount AS cume_dist_region
FROM sales
WINDOW
w_region AS (PARTITION BY region),
w_salesperson AS (PARTITION BY salesperson_id),
w_product AS (PARTITION BY product_id),
w_region_amount AS (PARTITION BY region ORDER BY amount DESC),
w_salesperson_amount AS (PARTITION BY salesperson_id ORDER BY amount DESC),
w_product_amount AS (PARTITION BY product_id ORDER BY amount DESC),
w_salesperson_date AS (PARTITION BY salesperson_id ORDER BY date),
w_global_amount AS (ORDER BY amount DESC),
w_region_amount_rows AS (PARTITION BY region ORDER BY amount DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY region, amount DESC;
With Pathway, I found myself having to define each window as a separate table, and then joining them back together. The resulting code was quite verbose.
Unless I am missing something and it's possible to do it succinctly with Pathway?
Defining each window as a separate table (with groupby) is probably a good idea. As for joining it all back, it is convenient to use ix_ref - here is an example in the SQL-window spirit to compare the salary of an employee to the average salary in their position & department: https://pathway.com/developers/user-guide/data-transformation/indexing-grouped-tables/#multi-values-indexing.
Calling in @izulin - maybe you know of something cleaner. Either way, it would help to explain that this is the recommended way to implement SQL-windows (not to be confused with streaming windows).
Thanks a lot @dxtrous, will give it a try!
With Pathway, I found myself having to define each window as a separate table, and then joining them back together. The resulting code was quite verbose.
Unless I am missing something and it's possible to do it succinctly with Pathway?
Hi Ilya,
could you post an example of a verbose/unwieldy pathway code here? This could be a great starting point on how to extend syntax and/or tutorials.