Aggregate Dim Col Feature

Open pranavbhole opened this issue 6 years ago • 1 comments

The requirement is to query the number of sections a student belongs to, i.e. a count of distinct section_id per student. Currently maha does not support rollup expressions on dimension columns in the fact.

Example query is:

select student_id, count(distinct section_id) as NumberOfSections
from student_perf
group by student_id
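To make the intended result concrete, here is a minimal sketch that runs the same query against an in-memory SQLite table. SQLite only stands in for Hive/Presto here, and the `marks` column and the sample rows are made up for illustration; they are not taken from the maha schema.

```python
import sqlite3


def sections_per_student(rows):
    """Return (student_id, NumberOfSections) pairs for the given fact rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical fact table: only student_id and section_id come from the issue.
    conn.execute(
        "CREATE TABLE student_perf (student_id INTEGER, section_id INTEGER, marks INTEGER)"
    )
    conn.executemany("INSERT INTO student_perf VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT student_id, COUNT(DISTINCT section_id) AS NumberOfSections "
        "FROM student_perf GROUP BY student_id ORDER BY student_id"
    ).fetchall()
    conn.close()
    return result


# Student 1 has three rows but only two distinct sections; student 2 has one.
SAMPLE_ROWS = [(1, 10, 90), (1, 10, 75), (1, 11, 60), (2, 10, 88)]
print(sections_per_student(SAMPLE_ROWS))
```

Note that section_id appears only inside the aggregate and not in the GROUP BY, which is the behavior the feature needs to support.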

The plan is to create a new derived column called DerAaggregatedDimCol, which will carry a rollup expression along with the derived expression. The respective engine's query generator will take care of rendering it. The tricky part is to exclude it from the group-by expressions only if its base column is not used by any other requested columns or dependent derived columns. As this is an experimental feature, I am planning to start with the Hive/Presto query generator.
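The group-by rule can be sketched roughly as follows. This is not maha code: `RequestedCol`, `group_by_cols`, and `render` are hypothetical names, and the sketch assumes the rule means "a base column stays out of GROUP BY unless some non-rollup requested column also reads it". It also simplifies by grouping on base columns rather than on derived expressions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RequestedCol:
    alias: str                  # output name in the SELECT list
    expr: str                   # SQL expression to render
    base_cols: Tuple[str, ...]  # physical columns the expression reads
    is_rollup: bool = False     # True for aggregated columns, e.g. COUNT(DISTINCT x)


def group_by_cols(requested: List[RequestedCol]) -> List[str]:
    """Collect base columns of non-rollup columns, in request order."""
    group_by: List[str] = []
    for col in requested:
        if col.is_rollup:
            continue  # rollup columns never contribute to GROUP BY
        for base in col.base_cols:
            if base not in group_by:
                group_by.append(base)
    return group_by


def render(table: str, requested: List[RequestedCol]) -> str:
    """Render a single-table query with the derived GROUP BY clause."""
    select = ", ".join(f"{c.expr} AS {c.alias}" for c in requested)
    sql = f"SELECT {select} FROM {table}"
    group_by = group_by_cols(requested)
    if group_by:
        sql += " GROUP BY " + ", ".join(group_by)
    return sql


ONLY_ROLLUP = [
    RequestedCol("student_id", "student_id", ("student_id",)),
    RequestedCol("NumberOfSections", "COUNT(DISTINCT section_id)",
                 ("section_id",), is_rollup=True),
]
# Requesting section_id itself as well forces it back into GROUP BY.
WITH_BASE = ONLY_ROLLUP + [RequestedCol("section_id", "section_id", ("section_id",))]

print(render("student_perf", ONLY_ROLLUP))
print(render("student_perf", WITH_BASE))
```

With only the rollup requested, section_id stays out of GROUP BY; once section_id is also requested as a plain column, it is grouped on, and the distinct count per group collapses to 1.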

Let me know if you have any suggestions or questions.

pranavbhole • Apr 18 '19 19:04

Example generated query from the Hive core test:

```
FROM(
SELECT COALESCE(account_id, 0L) advertiser_id, COALESCE(keyword_id, 0L) keyword_id, COALESCE(impressions, 0L) mang_impressions, COALESCE(mang_keyword_count, 0L) mang_keyword_count, COALESCE(mang_keyword_count_scaled, 0L) mang_keyword_count_scaled
FROM(SELECT account_id, keyword_id, (COUNT(distinct keyword_id)) mang_keyword_count, (COUNT(keyword_id * stats_source * 10)) mang_keyword_count_scaled, SUM(impressions) impressions
FROM s_stats_fact
WHERE (account_id = 12345) AND (stats_date >= '2019-04-17' AND stats_date <= '2019-04-24')
GROUP BY account_id, keyword_id

       )
ssf0

)
```

pranavbhole • Apr 24 '19 21:04