
Flink window and Spark window

Open 0x7aF777 opened this issue 10 years ago • 2 comments

Flink needs a group operation before a window can be applied; Spark does not. What's more, the Spark code should avoid groupByKey: https://databricks.gitbooks.io/databricks-spark-knowledge-base/content/best_practices/prefer_reducebykey_over_groupbykey.html
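
To illustrate why reduceByKey is usually preferred, here is a minimal plain-Python model of the two strategies. It is not Spark code: the partitions, the helper names `group_then_sum` and `reduce_by_key`, and the "shuffled record" counter are all assumptions made for illustration. It only shows that combining inside each partition first sends fewer records across the shuffle.

```python
from collections import defaultdict

def group_then_sum(partitions):
    """groupByKey-style: every (key, value) record crosses the shuffle."""
    shuffled = [rec for part in partitions for rec in part]
    groups = defaultdict(list)
    for key, value in shuffled:
        groups[key].append(value)
    return {k: sum(vs) for k, vs in groups.items()}, len(shuffled)

def reduce_by_key(partitions):
    """reduceByKey-style: combine within each partition before the shuffle."""
    shuffled = []
    for part in partitions:
        local = defaultdict(int)
        for key, value in part:
            local[key] += value
        shuffled.extend(local.items())  # one record per key per partition
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals), len(shuffled)

# Hypothetical input: two partitions of (word, 1) pairs.
partitions = [
    [("a", 1), ("b", 1), ("a", 1)],
    [("a", 1), ("b", 1), ("b", 1)],
]
grouped, grouped_shuffled = group_then_sum(partitions)
reduced, reduced_shuffled = reduce_by_key(partitions)
```

Both strategies produce the same totals, but in this toy input the group-first version moves all six records while the reduce-first version moves four.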

0x7aF777, Nov 03 '15 22:11

Avoid group operations throughout the whole bench system; use reduceByKey and its windowed counterpart (reduceByKeyAndWindow in Spark Streaming) instead.

0x7aF777, Nov 03 '15 23:11

When windowing a non-grouped stream, Spark has a window on each node, whereas Flink has only one global window on a single node.
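
The difference described here can be sketched with a small plain-Python model. This is not the Spark or Flink API: the window size, the event tuples, and the helper names `per_partition_windows` and `global_window` are hypothetical. It only contrasts windowing each partition locally and merging the partial results with funnelling every record into one window owner.

```python
from collections import Counter

WINDOW = 10  # hypothetical tumbling window size, in seconds

def window_counts(events):
    """Count events per tumbling window; events are (timestamp, payload)."""
    return Counter(ts // WINDOW for ts, _ in events)

def per_partition_windows(partitions):
    """Each partition windows its own slice; partial counts are merged after."""
    partials = [window_counts(p) for p in partitions]
    return sum(partials, Counter()), len(partials)

def global_window(partitions):
    """All records go to a single worker, which owns the one global window."""
    merged = [e for p in partitions for e in p]
    return window_counts(merged), 1

# Hypothetical input: two partitions of (timestamp, payload) events.
partitions = [
    [(1, "x"), (12, "y")],
    [(3, "z"), (15, "w"), (27, "v")],
]
distributed, distributed_workers = per_partition_windows(partitions)
single, single_workers = global_window(partitions)
```

The counts per window are identical in both cases; what changes is how many workers do the windowing, which is the parallelism gap the comment points at.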

0x7aF777, Nov 04 '15 17:11