learning-spark
learning-spark copied to clipboard
Basic ideas to solve Spark OOM: Count all the high frequence words in a big table
The detail question is:
I want to count all the high frequence words in a big table.
I split each sentence of each row, then flatmap to one word per row, then groupby, then count the word number in each group.
It will OOM.