YutingWang98

Results: 9 comments by YutingWang98

@mayurdb Hi mayurdb! We also have this server down/restart issue quite frequently. Do you mind sharing your progress on the stage retry and new server list picking, or how you...

@hiboyang Hi, I found the bug and fixed it in a pull request

Thank you for the suggestions, @hiboyang! Does this mean the shuffle data written to the server will be doubled if I set 'spark.shuffle.rss.replicas' to 2? If so, this will...

Hi, @hiboyang. If 'spark.shuffle.rss.replicas' does write double the amount of data to the server, we unfortunately won't be able to use it for large jobs with 400+ TB of shuffle data. So...

Thanks for the reply! Will see what I can do to improve this.

@hiboyang Hi! I attempted to contribute to adding stage retry, but there seems to be a difficulty due to the implementation of Rss. Wondering if I can have some insights...

Hi @mayurdb, thank you for the reply, and sharing your implementation! I have a question here: If the spark stages are cascading, then one stage may depend on the previous...

> @mayurdb Thank you for sharing it, will take a look!

Hi @mayurdb, we have also been experiencing memory and map stage latency issues using Rss. We plan to test and work on this implementation as well. Wondering if you have...