l.zonghai
> @baitian77 @microeastcowboy we will open it next month

Hi @bdyx123, will the MR version be released this month?
> 1. CSS push data should use two replicas, so we should start at least two workers
> 2. Does the dir /home/aa/css/logs exist? Or are there dir permission issues?
> 3. the...
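As a quick sanity check for the directory question above, a small shell helper can report whether a path exists and is writable before the worker is started. This is a sketch; the `check_dir` name and messages are illustrative, not part of CSS/RSS, and `/home/aa/css/logs` is simply the path mentioned in the thread:

```shell
# check_dir: report whether a directory exists and is writable.
# Helper name and messages are illustrative only.
check_dir() {
  if [ ! -d "$1" ]; then
    echo "missing: $1"
  elif [ ! -w "$1" ]; then
    echo "not writable: $1"
  else
    echo "ok: $1"
  fi
}

check_dir /home/aa/css/logs
```

Running this as the same user that launches the worker also catches permission mismatches, which plain `ls` as root would hide.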
> Would you try `spark.shuffle.rss.replicas=2`?

Thanks @hiboyang, it works! It seems `replicas=1` actually means just the original data itself with no extra replication; `replicas=2` is OK.
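For reference, the replicas setting discussed above is a job-side Spark conf, not a server option. A minimal sketch of wiring it into a submission (the shuffle-manager class name, application class, and jar path here are assumptions to be checked against the RSS README):

```shell
# Sketch: run a Spark job against RSS with two shuffle replicas.
# Class names and jar paths below are placeholders, not verified values.
spark-submit \
  --conf spark.shuffle.manager=org.apache.spark.shuffle.RssShuffleManager \
  --conf spark.shuffle.rss.replicas=2 \
  --class com.example.MyApp \
  my-app.jar
```

With `replicas=2`, each map task pushes its shuffle data to two StreamServers, so losing a single server still leaves one complete copy for reducers to read.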
> Hi @hiboyang, I set `replicas=2` but another exception is thrown: when mapper-A sends data to StreamServer5 and a replica to StreamServer3
> * if I kill StreamServer5, mapper-A will use...
Hi @hiboyang, I posted two apps with different exceptions, but both failed on StreamServer5.

**application-1** failed on job10-Stage14; 26 tasks (including retry attempts) failed for the same reason, the...
> Thanks @Lobo2008 for the debugging info! I checked the source code again. The [code](https://github.com/uber/RemoteShuffleService/blob/7220c23694e0175e01719621707680a2718173cf/src/main/java/com/uber/rss/clients/ReplicatedWriteClient.java#L145) in RSS is supposed to try another server if it hits an error with one server, including...
> Yes, if that replicas setting does not work for you.
>
> Another option: you could use the `spark.shuffle.rss.excludeHosts` setting to exclude the server with the bad disk.

Thanks, but we may...
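For completeness, the exclude-hosts option mentioned above is also a job-side conf. A sketch, assuming the failing server resolves to the hypothetical hostname `streamserver5.example.com` (the shuffle-manager class, application class, and jar path are likewise placeholders):

```shell
# Sketch: keep the job off a StreamServer with a bad disk.
# Hostname and class/jar names are hypothetical placeholders.
spark-submit \
  --conf spark.shuffle.manager=org.apache.spark.shuffle.RssShuffleManager \
  --conf spark.shuffle.rss.excludeHosts=streamserver5.example.com \
  --class com.example.MyApp \
  my-app.jar
```

The trade-off raised in the thread applies: excluding hosts is a manual, per-job workaround, whereas replication handles server loss automatically.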
> Hi @hiboyang, how about the `1885GB` of `stage-889`? I suppose when `stage-896` is still running, or has some task failed, or the whole stage failed and need...
> Hi @Lobo2008, you are right. It could track the stage dependency and clean up stage shuffle files selectively. Need someone to work on this :)

Thanks for the reply!...
> RSS cannot use multiple disks so far, since it can only be configured with one directory. Again, this part could be changed as well; contributions welcome.
>
> ...
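To make the single-directory limitation concrete, a hedged sketch of starting a StreamServer with its one data directory. The `-rootDir` flag name and the other options here are assumptions from memory of the server CLI and should be verified against the RSS README or the server's help output:

```shell
# Sketch: StreamServer startup with a single data directory.
# All flag names below are unverified assumptions; note there is only
# ONE directory, hence shuffle data lands on one disk.
java -cp uber-rss.jar com.uber.rss.StreamServer \
  -port 12222 \
  -serviceRegistry standalone \
  -dataCenter dc1 \
  -rootDir /data1/rss
```

Multi-disk support would mean accepting a list of directories here and spreading shuffle partitions across them, which is the contribution the maintainer invites above.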