Steven Chen comments

Results: 5 comments of Steven Chen

MPS device appears much slower than CPU on M1 Mac Pro

Same here for the LSTM examples. I used the PyTorch profiler to compare performance on the CPU and MPS devices.

MPS device appears much slower than CPU on M1 Mac Pro

> The spike in microsecond-level overhead (CPU time avg) was discussed [here](https://github.com/pytorch/pytorch/issues/82707#issuecomment-1204672455). I think I’ve found a solution to it, but haven’t put it into practice with an RNN. Any...

[💡SUG] Will multi-GPU be supported in the next version?

Any updates on this?

[BUG] Invalidate trace cache @ step 1: expected module 25, but got module 323, how to resolve it?

Same issue. My DeepSpeed config is:

```
{
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "loss_scale_window": 1000,
    "initial_scale_power": 16,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "scheduler": {
    "type": "WarmupLR",
    "params": {
      "warmup_min_lr":...
```

[BUG] Error `raise RuntimeError(f"still have inflight params "` when doing IDEFICS inference

I ran into the same problem; my DeepSpeed version is 0.14.2.