Yuxin An comments

Results: 10 comments by Yuxin An

Performance bottleneck with Constant Propagation Pass.

> @zincnode constant propagation performs computation at compile time for operations whose inputs are constant. So for big inputs, it would take time because constant propagation is...

[RFC] Add support to tag tensors that will be converted to custom quantized numerics

> **Motivation:** Hardware vendors use custom numeric format(s) specific to their own hardware as one of the ways to obtain stand-out performance. When lowering a model from the...

Issue generating heavydep tests

Hi @gpetters94, @silvasean, I also encountered the problem of **"attribute lookup is not defined on builtin"**. Is there any solution or idea? I'm trying to export the forward and...

Encoding or storage mode of torch.tensor.literal

> It is stored as a DenseElementsAttr: [https://mlir.llvm.org/doxygen/classmlir_1_1DenseElementsAttr.html](https://mlir.llvm.org/doxygen/classmlir_1_1DenseElementsAttr.html)
>
> You can see the syntax here: [https://mlir.llvm.org/docs/Dialects/Builtin/#denseintorfpelementsattr](https://mlir.llvm.org/docs/Dialects/Builtin/#denseintorfpelementsattr)

OK, thanks for your reply. I will check the docs.

NCCL INFO Launch mode Parallel hangs during training

Hi, I also encountered this problem. Do you have any solution?

[Question] How to launch jobs with Docker env using multiple nodes in DeepSpeed?

Hi @bing0037, I did something similar before (testing ZeRO in a multi-node Docker environment). Here are some thoughts and suggestions: 0. The server (outside the container, hereinafter referred to as:...

[BUG] Get "exits with return code = -9" when Creating fp16 ZeRO stage 2 optimizer

Same error. I use 4 * 8 A100 (80GB) GPUs to train a 100B GPT-2 model, with ZeRO-3 enabled in the training script. I ran into problem #2185 first. Then I...

[BUG] Get "exits with return code = -9" when Creating fp16 ZeRO stage 2 optimizer

I think it probably stands for CPU OOM, as mentioned in #2788. With the same environment and the same configuration, changing only the model size (100B->50B), there is...

Ubuntu 21.04: to bridge docker0 failed: could not find bridge docker0: route ip+net: no such network interface.

> Not sure if it will help, but I had a similar issue and found that it was related to the package 'netscript-2.4'. I installed it to make dealing with network...

[Feature Request] Add additional debug information for `Triton Error [CUDA]: device kernel image is invalid`

> I faced the same error in a CUDA 11.x environment and could resolve it with `pip install triton==2.1.0`. It's easier than the source build, so I think it's a...