Fei Hu
@MLnick The input sizes for the three models are: 1) `?*24*12` for the univariate model; 2) `?*24*12` for the multivariate model; 3) `?*48*12` for the multistep model, where `12` is the number of input...
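For illustration, the three input shapes above can be mocked up with NumPy as follows (the leading `?` is the batch dimension; the batch size of 4 and the variable names are placeholders, not from the thread):

```python
import numpy as np

batch = 4  # the leading "?" dimension: any batch size

# univariate model: ?*24*12 (24 timesteps, 12 input features)
univariate_in = np.zeros((batch, 24, 12), dtype=np.float32)

# multivariate model: same ?*24*12 shape
multivariate_in = np.zeros((batch, 24, 12), dtype=np.float32)

# multistep model: ?*48*12 (48 timesteps, 12 input features)
multistep_in = np.zeros((batch, 48, 12), dtype=np.float32)

print(univariate_in.shape, multivariate_in.shape, multistep_in.shape)
```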
> Can you try the docker image recommended in the document?

Thanks @byshiue for the quick reply! Do you mean this Docker image, `nvcr.io/nvidia/pytorch:22.09-py3`?
Just tried `nvcr.io/nvidia/pytorch:22.09-py3`, but still got the same error message.
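For reference, a minimal sketch of how such an NGC image can be pulled and launched with GPU access (the `docker run` flags are the usual ones for NVIDIA containers, not taken from this thread; the volume mount path is an assumption):

```shell
# pull the NGC PyTorch image mentioned above
docker pull nvcr.io/nvidia/pytorch:22.09-py3

# start an interactive container with all GPUs visible,
# mounting the current directory so build artifacts persist
docker run --gpus all -it --rm \
    -v "$PWD":/workspace/host \
    nvcr.io/nvidia/pytorch:22.09-py3
```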
I also ran the commands below to tune gemm, but fp8 is several times slower than fp16 in 8 of 11 cases (see the last column (`speedup`) in the...
Got the same error on H100 and H100-MIG, as below:
```
[FT][ERROR] CUDA runtime error: an illegal memory access was encountered
FasterTransformer/src/fastertransformer/models/gpt_fp8/GptFP8ContextDecoder.cc:243
```
> Can you post your scripts and full log?

Hi @byshiue, I created new Docker containers to test it again. For `nvcr.io/nvidia/pytorch:22.09-py3`, I can confirm it works well now (not quite...
> For performance between fp16 and fp8, fp8 only brings speedup when the batch size is large enough.

But the batch size in the example is only 1. I made...
@ananthdurbha I hit the same issue as you. I'm wondering whether you have resolved it?
Solved this issue by using clang++ and building PyTorch from source. There may be other, better solutions.
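A minimal sketch of what "use clang++ and build PyTorch from source" can look like (the clone location and exact steps are assumptions; consult the official PyTorch build instructions for your platform):

```shell
# assumed: a working clang toolchain is installed
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch

# point the CMake-based build at clang/clang++ instead of gcc/g++
export CC=clang
export CXX=clang++

# install build dependencies, then build and install into the
# current Python environment (this takes a long time)
pip install -r requirements.txt
python setup.py install
```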