FasterTransformer
infer_visiontransformer_op.py error
Branch/Tag/Commit
main
Docker Image Version
nvcr.io/nvidia/pytorch:22.09-py3
GPU name
RTX A5000
CUDA Driver
12
Reproduced Steps
I encountered an error while running the following commands from the docs:
```bash
cd $WORKSPACE/examples/pytorch/vit
pip install ml_collections

# profile of FP16/FP32 model
python infer_visiontransformer_op.py \
  --model_type=ViT-B_16 \
  --img_size=384 \
  --pretrained_dir=./ViT-quantization/ViT-B_16.npz \
  --batch-size=32 \
  --th-path=$WORKSPACE/build/lib/libth_transformer.so
```
While running infer_visiontransformer_op.py, I was able to load torch.classes.VisionTransformerClass. However, a runtime error is raised at FasterTransformer/src/fastertransformer/utils/cublasMMWrapper.cc:306, triggered by the `op_tmp = vit.forward(images)` line in infer_visiontransformer_op.py. Below is the call where the error is thrown:
```cpp
check_cuda_error(cublasGemmEx(cublas_handle_,
                              transa,
                              transb,
                              m,
                              n,
                              k,
                              alpha,
                              A,
                              Atype_,
                              lda,
                              B,
                              Btype_,
                              ldb,
                              beta,
                              C,
                              Ctype_,
                              ldc,
                              computeType_,
                              static_cast<cublasGemmAlgo_t>(cublasAlgo)));
```
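In case it helps with debugging: one thing worth ruling out when cublasGemmEx fails at runtime is an architecture mismatch between the build and the actual GPU, since FasterTransformer's CMake takes the target compute capability via `-DSM=...`. A small guarded sketch (my own helper, not part of the repo) to see what the card reports:

```python
def device_sm():
    """Return the GPU's sm_XY string, or None when torch or a GPU is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability(0)
    return f"sm_{major}{minor}"

print(device_sm())  # an RTX A5000 should report sm_86; None without a GPU
```

If the reported value is not among the SM targets the library was compiled for, rebuilding with the matching `-DSM` value would be the first thing I would try.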
I am not sure what is happening. I followed all of the C++ build steps for FasterTransformer without any problems.
Update: I was able to rerun this successfully after setting up Docker. However, I am still wondering why the error above appears when running without Docker. Is there some configuration on my machine that differs from the Docker environment? Thanks.
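To compare the host environment against the NGC container, here is a small version-report sketch I have been using (the field names are my own; it is guarded so it still runs where `nvidia-smi`, `nvcc`, or torch are missing):

```python
import subprocess

def first_line(cmd):
    """Return the first output line of a command, or None if it is unavailable."""
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, check=True)
        lines = out.stdout.strip().splitlines()
        return lines[0] if lines else None
    except (OSError, subprocess.CalledProcessError):
        return None

def cuda_env_report():
    """Collect the version fields that commonly differ between host and container."""
    report = {
        "driver": first_line(["nvidia-smi", "--query-gpu=driver_version",
                              "--format=csv,noheader"]),
        "nvcc": first_line(["nvcc", "--version"]),
    }
    try:
        import torch
        report["torch"] = torch.__version__
        report["torch_cuda"] = torch.version.cuda  # CUDA runtime torch was built against
    except ImportError:
        report["torch"] = None
    return report

for key, value in cuda_env_report().items():
    print(f"{key:10s}: {value}")
```

Running this inside the container and on the bare host and diffing the output should show whether the driver/toolkit/torch combination is what differs.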
If possible, could you provide an easy-to-run non-Docker version?
@macrocredit Hey, could you please tell me how you convert PyTorch weights to the FasterTransformer format (.npz)?
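For what it's worth: if I recall correctly, the ViT-B_16.npz used in this example is the original checkpoint published with the Google ViT release rather than something converted from PyTorch. If you do need to dump a PyTorch state_dict into an .npz archive, a generic sketch (key names are kept as-is, so this is not necessarily the exact layout the FasterTransformer ViT loader expects):

```python
import numpy as np

def to_numpy(t):
    # Handles both torch tensors and plain numpy arrays, so the sketch
    # runs even where torch is not installed.
    return t.detach().cpu().numpy() if hasattr(t, "detach") else np.asarray(t)

def state_dict_to_npz(state_dict, out_path):
    """Dump a PyTorch-style state_dict to a .npz archive, one array per key."""
    arrays = {name: to_numpy(tensor) for name, tensor in state_dict.items()}
    np.savez(out_path, **arrays)
    return sorted(arrays)

# usage sketch (checkpoint path is illustrative):
# names = state_dict_to_npz(torch.load("vit_checkpoint.pth"), "ViT-B_16.npz")
```

The arrays can then be read back with `np.load("ViT-B_16.npz")`; whether the key names line up with what the loader expects would still need to be checked against the example's loading code.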