Shannon Phu
Did anyone figure out how to use other architectures?
@symphonylyh (1) and/or (3). I am not entirely clear on the difference between the Python and C++ backends. I was using this to build the engine: https://github.com/NVIDIA/TensorRT-LLM/blob/main/examples/enc_dec/README.md
@mlmonk Oh interesting, I was under the impression that we couldn't serve T5 models on Triton yet because the TRT-LLM backend wasn't ready for it.
This helped fix the build for me.
I'm running into this issue too. What are the build and make commands you are running?
@champson were you able to get FasterTransformer working on an L4 GPU?