Yan Li issues

Results: 3 issues by Yan Li

Build error with gcc 9.3.0 and nvcc 11.6.55

CUDA compute architecture: sm_75
gcc version: 9.3.0
nvcc version: 11.6.55

```
/root/repos/RAJA-PERFSUITE/RAJAPerf/src/apps/HALOEXCHANGE_FUSED-OMP.cpp: In member function 'virtual void rajaperf::apps::HALOEXCHANGE_FUSED::runOpenMPVariant(rajaperf::VariantID, size_t)':
/root/repos/RAJA-PERFSUITE/RAJAPerf/src/apps/HALOEXCHANGE_FUSED-OMP.cpp:132:326: error: invalid application of 'sizeof' to incomplete type 'pack_lambda_type' {aka...
```

build

Support for expert parallelism in MoE inference

#743 also mentions this issue. Is there a tutorial on how to use expert parallelism in MoE inference?

What's the optimal parallel strategy using TensorRT-LLM?

First, thanks for your great efforts. I read the PR you opened in the TensorRT-LLM repo and noticed that EP + TP, PP + TP, and TP are supported during inference....