Lei
Results: 2 issues of Lei
The model engine is built from Llama 3 70B with tensor parallelism (tp=2) and pipeline parallelism (pp=2), and deployed with the Triton launch script below:
python3 scripts/launch_triton_server.py --world_size 4 --model_repo=llama_ifb
In this case,...
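For readers following the setup above, a minimal sketch of how the `--world_size 4` value relates to the parallelism settings: with tp=2 and pp=2, one rank is needed per (tensor shard, pipeline stage) pair, so 2 × 2 = 4. The helper names `expected_world_size` and `launch_command` are hypothetical and only illustrate the arithmetic; the script path and flags are the ones quoted in the issue.

```python
def expected_world_size(tp: int, pp: int) -> int:
    """Ranks needed: one per (tensor-parallel shard, pipeline stage) pair."""
    if tp < 1 or pp < 1:
        raise ValueError("tp and pp must be positive integers")
    return tp * pp


def launch_command(tp: int, pp: int, model_repo: str) -> list[str]:
    """Assemble the launch command quoted in the issue (hypothetical helper)."""
    return [
        "python3",
        "scripts/launch_triton_server.py",
        "--world_size",
        str(expected_world_size(tp, pp)),
        f"--model_repo={model_repo}",
    ]


if __name__ == "__main__":
    # Values from the issue: tp=2, pp=2, model repository "llama_ifb".
    print(" ".join(launch_command(2, 2, "llama_ifb")))
```

A mismatch between the world size passed at launch and the tp × pp product used when building the engine is worth ruling out before investigating anything else.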
**Is your feature request related to a problem? Please describe.** As per the quick...
Question
New feature