Subhalingam D

2 issues by Subhalingam D

Hello, could you provide guidance on implementing model and tensor parallelism in a deployment setting, such as with NVIDIA Triton Inference Server? While it worked when running in script mode,...

Hi, I was running Flan-T5 XXL with CTranslate2 and observed completely different results when running with tensor parallelism. **To convert from HF to CT2:** ```bash ct2-transformers-converter --model google/flan-t5-xxl --output_dir flan-t5-xxl...