Peter

Results: 4 comments by Peter

We encounter the same issue with an engine built with tensorrt_llm v0.9.0 and tp_size=4:

```
Traceback (most recent call last):
  File "/app/tensorrt_llm/examples/run.py", line 564, in <module>
    main(args)
  File "/app/tensorrt_llm/examples/run.py", line 413, in main...
```

> we encounter the same issue: build engine with tensorrt_llm v0.9.0 and tp_size=4
>
> ```
> Traceback (most recent call last):
>   File "/app/tensorrt_llm/examples/run.py", line 564, in <module>
>     main(args)...
> ```

We hit the same issue on a V100 with v0.7.1 (no problem on an A100): [2024-02-04 18:22:28] [02/04/2024-10:22:28] [TRT] [E] 4: Internal error: plugin node LLaMAForCausalLM/layers/0/attention/PLUGIN_V2_GPTAttention_0 requires 224716128768 bytes of scratch space,...
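For scale, here is a quick sketch that converts the byte count from that log line into GiB and compares it with the memory of a single V100. Only the 224716128768 figure comes from the log; the helper name `bytes_to_gib` and the capacity tuple are illustrative assumptions, and the 16 GB / 32 GB V100 sizes are treated as GiB for a rough comparison.

```python
# Byte count copied from the TensorRT log line above; the rest is illustrative.
SCRATCH_BYTES = 224716128768


def bytes_to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (1 GiB = 2**30 bytes)."""
    return n_bytes / 2**30


# V100 boards ship with 16 GB or 32 GB of memory (assumed ~GiB here).
V100_MEMORY_GIB = (16, 32)

scratch_gib = bytes_to_gib(SCRATCH_BYTES)
print(f"requested scratch space: {scratch_gib:.2f} GiB")

for capacity in V100_MEMORY_GIB:
    # Ratio of the requested scratch space to the card's total memory.
    ratio = scratch_gib / capacity
    print(f"  vs a {capacity} GiB V100: {ratio:.1f}x the total memory")
```

The request works out to roughly 209 GiB, several times what any single V100 offers, so the size of the request itself looks suspect rather than the card simply being a little short on memory.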

We hit the same issue with mistral7b: TP = 4, GPU = 4 * A10, vllm = 0.2.7.