Laikh Tewari

Results: 9 comments by Laikh Tewari

@peri044 can you help fill in the instructions for the export flow?

@gs-olive would you ever want to set `construct_live=False` in the compile path? It sounds like this feature reduces device memory pressure between compilation and execution at the cost of added...

Looks like it, I have tensorrt==8.6.1.post1. Shouldn't that be installed automatically as a dependency when I `pip install torch_tensorrt`?

https://developer.nvidia.com/blog/cuda-pro-tip-the-fast-way-to-query-device-properties/ --- This check, which likely caused the perf issue, was added here: https://github.com/pytorch/TensorRT/blob/bf4474dc7816c184489d3985ce892315f5e0cc42/core/runtime/runtime.cpp#L81 The check invokes the constructor of the TensorRT wrapper object RTDevice::RTDevice (https://github.com/pytorch/TensorRT/blob/bf4474dc7816c184489d3985ce892315f5e0cc42/core/runtime/RTDevice.cpp#L16), and that constructor calls cudaGetDeviceProperties, which...
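For context, the linked blog post's point is that `cudaGetDeviceProperties` fills the entire `cudaDeviceProp` struct (which can be surprisingly slow per call), while `cudaDeviceGetAttribute` queries only the attributes you actually need. A minimal standalone sketch of the contrast (not the TensorRT runtime code itself):

```cpp
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int device = 0;

    // Slow path: fills the whole cudaDeviceProp struct, even fields you never read.
    cudaDeviceProp props;
    cudaGetDeviceProperties(&props, device);
    std::printf("SM %d.%d via cudaGetDeviceProperties\n", props.major, props.minor);

    // Fast path: query just the individual attributes that are needed.
    int major = 0, minor = 0;
    cudaDeviceGetAttribute(&major, cudaDevAttrComputeCapabilityMajor, device);
    cudaDeviceGetAttribute(&minor, cudaDevAttrComputeCapabilityMinor, device);
    std::printf("SM %d.%d via cudaDeviceGetAttribute\n", major, minor);
    return 0;
}
```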

Hi @matichon-vultureprime, we're discussing the best way to manage community highlights -- thanks for the PR and your patience!

Hi @stas00, thank you for raising this issue! TensorRT-LLM doesn't support Llama 3.2 (yet -- coming soon!), though I suspect from the code snippet shared, the question is about Llama...

Oops, copied the wrong username. Thanks @jinxiangshi!

Where is usage documented? I don't see any docs in the changed files list

Same issue observed running the dsv3 example `trtllm-bench` cmd on a **single node** H200. Command: `trtllm-bench --model deepseek-ai/DeepSeek-V3 --model_path /workspace/dsv0324/ throughput --backend pytorch --max_batch_size 2 --max_num_tokens 1160 --dataset /workspace/dataset.txt --tp 8 --ep...`