Haisong Ding

Results: 14 comments by Haisong Ding

It would be great to include the **context length** info in the VRAM reports in the README for reference.

Any plans to release the dataset?

I also ran into the same problem on 3090 GPUs; it has been bugging me for days.

I tried setting the backbone to use FP16 and the encoder-decoder part to use FP32. The results roughly match those of the FP32 engine. But it is not as fast as the...

> How can you set different precisions for different parts when creating the engine? Can you show me a code example? @HaisongDing

For example, in the [detectron2 example](https://github.com/NVIDIA/TensorRT/blob/5f422623e7f5bdc593b781695cbddda99124c9b8/samples/python/detectron2/build_engine.py#L169), adding something like...
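For reference, here is a minimal sketch of that per-layer precision approach with the TensorRT Python API. The layer-name prefixes (`/encoder`, `/decoder`) are hypothetical; inspect your own network's layer names to decide which layers to pin to FP32.

```python
import tensorrt as trt

def constrain_precision(network: trt.INetworkDefinition,
                        config: trt.IBuilderConfig) -> None:
    """Enable FP16 globally, but pin selected layers to FP32."""
    config.set_flag(trt.BuilderFlag.FP16)
    # Make TensorRT honor the per-layer precisions set below.
    config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        # Hypothetical prefixes: pin encoder/decoder layers to FP32
        # while the backbone stays in FP16.
        if layer.name.startswith(("/encoder", "/decoder")):
            layer.precision = trt.float32
            for j in range(layer.num_outputs):
                layer.set_output_type(j, trt.float32)
```

Call this on the network and builder config before `builder.build_serialized_network(...)`; `OBEY_PRECISION_CONSTRAINTS` forces the constraints even when FP16 would be faster.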

@monsterlyg Update torch to >=1.13.1 to use opset 17 when exporting to ONNX, and update TensorRT to 8.6.1 to use INormalization layers.
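For anyone hitting this: opset 17 is the first ONNX opset with a native `LayerNormalization` op, which TensorRT 8.6 imports as an `INormalization` layer instead of a subgraph of primitives. A minimal export sketch, where the toy module below is just a placeholder containing a LayerNorm:

```python
import torch
import torch.nn as nn

# Toy module standing in for the real model; it only needs a LayerNorm
# so the opset-17 export path is exercised.
class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.norm = nn.LayerNorm(256)

    def forward(self, x):
        return self.norm(x)

model = Toy().eval()
dummy = torch.randn(1, 100, 256)  # placeholder input shape

# opset_version=17 is required for LayerNorm to export as the native
# ONNX LayerNormalization op.
torch.onnx.export(model, dummy, "toy.onnx", opset_version=17)
```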

> I tried setting the backbone to use FP16 and the encoder-decoder part to use FP32. The results roughly match those of the FP32 engine. But it is not as fast as...

I only converted a customized Grounding-DINO model, and the BERT part is pre-computed in my setup, so only the backbone and encoder-decoder are converted to TensorRT. On my customized dataset,...

> @Broyojo This is a great question. There is no required "chat format" in the same sense as LLaMA2, where you needed to format your prompt with instruct...

Can anyone post the throughput of TensorRT LLaMA v3 models on popular GPUs? Many thanks.