Nathan Price

Results: 28 comments by Nathan Price

@byshiue I believe that alpha scaling is expected to be performed on the weights before they are uploaded. Digging into the underlying code used by `examples/run.py`, I found that...

I saw similar results with llama3. Mine was resolved when I disabled `use_custom_all_reduce` during engine compilation.

Curious to get any feedback here. This update is also related to a performance issue I am seeing: https://github.com/NVIDIA/TensorRT-LLM/issues/1957. This PR gets results much closer to the expected outputs but...

Any updates? I see a new issue that looks the same as well, but in my case I have now tried with the 24.07 tag and the results are the...

> In the bug description, I did not see which LoRA was used. Could you please tell me? It's better to offer the huggingface link of the base model...

Any insights gained from knowing that `alpha*A != alpha*B` when scaling the weights?

> [@TheCodeWrangler](https://github.com/TheCodeWrangler) any updates on this? I was actually blocked on this for a deployment I needed. I ended up changing base frameworks to `vllm` in order to move forward...

I think to reproduce the issue: take any trained LoRA weights, apply the alpha value to the A matrix, and then retry applying it to the B...
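For context on why this repro is interesting: in exact arithmetic, folding the LoRA scale into the A matrix or into the B matrix should produce the same weight delta, since `B @ (s*A) == (s*B) @ A`. A minimal NumPy sketch (the shapes, rank, and alpha value below are placeholders, not values from the issue) shows the two variants agree to floating-point rounding, so a large mismatch at inference time would point at how the runtime applies the scaling rather than at the math:

```python
import numpy as np

rng = np.random.default_rng(0)
r, d_in, d_out = 8, 64, 64     # hypothetical LoRA rank and layer shapes
alpha = 16.0
scale = alpha / r              # standard LoRA scaling factor

# LoRA update: delta_W = scale * (B @ A)
A = rng.standard_normal((r, d_in)).astype(np.float32)
B = rng.standard_normal((d_out, r)).astype(np.float32)

delta_scale_A = B @ (scale * A)   # alpha folded into A before upload
delta_scale_B = (scale * B) @ A   # alpha folded into B before upload

# These differ only by float32 rounding error.
print(np.max(np.abs(delta_scale_A - delta_scale_B)))
```

If the two uploads nevertheless produce visibly different generations, that discrepancy is the signal the comments above are probing for.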

Have you tried `nvidia-smi topo -p2p r` to inspect whether the drivers for your GPUs are installed and support peer-to-peer access? Also, I have encountered similar issues...