
10 issues opened by Ryan

Hi Dusty, thanks for the great repo and tutorial introducing ASR and NLP on the Jetson device. I followed the instructions and got 2 errors for 2 models...

Hi, first of all, thanks for your amazing work here. I'm wondering whether CrewAI would support any MLLM (multimodal large language model), since that would be a better fit for my use case....

feature-accepted

Hi, I noticed that the results of `torch.allclose(xbow, xbow2), torch.allclose(xbow, xbow3)` are all `False` when running the Colab example `gpt-dev.ipynb` in the **_The mathematical trick in self-attention_** section. Here is what...
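For reference, here is a minimal sketch of the three `xbow` computations that notebook section builds up, assuming shapes like the notebook's `(B, T, C)`; the tolerance arguments at the end are an illustration of how `torch.allclose` with its default tolerances can report `False` for results that differ only by float32 rounding, not a claim about the original notebook's code.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(42)
B, T, C = 4, 8, 2          # batch, time, channels
x = torch.randn(B, T, C)

# Version 1: explicit loop, averaging each token with all previous tokens
xbow = torch.zeros((B, T, C))
for b in range(B):
    for t in range(T):
        xbow[b, t] = x[b, :t + 1].mean(0)

# Version 2: the same averaging as a matmul with a row-normalized
# lower-triangular matrix
wei = torch.tril(torch.ones(T, T))
wei = wei / wei.sum(1, keepdim=True)
xbow2 = wei @ x            # (T, T) @ (B, T, C) -> (B, T, C)

# Version 3: softmax over a causally masked score matrix
tril = torch.tril(torch.ones(T, T))
wei = torch.zeros((T, T)).masked_fill(tril == 0, float('-inf'))
wei = F.softmax(wei, dim=-1)
xbow3 = wei @ x

# Default tolerances (rtol=1e-5, atol=1e-8) can return False purely because
# the matmul accumulates float32 values in a different order than the loop;
# a slightly looser atol shows the results agree numerically.
print(torch.allclose(xbow, xbow2), torch.allclose(xbow, xbow3))
print(torch.allclose(xbow, xbow2, atol=1e-6), torch.allclose(xbow, xbow3, atol=1e-6))
```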

Hi, after successfully visualizing the Lego example with a great-looking mesh, I decided to try an outdoor scene (SCENE_TYPE = outdoor) with more images. When running the mesh extraction command, I encountered...

Hi, I'm trying to deploy Triton Inference Server with tensorrtllm_backend to K8s with a Helm chart, following the docs [HERE](https://github.com/triton-inference-server/server/blob/main/deploy/k8s-onprem/README.md) and [HERE](https://github.com/triton-inference-server/tutorials/blob/main/Popular_Models_Guide/Llama2/trtllm_guide.md). I noticed the changes in recent releases as...

**Description** I'm trying to use Triton with K8s on-prem by following [this repo](https://github.com/triton-inference-server/server/blob/main/deploy/k8s-onprem/README.md). Here is my setup:
- 3 EC2 instances on AWS, 1 master and 2 worker nodes
- CNI:...
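For the two Triton-on-K8s issues above, a rough sketch of how one might confirm a deployed endpoint is actually serving, using the `tritonclient` HTTP client; the service address is a hypothetical placeholder for whatever the K8s Service or Ingress exposes in a given cluster, not a value from the issues.

```python
# Minimal readiness check against a Triton endpoint exposed by the Helm chart.
# TRITON_URL is a placeholder for the real Service/Ingress address.
import tritonclient.http as httpclient

TRITON_URL = "triton.example.internal:8000"   # hypothetical service address

client = httpclient.InferenceServerClient(url=TRITON_URL)

print("server live: ", client.is_server_live())
print("server ready:", client.is_server_ready())

# List models loaded from the model repository, useful to confirm the
# tensorrtllm_backend engines were picked up.
for model in client.get_model_repository_index():
    print(model.get("name"), model.get("state"))
```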

### System Info
- GPU: NVIDIA T4 * 4
- Driver Version: 550.54.15
- CUDA: 12.4
- Image: nvcr.io/nvidia/tritonserver:24.07-trtllm-python-py3
- TensorRT-LLM version: 0.11.0

### Who can help?
No response

### Information
- [X] The official...

triaged

### System Info
- CPU Architecture: x86_64
- GPU: NVIDIA T4 * 4 (AWS g4dn.12xLarge)
- TensorRT-LLM v0.10.0

### Who can help?
[QiJune](https://github.com/QiJune) @byshiue

### Information
- [x] The official...

not a bug
waiting for feedback
functionality issue

Hi, I self-hosted a NIM of Llama3 8B and would like to use this config to test out guardrails:
```
models:
  - type: main
    engine: nim
    model: meta/llama3-8b-instruct
    parameters:...
```

enhancement
dependencies
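For the NIM-backed guardrails config above, a rough sketch of how such a config might be loaded and exercised with `nemoguardrails`; the `base_url` value and the parameter block are assumptions for a self-hosted NIM endpoint, not values from the original issue.

```python
# Sketch: load a NIM-backed guardrails config and send one test message.
# The parameters block and endpoint URL are hypothetical.
from nemoguardrails import LLMRails, RailsConfig

YAML_CONFIG = """
models:
  - type: main
    engine: nim
    model: meta/llama3-8b-instruct
    parameters:
      base_url: http://localhost:8000/v1   # hypothetical self-hosted NIM endpoint
"""

config = RailsConfig.from_content(yaml_content=YAML_CONFIG)
rails = LLMRails(config)

response = rails.generate(messages=[{"role": "user", "content": "Hello!"}])
print(response["content"])
```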

**How to reproduce the issue?**
1. Set up the AlignScore server and use the large model
2. Reference configuration: https://github.com/NVIDIA/NeMo-Guardrails/blob/develop/examples/configs/rag/fact_checking

config: ![Image](https://github.com/user-attachments/assets/afe88f73-6d79-4611-b52e-e3d2b552e0fd)
rails: factcheck.co and general.co

**Who could help?** @drazvan

**Description** Hi, I...

bug