Hi, quick issue with mpiexec. Without it, the program runs fine with 1 GPU (I'm running Horovod within a Docker container), but mpiexec hangs whenever it's invoked. I ran a...
Hi - I'm seeing very high CPU usage (100%+) during the resource-intensive phase that generates autoregressive samples. It is using the GPU, though only about 40-70% of...
This applies in an API-only environment where I'm sending JSON workloads. The workloads are consistently similar, with small changes here and there to certain values, and some...
The documentation gives details on estimating max num tokens. What's less clear to me is how to go about estimating the max batch size for the hardware that's...
Building on main, I still hit the same issue: medium batch sizes (20) freeze the entire engine. This time I got some logs. This is a Llama 2 70B model. Strange issue that...
Hi, we use RC for a Next.js Capacitor app. The app loads the live web build from a Vercel deployment, and we're migrating from this Cordova plugin to the Capacitor...
This is a critical issue: it's not just the one request that freezes; it effectively shuts down the entire Triton server, so all other concurrent requests are frozen and...
Hi - a few weeks ago I opened an issue about a CPU bottleneck, and I finally found the root cause. It wasn't really a CPU bottleneck - it was the CPU managing...