J issues

Results: 8 issues

mpiexec hangs on creating pad

Hi, quick issue with mpiexec. Without it the program runs fine with 1 GPU (I am running Horovod within a Docker container), but mpiexec hangs whenever it's invoked. I ran a...

CPU usage in making autoregressive samples

Hi - I seem to get very high CPU usage (100%+) during the resource-intensive phase that generates autoregressive samples. It is using the GPU, though only about 40-70% of...

API calls on same json periodically cause nodes to hang & slow down massively

This applies in an API-only environment where I'm sending JSON workloads. The workloads sent are consistently similar, with small changes here and there on certain values, and some...

Max num tokens & max batch size sanity checks

The documentation explains how to estimate max num tokens. Less clear to me, however, is how to go about estimating the max batch size for the hardware that's...

Illegal memory access when medium batch sizes on using bad_words

Building on main, I still see the same issue: medium batch sizes (20) freeze the entire engine. This time I got some logs. This is a Llama 2 70B model. Strange issue that...

bug

triaged

Matching version of purchases hybrid with Capacitor for migration

Hi, we use RC for a Next.js Capacitor app. The app pulls the live deployed web app from a Vercel deployment, and we're migrating from this Cordova plugin to the Capacitor...

enhancement

Inflight batching freezes whole server when using end_id or when input has \" (and sometimes without known trigger) at moderate batchsizes

This is a critical issue: it's not just the one request being frozen; it effectively shuts down the entire Triton server, so all other concurrent requests are frozen and...

triaged

Option for disabling mmap for safetensors loading for network storage users

Hi - a few weeks ago I opened an issue on a CPU bottleneck, and I finally found the root cause. It wasn't really a CPU bottleneck - it was the CPU managing...