Lan Gong
Using GitHub Actions is one option. We can explore other options (like Probot) to do the same inter-repository trigger. Regarding the motivation: this is to have an e2e CI build...
Hi @tanmayv25 Thanks for your reply. Our post-processing service (using KServe) uses the open inference protocol, so there's no way for it to accept FP16 output from Triton. However, after some...
Hi @tanmayv25 What I meant earlier by "there's no way for it to accept FP16 output from Triton" is that the open inference protocol does not support the fp16 type, i.e.,...
Hi @tanmayv25 > You don't have to serialize(deserialize) FP16 to raw byte format. Within Triton, onnx backend will directly write the FP16 tensor data into the protobuf message repeated bytes...
Hi @tanmayv25 Thank you for your confirmation. I need to check with the downstream service about supporting `raw_output_content` as `raw_input_content` when dtype is fp16. For now I have implemented a...
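For reference, the raw-bytes round trip discussed in this thread can be sketched in Python with numpy. This is a minimal sketch under the assumption (stated above) that Triton writes the FP16 tensor data as raw little-endian bytes into `raw_output_contents`; the tensor values are illustrative:

```python
import numpy as np

# An FP16 tensor as the ONNX backend would produce it.
fp16_tensor = np.array([[0.5, 1.5], [-2.0, 3.25]], dtype=np.float16)

# Serialize: the raw FP16 bytes as they would appear in the gRPC
# response's raw_output_contents field (2 bytes per element).
raw_bytes = fp16_tensor.tobytes()
assert len(raw_bytes) == fp16_tensor.size * 2

# Deserialize on the consumer side: no per-element conversion,
# just a reinterpretation of the byte buffer plus a reshape.
restored = np.frombuffer(raw_bytes, dtype=np.float16).reshape(fp16_tensor.shape)
assert np.array_equal(restored, fp16_tensor)
```

The point of the sketch is that the downstream service only needs to reinterpret the byte buffer, not add a new wire type.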
Hi @fpetrini15 thank you for your reply! I have the follow-up questions below: 1. I have tried explicitly setting `intra_op_thread_count = ` (the number of maximum cpu cores allowed for...
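For context on question 1, the ONNX Runtime backend's thread counts are set per model in `config.pbtxt` via `parameters` entries. A minimal fragment, with illustrative values (the actual counts to use depend on the container's CPU limit):

```protobuf
parameters { key: "intra_op_thread_count" value: { string_value: "4" } }
parameters { key: "inter_op_thread_count" value: { string_value: "1" } }
```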
Update: The issue with sidecar CPU throttling has been resolved by increasing the CPU cores and memory allocated to the sidecar container, due to the large input size of the image tensors. However,...
Removed "sidecar" from the issue title as it is a separate issue. The open issue is CPU throttling with the main container after configuring ONNX op thread count.
For Linux users, if you see an error like this: ``` &"warning: GDB: Failed to set controlling terminal: Operation not permitted\n" &"Cannot create process: Operation not permitted\n" ``` make sure...