bugzyz

Results 7 comments of bugzyz

Also, we did some Nsight GPU profiling on these executions and got some strange findings. We hope these will help. No. 1: optimum took a lot of time on data copy optimum...

Hi @JingyaHuang, thanks for the new feature! Sure, we want to try this to see onnx/optimum's performance with this enhancement. We will try the new release today...

@JingyaHuang Thanks for the new feature. I validated the latest optimum with the previous code. Optimum ran faster than the original transformers one (latency: 2.73s vs 2.13s). 🆘🆘🆘 But there's a warning...

Thanks @JingyaHuang for the confirmation. One more minor finding from us: we used to create the device with `device = torch.device('cuda')`, but it leads to this error after using...

This feature is very valuable for users like us who use optimum. Big thanks! 🎉 @JingyaHuang

We also need support for stateless mode. We tried some app clients such as mcp-inspector, but it doesn't support stateless mode well yet.