Results: 3 comments by cyan
Short sequences are OK. When running long sequences (10k+) repeatedly, it is slow and sometimes gets stuck.
> > @hanzhi713 have you compared [pytorch/pytorch#114001](https://github.com/pytorch/pytorch/pull/114001) with your custom reduce ops? > > I took a glimpse and I would say performance would be similar (essentially the same idea,...
Same issue here. Solved by upgrading TensorRT to version 10.11.0.33.