Boris Fomitchev
Hi, this looks like great work on determining landmarks! How do I go from here to an actual aligned image, preferably via torch?
Nice move to include the SFD detector! How fast is your PyTorch version of SFD? I remember trying an earlier port of SFD to PyTorch and it was ~10x slower than...
**Describe the Bug** This happens while exporting one of the NeMo Megatron modules that use tensor_parallel.ColumnParallelLinear. It happens with ToT and used to work with previous releases. Apparently, the problem is that, inference/no-grad...
# What does this PR do ? Add a one line overview of what this PR aims to accomplish. **Collection**: [Note which collection this PR will affect] # Changelog -...
**Is your feature request related to a problem? Please describe.** I can't supply compile spec to trtorchc now. **Describe the solution you'd like** **Describe alternatives you've considered** **Additional context**
Signed-off-by: Boris Fomitchev Fixes #91351 As for unit tests - in this PR I only fixed LSTM unit test to properly use dynamic axes and expose export issue by running...
In atomicAdd overloads, native atomicAdd should be used for __half and __nv_bfloat16, instead of AtomicAddDecimalImpl. Like this: ``` #if defined(USE_ROCM) || (defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 700 || CUDA_VERSION < 10000))...
An issue was filed previously about the lack of support for bfloat16, and it looks like the problem is still there. I am getting the same error: RuntimeError: "_" not implemented for...
This PR depends on https://github.com/rapidsai/cugraph-ops/pull/624 We need to discuss/finalize proposed API changes. Export tests added. Legacy ONNX, TRT and TS tests work, also compile(). dynamo_export and export.export() tests should probably...