cutile-python
cuTile is a programming model for writing parallel kernels for NVIDIA GPUs.
### Is this a new feature, an improvement, or a change to existing functionality? New Feature ### How would you describe the priority of this feature request? Medium ### Please...
### Is this a new feature, an improvement, or a change to existing functionality? Improvement ### How would you describe the priority of this feature request? Medium ### Please provide...
### Version 1.0.0 ### Version 13.1 ### Which installation method(s) does this occur on? Pip ### Describe the bug. `Failed to launch cuTile kernel: PTX JIT compiler library not found`...
``` %input = constant : tile %0 = reduce %input dim=0 identities=[0.000000e+0 : f32] : tile -> tile (%input_arg: tile, %input_accum: tile) { %add_result = addf %input_arg, %input_accum : tile...
Hi team! I found that for some benchmarks in /test, the performance of cuTile Python is much worse than that of PyTorch. Is this normal?
## Description This should not have any changes on the implementation other than cleaning up a potential typo and making the code more consistent. ## Checklist - [x] I am...
The original implementation does not support qk_head_dim != v_head_dim, which is needed in Multi-head Latent Attention. Also fix some test code logic. ## Description The original implementation does not support...
### Description Is there a sample for a fully fused SwiGLU layer? I see that in tile gym there is a [fused_swiglu.py](https://github.com/NVIDIA/TileGym/blob/main/src/tilegym/ops/fused_swiglu.py) example, but that is partially fused using a...
### Is this a new feature, an improvement, or a change to existing functionality? New Feature ### How would you describe the priority of this feature request? High ### Please...
## Description ## Checklist - [x] I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/cutile-python/blob/main/CONTRIBUTING.md). - [x] New or existing tests cover these changes. - [x] The documentation is up to date...