Kevin Stephano

Results 10 issues of Kevin Stephano

NM

triaged
open source
cla signed
module: nvfuser

1. The Transformer data processing scripts imported logging modules that did not exist, which caused errors for users attempting to run the scripts. 2. Updated the download path for...

### 🚀 The feature, motivation and pitch The task is to fuse `log_softmax+gather`. Naoya said it depends on his [resize function work](https://github.com/csarofeen/pytorch/pull/2480). The idea being that the tensor output of...
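
For reference, the following is a minimal pure-Python sketch of what `log_softmax` followed by `gather` along the last dimension computes. It is not nvFuser code and says nothing about how the fusion would be scheduled. The helper names (`log_softmax`, `gather_last_dim`, `log_softmax_then_gather`, `log_softmax_gather_single_pass`) are hypothetical, and tensors are modeled as lists of rows purely for illustration.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax of one row (a list of floats)."""
    m = max(row)
    # log-sum-exp with the row maximum subtracted for stability
    lse = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - lse for v in row]


def gather_last_dim(rows, index):
    """Pick, for each row, the elements at the positions listed in index."""
    return [[row[i] for i in idx] for row, idx in zip(rows, index)]


def log_softmax_then_gather(rows, index):
    """Two-step reference: build the full log-softmax, then gather from it."""
    return gather_last_dim([log_softmax(r) for r in rows], index)


def log_softmax_gather_single_pass(rows, index):
    """Same result in one pass per row: only the selected elements are produced."""
    out = []
    for row, idx in zip(rows, index):
        m = max(row)
        lse = m + math.log(sum(math.exp(v - m) for v in row))
        out.append([row[i] - lse for i in idx])
    return out
```

Both functions return the same values; the single-pass version simply avoids building the full log-softmax result before selecting from it.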

### 🚀 The feature, motivation and pitch `sign` has the behavior `x / abs(x)` for complex. ### Alternatives _No response_ ### Additional context _No response_
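
A minimal sketch of the complex behavior described above, in plain Python rather than nvFuser code. `complex_sign` is a hypothetical helper name. Returning zero for a zero input is an assumption here, since the issue text only gives `x / abs(x)`; it matches what `torch.sgn` does for complex tensors.

```python
def complex_sign(z: complex) -> complex:
    """Return z / |z|, a unit-magnitude value with the same angle as z."""
    magnitude = abs(z)
    if magnitude == 0:
        # Assumption: define the sign of zero as zero to avoid dividing by zero.
        return 0j
    return z / magnitude
```

For example, `complex_sign(3 + 4j)` is approximately `0.6 + 0.8j`, since `abs(3 + 4j)` is `5.0`.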

mruberry

### 🐛 Describe the bug I have a horizontal fusion situation with `reshape`, and I would like to understand whether it can be fused. I think we have a knob...

### 🚀 The feature, motivation and pitch PyTorch supports tensor dimensions of up to size 25.

mruberry

### 🚀 The feature, motivation and pitch Add CUDA kernel and scheduled IR print functions to the `FusionDefinition` in Python. Perhaps with an API like this? ``` fd.cuda_kernel(inputs) fd.last_executed_cuda_kernel() fd.scheduled_ir(inputs) fd.last_scheduled_ir() ```...

mruberry

### 🐛 Describe the bug Benchmark command line: ``` PYTORCH_NVFUSER_DUMP=python_definition,fusion_args python -u benchmarks/huggingface.py --training -d cuda --fast --backend nvprims_nvfuser --skip-accuracy-check --performance --only BertForMaskedLM --amp ``` Fusion Repro: ``` import torch from...

### 🐛 Describe the bug This is our number one failure signature. Important to fix! This is the error: ``` ERROR:common:Failed for dynamo CUDA error: operation failed due to a...

### 🐛 Describe the bug Benchmark Command: ``` python -u benchmarks/huggingface.py --training -d cuda --fast --backend nvprims_nvfuser --skip-accuracy-check --performance --only BertForMaskedLM --amp ``` Log_Softmax is notably not being fused even...