jjsjann123

Results 31 issues of jjsjann123

Fixes #82899 TODO: - [ ] python test for the breakage.

oncall: jit
open source
cla signed

This PR is the first step towards refactoring the build for nvfuser in order to make the codegen a standalone library. Contents inside this PR: 1. nvfuser code base...

module: cpu
triaged
open source
NNC
ciflow/trunk
release notes: jit
release notes: quantization
module: nvfuser
skip-pr-sanity-checks
module: inductor
module: dynamo
ciflow/inductor

### 🐛 Describe the bug The issue was coming from lots of pytorch slow_tests. ``` %kernel { T25_l[ ithreadIdx.x212{( ceilDiv(( ceilDiv(( ceilDiv(( i5 * ( i6 * 1 ) ),...

### 🚀 The feature, motivation and pitch Currently the handling of view in scheduler is sub-optimal. For views inside the fusion group that connects fusion, it makes sense, since this...

### 🐛 Describe the bug python repro executed with: ``` PYTORCH_NVFUSER_DISABLE=fallback PYTORCH_NVFUSER_DUMP=segmented_fusion PYTORCH_NVFUSER_ENABLE=graph_op_fusion PYTORCH_JIT_LOG_LEVEL=graph_fuser python repro_index_select.py ``` ``` # repro_index_select.py import torch torch._C._jit_set_nvfuser_single_node_mode(True) def fn(x, y): o = torch.index_select(x, 0,...

### 🐛 Describe the bug Just a common place to track all issues that we run into with our benchmark. ## Functional Issues - [x] `RuntimeError: rhs_i >= 0 && lhs_i...

### 🐛 Describe the bug For the issues I'm seeing in our benchmark, I think we can work around it by making batch_norm promotion explicit in the graph. So I'll...

### 🐛 Describe the bug Running into an issue on `torchbenchPerf`. (note that the cpp repro also fails on devel with a different error, which is already patched in #2067, but...

### 🐛 Describe the bug Got this issue from model `pnasnet5large` `python -u benchmarks/timm_models.py --training -d cuda --fast --backend nvprims_nvfuser --skip-accuracy-check --performance --generate-aot-autograd-stats -k pnasnet5large` Error message: ``` C++ exception...

### 🐛 Describe the bug Repro python script ``` import torch from torch._C._nvfuser import FusionDefinition, Fusion, DataType def nvfuser_fusion_id6(fd : FusionDefinition) -> None : T0 = fd.define_tensor(symbolic_sizes=[-1, -1, -1], contiguous=[False,...