jjsjann123 issues

Results: 31 issues of jjsjann123

Limits constant chunk propagation for pw-node-only

Fixes #82899 TODO: - [ ] python test for the breakage.

oncall: jit

open source

cla signed

[NVFUSER] refactor nvfuser build

This PR is the first step towards refactoring the nvfuser build so that the codegen becomes a standalone library. Contents of this PR: 1. nvfuser code base...

module: cpu

triaged

open source

NNC

ciflow/trunk

release notes: jit

release notes: quantization

module: nvfuser

skip-pr-sanity-checks

module: inductor

module: dynamo

ciflow/inductor

codegen error: RuntimeError: producer->getMemoryType() == MemoryType::Global || producer->getMemoryType() == MemoryType::Shared INTERNAL ASSERT FAILED

### 🐛 Describe the bug The issue was coming from lots of pytorch slow_tests. ``` %kernel { T25_l[ ithreadIdx.x212{( ceilDiv(( ceilDiv(( ceilDiv(( i5 * ( i6 * 1 ) ),...

bookend `view` should be stripped from fusion

### 🚀 The feature, motivation and pitch Currently the handling of view in scheduler is sub-optimal. For views inside the fusion group that connects fusion, it makes sense, since this...

index_select + cast triggers codegen failure. `RuntimeError: Lookup input must be a fusion input`

### 🐛 Describe the bug

Python repro executed with:

```
PYTORCH_NVFUSER_DISABLE=fallback PYTORCH_NVFUSER_DUMP=segmented_fusion PYTORCH_NVFUSER_ENABLE=graph_op_fusion PYTORCH_JIT_LOG_LEVEL=graph_fuser python repro_index_select.py
```

```
# repro_index_select.py
import torch

torch._C._jit_set_nvfuser_single_node_mode(True)

def fn(x, y):
    o = torch.index_select(x, 0,...
```

Mega issue tracking `torchbenchPerf` on benchmark runs

### 🐛 Describe the bug Just a common place tracking all issues that we run into with our benchmark. ## Functional Issues - [x] `RuntimeError: rhs_i >= 0 && lhs_i...

codegen missing fp16 math support

### 🐛 Describe the bug For the issues I'm seeing in our benchmark, I think we can work around it by making batch_norm promotion explicit in the graph. So I'll...
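As a rough illustration of what "explicit promotion" around batch_norm could look like at the Python level, here is a minimal sketch. It assumes inference mode and plain eager PyTorch. The helper name `batch_norm_with_explicit_promotion` is hypothetical, and this is not the graph-level change discussed in the issue: it only shows the idea of casting fp16 inputs up to fp32, running the op, and casting the result back.

```python
import torch
import torch.nn.functional as F


def batch_norm_with_explicit_promotion(x, running_mean, running_var, weight, bias, eps=1e-5):
    """Hypothetical helper: run inference-mode batch norm in fp32, return the input dtype."""
    out = F.batch_norm(
        x.float(),             # explicit up-cast of the (possibly fp16) input
        running_mean.float(),
        running_var.float(),
        weight=weight.float(),
        bias=bias.float(),
        training=False,        # assumption: inference mode, running stats are used as-is
        eps=eps,
    )
    return out.to(x.dtype)     # explicit down-cast back to the original dtype


# Small CPU example with an fp16 input and fp32 statistics.
x = torch.randn(4, 3, 8, 8).half()
running_mean = torch.zeros(3)
running_var = torch.ones(3)
weight = torch.ones(3)
bias = torch.zeros(3)
y = batch_norm_with_explicit_promotion(x, running_mean, running_var, weight, bias)
```

With the casts written out like this, the math itself runs in fp32 and only the boundaries touch fp16. Whether that matches the intended workaround depends on details not visible in the truncated description.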

Codegen error: `root_vals.find(inp) != root_vals.end() INTERNAL ASSERT FAILED ... Invalid tensor domain`

### 🐛 Describe the bug Running into an issue on `torchbenchPerf`. (Note that the cpp repro also fails on devel with a different error, which is already patched in #2067, but...

Codegen Error: thread_predicates_.find(tv_inp) != thread_predicates_.end() INTERNAL ASSERT FAILED ... Thread predicate map was not initialized

### 🐛 Describe the bug Got this issue from model `pnasnet5large` `python -u benchmarks/timm_models.py --training -d cuda --fast --backend nvprims_nvfuser --skip-accuracy-check --performance --generate-aot-autograd-stats -k pnasnet5large` Error message: ``` C++ exception...

codegen error: Reducing a tensor once it's gone under transformations is not permitted at this time. Please set reductions before calling split/merge/computeAt

### 🐛 Describe the bug

Repro python script

```
import torch
from torch._C._nvfuser import FusionDefinition, Fusion, DataType

def nvfuser_fusion_id6(fd : FusionDefinition) -> None :
    T0 = fd.define_tensor(symbolic_sizes=[-1, -1, -1], contiguous=[False,...
```