dan_the_3rd

Results: 14 issues by dan_the_3rd

## 🐞 the repeat operation does not work with dynamic inputs - See code: "repeat" only seems to work with constant values for the reps - The same problem happens...

bug
triaged
pytorch
Flexable Shape

**Describe the bug** Fused GEMM example gives the wrong result for some values of `problemSize1.K`. **Steps/Code to reproduce bug** Set the following problem sizes in `examples/13_two_tensor_op_fusion/fused_two_gemms_f16_sm80_shmem.cu` ```c++ cutlass::gemm::GemmCoord gemm_f16_sm80_problem_size_0(128*640, 48,...

bug
? - Needs Triage

Hello, when transferring a lot of data over a TCP tunnel, the terminal freezes during the transfer (this also happens to me with regular SSH tunnels), and sometimes for even longer...

## Is your feature request related to a problem? Please describe. Currently, TorchServe adds headers that prevent caching of the inference results: https://github.com/pytorch/serve/blob/30f83500b0850e26ec55581f48a9307b1986f9f9/frontend/server/src/main/java/org/pytorch/serve/util/NettyUtils.java#L187-L190 This prevents some reverse proxies like `nginx` from...

enhancement
help wanted

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #540 * #539

CLA Signed

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #540 * __->__ #539 ... by reducing the number of ATen imports, and skipping them altogether when building the actual kernels 13mn ->...

CLA Signed

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #495 cc @tridao

CLA Signed

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #467 **PERFORMANCE** This makes performance worse in f16 :( But I think we need it for stability bw P100/V100 (f32/f16) ``` [----------------------------------------...

CLA Signed