Mario Lezcano Casado
There's still at least one xfail that needs to be removed (there's an "unexpected success" in a test) but otherwise this is ready to go!
@pytorchbot merge
That code you found is from caffe. I don't think that code is tested in CI. So, it seems that it was passing on CUDA (and MPS I guess as...
It also passes on CUDA. See https://github.com/pytorch/pytorch/actions/runs/4004960425/jobs/6876076243 (or see how there are no failing CUDA jobs when you removed the xfail).
@pytorchbot merge
These are benchmarks on different shapes of a softmax. This PR:
```
(1, 67108864) inductor: 857.4533462524414 us
(2, 33554432) inductor: 858.1042289733887 us
(4, 16777216) inductor: 850.0027656555176 us
(8, 8388608) inductor: ...
```
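The benchmark sweep above (same element count, different aspect ratios) could be sketched with a plain NumPy stand-in like the one below. This is only an illustration of the measurement loop: the real numbers in the comment come from inductor (`torch.compile`), which is not used here, and the shapes are scaled down so the sketch runs quickly.

```python
import time
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def bench(shape, iters=5):
    """Time softmax over `shape`, returning mean microseconds per call."""
    x = np.random.rand(*shape).astype(np.float32)
    softmax(x)  # warm-up so allocation/caching doesn't skew the first timing
    start = time.perf_counter()
    for _ in range(iters):
        softmax(x)
    return (time.perf_counter() - start) / iters * 1e6

# Same total element count, different aspect ratios (smaller than in the PR).
for shape in [(1, 1 << 16), (2, 1 << 15), (4, 1 << 14)]:
    print(f"{shape}: {bench(shape):.1f} us")
```

A flat curve across aspect ratios (as in the inductor numbers above) indicates the kernel's throughput does not degrade as rows get shorter and more numerous.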
Running the command in https://github.com/pytorch/pytorch/pull/91316#issuecomment-1363421509, it seems that this patch performs terribly. It needs further investigation into how to make it scale properly.
Closing this one, as this will probably be superseded by the `autotune-max` option. The code in this PR that gives a loose bound on the number of registers needed to...
I think it would be good to wrap this into one umbrella issue that either has a long list or points to one place with a list of these, to avoid...
@thomasjpfan did some experiments with this PR and the compat layer, and it seems like there are still quite a few things that should be sorted. See [this notebook](https://gist.github.com/thomasjpfan/513115f8c6265b83c9fe69ec9f02f11a).