Aaron Orenstein
Tacotron2 causes massive loop unrolling, resulting in very large graphs (26k nodes) that cause inductor (and tracing itself) to choke. The unrolling size is controlled by the environment variable...
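The exact environment variable is elided above; a minimal sketch of how such a knob could gate unrolling, with a hypothetical variable name and function:

```
import os

# Hypothetical knob name; the real variable is the one elided in the
# description above.
MAX_UNROLL = int(os.environ.get("EXAMPLE_MAX_LOOP_UNROLL", "256"))

def should_unroll(trip_count: int) -> bool:
    # Only unroll loops whose trip count stays under the cap, so tracing
    # does not produce graphs with tens of thousands of nodes.
    return trip_count <= MAX_UNROLL
```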
A small medley of fixes:
- When validating sparse tensor indices, don't check numel() if it's symbolic.
- When validating sparse tensor indices, if the indices are a FakeTensor then...
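A minimal sketch of what those two guards might look like; the validator name is hypothetical and the FakeTensor branch's behavior (skipping the data-dependent bounds check) is an assumption, since the description is truncated:

```
import torch
from torch._subclasses.fake_tensor import FakeTensor

def validate_sparse_indices(indices: torch.Tensor, dim_size: int) -> None:
    # Hypothetical validator sketching the two guards described above.
    numel = indices.numel()
    if not isinstance(numel, torch.SymInt):
        # Only compare concrete counts; a symbolic numel() has no value
        # to check against.
        if numel == 0:
            return
    if isinstance(indices, FakeTensor):
        # Fake tensors carry no real data, so value-level bounds checks
        # are skipped here (assumption about the elided part above).
        return
    if not bool(((indices >= 0) & (indices < dim_size)).all()):
        raise ValueError("sparse index out of range")
```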
`Tensor.__repr__` calls functions which can perform logging, which ends up logging `self` (invoking `__repr__` again) and causing an infinite loop. Detect this in `__repr__` and early-out instead of recursing. Another possible fix...
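A minimal sketch of the reentrancy-guard pattern on a stand-in class (not the actual Tensor code):

```
import threading

_in_repr = threading.local()

class LoggedTensor:
    def __repr__(self):
        # If __repr__ is entered again while already formatting (e.g.
        # because logging inside repr logs `self`), return a placeholder
        # instead of recursing forever.
        if getattr(_in_repr, "active", False):
            return f"<{type(self).__name__} (repr reentered)>"
        _in_repr.active = True
        try:
            return self._expensive_repr_that_may_log()
        finally:
            _in_repr.active = False

    def _expensive_repr_that_may_log(self):
        # Stand-in for the real repr, which may trigger logging of self.
        return f"{type(self).__name__}(...)"
```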
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #124545
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #125312 * __->__ #124226 * #124225 * #124224 * #124223 * #122911
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #125312 * #124226 * __->__ #124225 * #124224 * #124223 * #122911
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #125312 * #124226 * #124225 * __->__ #124224 * #124223 * #122911
When dispatching a fake tensor op, we cache the result with `(op, args)` as the key. There are some args (such as one with a dynamic output shape) where the...
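A rough sketch of such a cache; the names are hypothetical, and treating dynamic-output-shape args as a cache bypass is an assumption since the sentence above is truncated:

```
from typing import Any, Dict, Tuple

class _BypassCache(Exception):
    """Raised while building a key for args that must not be cached."""

_cache: Dict[Tuple[Any, ...], Any] = {}

def _make_key(op, args):
    for a in args:
        if getattr(a, "has_dynamic_output_shape", False):
            # Some inputs can't be represented in the key; fall back to
            # computing the result directly (assumption).
            raise _BypassCache
    return (op, tuple(args))

def cached_dispatch(op, args, compute):
    try:
        key = _make_key(op, args)
    except _BypassCache:
        return compute(op, args)
    if key not in _cache:
        _cache[key] = compute(op, args)
    return _cache[key]
```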
When constructing a `FakeTensorMode`, instead of immediately formatting a full stack trace, grab the traceback and only format it on demand. This yields a 4.2% FakeTensor perf win on the microbenchmark: ``` import...
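The microbenchmark itself is truncated above; here is a sketch of the lazy-formatting idea, deferring source-line lookup and string formatting until the trace is actually printed:

```
import traceback

class LazyTraceback:
    """Capture the raw stack cheaply; format the string only on demand."""

    def __init__(self):
        # Skip source-line lookup at capture time; linecache fills it in
        # later if/when the trace is formatted.
        self._summary = traceback.StackSummary.extract(
            traceback.walk_stack(None), lookup_lines=False
        )
        self._formatted = None

    def __str__(self):
        if self._formatted is None:
            self._formatted = "".join(self._summary.format())
        return self._formatted
```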
We save and restore the DynamicLayerStack during frame eval, but since an fx graph has no way to express a try/finally, we just assume it will happen. If we throw an...
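A sketch of the intended save/restore semantics, assuming the stack can be modeled as a plain Python list (the real DynamicLayerStack is internal to functorch):

```
from contextlib import contextmanager

@contextmanager
def preserve_stack(stack):
    # Snapshot the stack before running the traced region and put it back
    # even if an exception escapes; the fx graph itself cannot express this
    # try/finally, so it has to live in the surrounding eval wrapper.
    saved = list(stack)
    try:
        yield
    finally:
        stack[:] = saved
```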