Eikan Wang issues

Results 15 issues of Eikan Wang

Fix the performance issue that the for-loop before ExternalCall could not be parallelized.

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #85056 Currently, NNC only parallelizes the loop statement of the graph outputs. The logic could bypass some loop statements that could be...

oncall: jit

open source

cla signed

NNC

intel priority

release notes: jit

Use high precision accumulate buffer for bf16 accumulation

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #85140 * __->__ #84402 * #84041 Accumulation is not friendly to BFloat16 because its mantissa is only 7 bits while the operand...

oncall: jit

open source

cla signed

NNC

release notes: jit

Support BF16ImmPtr

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #85140 * #84402 * __->__ #84041 - Support BF16 immediate values by converting them to uint16. The behavior is the same as...

oncall: jit

open source

cla signed

NNC

release notes: jit

Optimize `to` if the datatype of the source tensor is the same as the dest datatype

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #85140 * #84402 * #84041 The AMP inserts `_autocast_to_reduced_precision` and `_autocast_to_full_precision` automatically. The aten implementation provides a fast path to bypass the...

oncall: jit

open source

cla signed

NNC

release notes: jit

[Inductor] Eliminate redundant to_dtype node

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #96650 cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @soumith @voznesenskym @penguinwu @anijain2305 @Guobing-Chen @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire

open source

ciflow/trunk

topic: not user facing

intel

module: inductor

ciflow/inductor

Modularize aten parameter parser and checker

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #126883 * #126517 * #125897 * #125831 * #125819 * __->__ #125308 * #124926 In this PR, we abstracted the different types of...

open source

oncall: pt2

module: inductor

module: dynamo

ciflow/inductor

Support aten operations with out tensor

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #126883 * #126517 * #125897 * #125831 * #125819 * #125308 * __->__ #124926 This PR intends to support the aten operations with...

open source

module: inductor

ciflow/inductor

release notes: AO frontend

[2/N] Non-Tensor: Scalar Support: Add scalar to the cache for eager-through-torch.compile

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #125308 * #124926 * __->__ #124070 * #124177 * #116368 * #124836 Add scalar information to the kernel configuration. #### Additional Context Currently,...

open source

topic: not user facing

module: inductor

ciflow/inductor

[1/N] Non-Tensor: Scalar Support: Enable aot compile to support aten operations with scalar input like alpha

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #125308 * #124926 * #124070 * __->__ #124177 * #116368 * #124836 Some operations have a scalar input parameter, like `torch.add(a, b, alpha=2.0)`....

open source

topic: not user facing

module: inductor

ciflow/inductor

Add a cache mechanism to accelerate torch.compile-for-eager

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #125308 * #124926 * #124070 * #124177 * __->__ #116368 * #124836 This PR is a follow-up of RFC https://github.com/pytorch/pytorch/issues/115545. In this PR,...

open source

topic: not user facing

module: inductor

module: dynamo

ciflow/inductor