y-sq

Results 4 issues of y-sq

Summary: The issue: When using float8 training with FSDP, we have these tensors in the forward_backward graph: - Without fp8-all-gather: original_weight (all-gather output, sharded) - fp8_weight - fp8_weight_transpose (needed in...

CLA Signed
fb-exported

Summary: The diff modifies the `padding` option and adds tests with `compile`: * For the scaled_mm of shape MxKxN, the current `inner_padding` option only pads the `K` dimension. However, if...

CLA Signed
fb-exported
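
To illustrate what padding only the `K` dimension means for an MxKxN matmul, here is a minimal pure-Python sketch. It is not the code from the diff: the helper names `pad_to_multiple`, `pad_inner_dim`, and `matmul` are hypothetical, and the alignment of 16 is an assumption made for the example. Zero-padding the shared dimension leaves the product unchanged, because the extra terms all multiply by zero.

```python
def pad_to_multiple(n, multiple=16):
    """Round n up to the nearest multiple (16 is an assumed alignment)."""
    return -(-n // multiple) * multiple


def pad_inner_dim(a, b, multiple=16):
    """Zero-pad the shared K dimension of a (MxK) and b (KxN).

    Hypothetical helper: M and N are left untouched, only K grows.
    """
    k = len(a[0])
    extra = pad_to_multiple(k, multiple) - k
    a_padded = [row + [0.0] * extra for row in a]              # add columns to a
    b_padded = b + [[0.0] * len(b[0]) for _ in range(extra)]   # add rows to b
    return a_padded, b_padded


def matmul(a, b):
    """Plain nested-list matrix multiply, used only to check the result."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]      # M=2, K=3
b = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]    # K=3, N=2
a_p, b_p = pad_inner_dim(a, b)
```

After padding, `a_p` is 2x16 and `b_p` is 16x2, so `M` and `N` keep their original sizes and `matmul(a_p, b_p)` equals `matmul(a, b)`.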

test

CLA Signed

**Summary**: * Added a config option `defer_reduction_split`. When it's enabled, if `num_splits` gets a `>1` result, return `ReductionHint.DEFERRED_SPLIT, 1` instead. * In scheduler, when fusing nodes, if a node is...

fb-exported
topic: not user facing
module: inductor
ciflow/inductor
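
The first bullet of the `defer_reduction_split` summary can be sketched roughly as follows. This is a standalone illustration, not Inductor's code: the `ReductionHint` enum here is a stand-in defined for the example, and `pick_reduction` and its parameters are hypothetical names.

```python
from enum import Enum


class ReductionHint(Enum):
    # Stand-in enum for illustration; not imported from Inductor.
    INNER = 0
    DEFERRED_SPLIT = 1


def pick_reduction(hint, num_splits, defer_reduction_split):
    """Return (hint, num_splits), deferring the split when the option is on.

    Hypothetical helper: if the config option is enabled and the heuristic
    asked for more than one split, report DEFERRED_SPLIT with a single split.
    """
    if defer_reduction_split and num_splits > 1:
        return ReductionHint.DEFERRED_SPLIT, 1
    return hint, num_splits
```

With the option disabled, or when the heuristic already chose a single split, the original hint and split count pass through unchanged.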