FBGEMM

Support local mask

Open Aya-ZIbra opened this issue 4 months ago • 2 comments

Summary: This diff adds local mask support to the decode attention implementation. It adds window_left and window_right parameters to the decode function, adds a Mask template parameter to the GenRunner class, adds a Mask parameter to collective_builder, and adds window_size_left and window_size_right parameters to the load_cpasync_warpspecialized class.
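To make the window parameters concrete, here is a minimal Python sketch of one common local (sliding-window) mask convention. It is not taken from the diff. It assumes that a key at position k is visible to a query at position q when q - window_left <= k <= q + window_right, that a negative window size means "unbounded on that side", and that the single decode query sits at the last key position. The function names `local_mask_visible` and `decode_visible_keys` are hypothetical.

```python
def local_mask_visible(q_pos, k_pos, window_left, window_right):
    """Return True if key k_pos is visible to query q_pos.

    Assumed convention (not confirmed by the diff): a negative window
    size disables the bound on that side.
    """
    if window_left >= 0 and k_pos < q_pos - window_left:
        return False  # key is too far to the left of the query
    if window_right >= 0 and k_pos > q_pos + window_right:
        return False  # key is too far to the right of the query
    return True


def decode_visible_keys(seqlen_k, window_left, window_right):
    """List the key positions a single decode query may attend to.

    Assumption: the decode query is aligned to the last key position.
    """
    q_pos = seqlen_k - 1
    return [
        k for k in range(seqlen_k)
        if local_mask_visible(q_pos, k, window_left, window_right)
    ]


if __name__ == "__main__":
    # With 8 cached keys and a left window of 2, only the last 3 keys are visible.
    print(decode_visible_keys(8, window_left=2, window_right=0))
```

Under these assumptions, a decode step with a left window of 2 only needs to load the last three keys, which is why the window sizes are also relevant to the load path.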

Currently, softmax is applied in a three-loop setting. Next: optimize these iterations and benchmark performance.
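The description does not spell out what the three loops are. One plausible reading, offered here only as an assumption, is that the key/value tiles are split into a leading range that straddles the left window edge, a middle range that lies fully inside the window and needs no per-element masking, and a trailing range that straddles the right edge. The Python sketch below illustrates that split; `split_kv_tiles`, `tile_n`, `lo`, and `hi` are hypothetical names, not identifiers from the diff.

```python
def split_kv_tiles(seqlen_k, tile_n, lo, hi):
    """Classify KV tiles against a visible key range [lo, hi] (inclusive).

    Returns (leading_masked, unmasked, trailing_masked) lists of tile
    indices. Tiles that fall entirely outside the window are skipped.
    A tile that straddles both edges is reported as leading_masked.
    """
    leading, unmasked, trailing = [], [], []
    num_tiles = (seqlen_k + tile_n - 1) // tile_n  # ceiling division
    for t in range(num_tiles):
        start = t * tile_n
        end = min(start + tile_n, seqlen_k) - 1  # last valid key in the tile
        if end < lo or start > hi:
            continue  # fully outside the window: no work needed
        if start >= lo and end <= hi:
            unmasked.append(t)  # fully inside: masking can be skipped
        elif start < lo:
            leading.append(t)  # crosses the left window edge
        else:
            trailing.append(t)  # crosses the right window edge
    return leading, unmasked, trailing


if __name__ == "__main__":
    # 16 keys, tiles of 4, visible keys 5..14
    print(split_kv_tiles(16, 4, 5, 14))
```

If this reading is right, only the boundary tiles pay for element-wise masking, so merging or shortening the boundary loops is a natural target for the optimization mentioned above.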

Differential Revision: D84778050

Aya-ZIbra avatar Oct 16 '25 16:10 Aya-ZIbra

@Aya-ZIbra has exported this pull request. If you are a Meta employee, you can view the originating Diff in D84778050.

meta-codesync[bot] avatar Oct 16 '25 16:10 meta-codesync[bot]

Deploy Preview for pytorch-fbgemm-docs ready!

Latest commit: 06d64245676181ca7588b385db2098ae98bf9a26
Latest deploy log: https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/68f123b8248d2900089c65e7
Deploy Preview: https://deploy-preview-5015--pytorch-fbgemm-docs.netlify.app

netlify[bot] avatar Oct 16 '25 16:10 netlify[bot]