
Changes for basic LLaDA style diffusion masking support

Open gopeshh opened this issue 10 months ago • 0 comments

✨ Description

Cleaned up the code a bit:

  1. Added a Diffusion config object, as we discussed
  2. Removed noise schedules for v1
  3. Moved the loss calculation to head.py (as I noticed the language modelling loss is computed there)
  4. Moved bidirectional attention to preprocessing.py, since the attention mask appears to be computed there
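
For reviewers less familiar with LLaDA-style masking, here is a rough, self-contained sketch of the three pieces the list above touches: random masking at ratio `t`, a bidirectional (non-causal) attention mask, and a loss restricted to masked positions. All function names, the `mask_id` value, and the plain-list representation are hypothetical and chosen for illustration; this is not the code in head.py or preprocessing.py, and it assumes the per-token log-probabilities of the true tokens are already available.

```python
import math
import random


def mask_tokens(tokens, mask_id, t, rng):
    """Replace each token with mask_id independently with probability t."""
    is_masked = [rng.random() < t for _ in tokens]
    noisy = [mask_id if m else tok for tok, m in zip(tokens, is_masked)]
    return noisy, is_masked


def attention_mask(seq_len, bidirectional):
    """Boolean seq_len x seq_len mask; True means query i may attend to key j.

    Causal masking keeps only j <= i; bidirectional masking keeps everything.
    """
    return [[bidirectional or j <= i for j in range(seq_len)] for i in range(seq_len)]


def diffusion_loss(log_probs, is_masked, t):
    """Negative log-likelihood over masked positions only.

    log_probs[i] is the log-probability assigned to the true token at position i.
    The sum is weighted by 1/t and normalized by sequence length (assumed here).
    """
    total = -sum(lp for lp, m in zip(log_probs, is_masked) if m)
    return total / (t * len(log_probs))


if __name__ == "__main__":
    rng = random.Random(0)
    t = rng.uniform(0.1, 1.0)  # masking ratio sampled once per sequence
    noisy, is_masked = mask_tokens([11, 12, 13, 14], mask_id=0, t=t, rng=rng)
    print(noisy, is_masked)
    print(attention_mask(3, bidirectional=True))
```

Unmasked positions contribute nothing to the loss, which is why the attention mask has to be bidirectional: each masked token is predicted from context on both sides.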

This is still a WIP, of course, but feel free to leave comments and suggestions.

These changes address this issue: https://github.com/ServiceNow/Fast-LLM/issues/208#issue-2950083282

gopeshh, Apr 21 '25 12:04