
Fix sharing of resblock layers (from Liger-Kernel#269)

Open loreloc opened this issue 10 months ago • 0 comments

When using multiple residual blocks in the Medusa MLP heads, the parameters are wrongly shared between the blocks.

This was already reported in Hydra (https://github.com/zankner/Hydra/issues/8) and already fixed in the Liger-Kernel repository (https://github.com/linkedin/Liger-Kernel/pull/269).
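For illustration, one common way this kind of sharing arises is building the stack of blocks with list multiplication, which repeats a single object instead of constructing a new one per layer. The sketch below is plain Python and assumes that pattern; the `ResBlock` class, `build_shared`, and `build_independent` are hypothetical stand-ins, not the actual Medusa code.

```python
class ResBlock:
    """Hypothetical stand-in for a residual block; holds one mutable weight list."""

    def __init__(self, hidden_size):
        self.weight = [0.0] * hidden_size


def build_shared(hidden_size, num_layers):
    # List multiplication repeats a reference to the SAME object num_layers times,
    # so every "layer" ends up with the same parameters.
    return [ResBlock(hidden_size)] * num_layers


def build_independent(hidden_size, num_layers):
    # A comprehension calls the constructor once per layer,
    # so each layer owns its own parameters.
    return [ResBlock(hidden_size) for _ in range(num_layers)]


shared = build_shared(4, 3)
independent = build_independent(4, 3)

# Simulate an update to the first block only.
shared[0].weight[0] = 1.0
independent[0].weight[0] = 1.0

# Count distinct objects in each stack and check whether the update leaked.
print(len({id(b) for b in shared}), shared[1].weight[0])
print(len({id(b) for b in independent}), independent[1].weight[0])
```

With real `torch.nn` modules the same aliasing would apply if the blocks were created as `[block] * n`; constructing each block separately should avoid it.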

loreloc · Mar 17 '25 12:03