Tcc0403 issues

Results: 19 issues of Tcc0403

Ensure In-place correctness checks work properly

## Summary Fix #272 This is a showcase of how to trigger the error properly. I only applied it to cross_entropy for demonstration; it can be applied to others if we want. ##...

Add ignore_index and label to jsd and fl-jsd

## Summary Resolve #277. ## Testing Done - Hardware Type: gpu-ci - [x] run `make test` to ensure correctness - [x] run `make checkstyle` to ensure code style - [x]...

In-place operations in triton kernel might result in incorrect gradient calculations

### 🐛 Describe the bug #254 [#262 (comments)](https://github.com/linkedin/Liger-Kernel/pull/262#issuecomment-2374260041) PyTorch’s autograd system records operations on tensors to construct a computational graph, which is used for computing gradients. When an in-place operation...

bug

Support Z Loss in CE

## Summary This PR aims to resolve #197 Implemented z loss in LigerCrossEntropy. Note: `lse_square_scale` is not exposed at flce yet; it is having issues passing the tests. ## Details ### For loss:...

reviewing
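As a rough illustration of the z loss idea mentioned above, here is a minimal pure-Python sketch. It assumes the commonly used definition, where the auxiliary term is `lse_square_scale * logsumexp(logits)**2` added to the ordinary cross entropy; the truncated PR text does not confirm the exact formula, and the function names below are hypothetical, not the LigerCrossEntropy API.

```python
import math


def logsumexp(logits):
    # Numerically stable log(sum(exp(x))) over a list of floats.
    m = max(logits)
    return m + math.log(sum(math.exp(v - m) for v in logits))


def cross_entropy_with_z_loss(logits, target, lse_square_scale=0.0):
    """Hypothetical helper: cross entropy plus an assumed z loss term.

    Returns (total_loss, z_loss) for a single example.
    """
    lse = logsumexp(logits)
    ce = lse - logits[target]              # standard cross entropy
    z_loss = lse_square_scale * lse ** 2   # assumed z loss definition
    return ce + z_loss, z_loss


total, z = cross_entropy_with_z_loss([0.0, 0.0], target=0, lse_square_scale=1e-4)
```

With `lse_square_scale=0.0` the function reduces to plain cross entropy, which is one way to sanity-check an implementation against a reference.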

CI failure tracker

### 🐛 Describe the bug Most failures are related to transformers VLM changes ## unit test qwen2vl_mrope - [x] test_qwen2vl_mrope https://github.com/linkedin/Liger-Kernel/pull/728 monkey patch - [x] test_monkey_patch::test_apply_liger_kernel_to_instance_for_mllama_for_conditional_generation https://github.com/linkedin/Liger-Kernel/pull/737 - [x] test_monkey_patch::test_apply_liger_kernel_to_instance_for_gemma3...

CI failure due to transformers VLM change

### 🐛 Describe the bug CI details: [Qwen2VLConfig](https://github.com/linkedin/Liger-Kernel/actions/runs/15175089532/job/42680679210?pr=689#step:6:6549), [monkey patch impl related](https://github.com/linkedin/Liger-Kernel/actions/runs/15214784023/job/42797584570#step:5:1903) Text config is separated out from the general config in transformers>=4.52.0. [Qwen2VLRotaryEmbedding](https://github.com/huggingface/transformers/pull/37268/files#diff-09bc594f9680f1d042fd485106c68022d77b59831697a00b3b38f12a3e40f395L103-R104) takes `Qwen2VLTextConfig` instead of `Qwen2VLConfig` now....

bug

huggingface
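One way code could tolerate both config layouts described above is to look for a nested text config and fall back to the top-level object. The sketch below uses stand-in objects rather than real transformers classes; `resolve_text_config` and the `text_config` attribute lookup are illustrative assumptions, not the fix adopted in the repository.

```python
from types import SimpleNamespace


def resolve_text_config(config):
    """Hypothetical helper: prefer a nested text config when one exists."""
    nested = getattr(config, "text_config", None)
    return nested if nested is not None else config


# Stand-ins for the two layouts (attribute names are assumptions).
flat_config = SimpleNamespace(hidden_size=8)
nested_config = SimpleNamespace(text_config=SimpleNamespace(hidden_size=8))

# Both layouts yield an object exposing the text model's settings.
sizes = [resolve_text_config(c).hidden_size for c in (flat_config, nested_config)]
```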

transformers VLM base model change

### 🚀 The feature, motivation and pitch Upcoming refactor in transformers VLM models: https://github.com/huggingface/transformers/pull/37033 `XXXForConditionalGeneration` no longer has a `language_model` attribute for `ForCausalLM`. It will be changed to a `model` attribute to...

[WIP] Update benchmark data

## Summary Rerun all benchmark scripts to get the latest data, so we can have a reliable baseline for future optimization. Note: orpo failing with `compile=True` (plotting with old data...

[RFC] More robust revert functions for convergence tests

### 🐛 Describe the bug Many discussions show that the current revert functions have several limitations, including: - incomplete revert: https://github.com/linkedin/Liger-Kernel/pull/627#issuecomment-2757281103 #542 - not automatically updating old reference: #385 -...

`revert_liger_kernel_to_xxx` can't revert LigerCrossEntropyLoss for transformers>=4.46.1

### 🐛 Describe the bug #369 found that CrossEntropyLoss wasn't applied in post-grad-acc-fix versions of transformers. Despite the fact that #375 fixed the issue, it didn't consider the revert functions...

bug