raghavgarg97

Results: 4 issues by raghavgarg97

I was running speedup.sh with a Llama model but got this error trace. The error comes from the file Consistency_LLM/cllm/cllm_llama_modeling.py https://github.com/hao-ai-lab/Consistency_LLM/blob/b2a7283bafd65121e868b92fbeb811aac140be17/cllm/cllm_llama_modeling.py#L154; the code needs to be updated to ```if self.model.config._attn_implementation=='flash_attention_2':``` Do...
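For context, a minimal standalone sketch of the suggested check (the config construction below is illustrative only, not the repository's actual code, which reads the attribute off `self.model.config`):

```python
from transformers import LlamaConfig

# Illustrative config; the repository checks the loaded model's config
# (self.model.config) rather than one built directly like this.
config = LlamaConfig(attn_implementation="flash_attention_2")

# Recent transformers releases record the selected attention backend in
# _attn_implementation, so the guard compares against that attribute
# instead of the deprecated flag the file currently checks.
if getattr(config, "_attn_implementation", None) == "flash_attention_2":
    print("taking the flash-attention-2 code path")
else:
    print("taking the default (eager / SDPA) code path")
```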

Did anyone face this issue?

```
warnings.warn(
Traceback (most recent call last):
  File "test_train.small.gemma.infini.py", line 150, in <module>
    trainer.train()
  File "/transformers/src/transformers/trainer.py", line 1885, in train
    return inner_training_loop(
  File "/transformers/src/transformers/trainer.py", line 2216, in...
```

### Your current environment
```
PyTorch version: 2.5.1+cu124
Is debug build: False
CUDA used to build PyTorch: 12.4
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.5 LTS (x86_64)
GCC...
```

bug

### Has this been supported or requested before?

- [x] I have checked [the GitHub README](https://github.com/QwenLM/Qwen3).
- [x] I have checked [the Qwen documentation](https://qwen.readthedocs.io).
- [x] I have checked the...