Hardcodes `float32` for the activation layer. This prevents the dtype from being overridden by the following call:
```
from tensorflow.keras import mixed_precision
mixed_precision.set_global_policy('mixed_float16')
```
This can help enable mixed precision training...
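A minimal sketch of the pattern (layer sizes and model structure are illustrative, not from the PR): the global policy runs most layers in `mixed_float16`, while the final activation is pinned to `float32`.

```python
import tensorflow as tf
from tensorflow.keras import layers, mixed_precision

# Most layers compute in float16 with float32 variables under this policy.
mixed_precision.set_global_policy('mixed_float16')

inputs = tf.keras.Input(shape=(128,))
x = layers.Dense(256, activation='relu')(inputs)      # follows the global policy
logits = layers.Dense(10)(x)
# Hardcoding dtype='float32' pins the activation layer, so the policy above
# cannot override it and the softmax stays numerically stable.
outputs = layers.Activation('softmax', dtype='float32')(logits)

model = tf.keras.Model(inputs, outputs)
```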
The list of audio files is created and populated inside the main dataset class, which can lead to higher memory usage.
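A minimal sketch of the concern, with hypothetical class and directory names: the eager version holds every path in memory for the lifetime of the dataset object, while a lazy generator yields paths on demand.

```python
from pathlib import Path

class AudioDatasetEager:
    def __init__(self, data_dir: str):
        # Eager: the full list of audio paths is built up front and
        # kept in memory for as long as the dataset object lives.
        self.audio_files = list(Path(data_dir).rglob("*.wav"))

class AudioDatasetLazy:
    def __init__(self, data_dir: str):
        self.data_dir = Path(data_dir)

    def iter_audio_files(self):
        # Lazy: paths are yielded one at a time, never materialized
        # as a single in-memory list.
        yield from self.data_dir.rglob("*.wav")
```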
Currently, whenever we do a fresh install of audiotoken and import it with `import audiotoken`, the models are downloaded then and there. The models should instead be loaded/downloaded lazily...
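A minimal sketch of lazy loading, with hypothetical function names (the real download/load logic in audiotoken is only hinted at): nothing runs at import time; the first call pays the download cost and the result is cached.

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def _get_model():
    # Runs only on the first call, so `import audiotoken` alone would
    # no longer trigger a download. The body stands in for whatever
    # audiotoken actually does to fetch and load its checkpoints.
    print("downloading and loading model weights ...")
    return "loaded-model"          # placeholder for the real model object

def encode(audio):
    model = _get_model()           # first use triggers the (cached) load
    return model, audio            # placeholder for real tokenization
```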
- Fused Q,K,V projection into one matmul
- Fused MHA into one single layer instead of a concatenation of 12
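A minimal sketch of the fused Q/K/V projection (PyTorch assumed; dimensions and names are illustrative): a single `nn.Linear` of width `3 * d_model` replaces three separate projections, and its output is split into Q, K and V.

```python
import torch
import torch.nn as nn

class FusedQKV(nn.Module):
    def __init__(self, d_model: int = 768, n_heads: int = 12):
        super().__init__()
        self.n_heads = n_heads
        self.head_dim = d_model // n_heads
        # One matmul produces Q, K and V instead of three separate projections.
        self.qkv = nn.Linear(d_model, 3 * d_model)

    def forward(self, x: torch.Tensor):
        b, t, _ = x.shape
        qkv = self.qkv(x)                                    # (b, t, 3 * d_model)
        qkv = qkv.view(b, t, 3, self.n_heads, self.head_dim)
        q, k, v = qkv.unbind(dim=2)                          # each (b, t, n_heads, head_dim)
        # Move heads ahead of the sequence dim for attention: (b, n_heads, t, head_dim)
        return q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2)
```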
Solves #2864 for `target_modules`.

Enables the `ensure_weight_tying` flag in `LoraConfig` for `target_modules`. For LoRA, if any of the tied layers are added to `target_modules` and `ensure_weight_tying == True`, the adapters added...
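A minimal usage sketch, assuming the flag lands on `LoraConfig` as described here (the base model and the tied module names are illustrative):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")   # illustrative base model

config = LoraConfig(
    r=8,
    target_modules=["wte", "lm_head"],   # tied input/output embedding layers
    ensure_weight_tying=True,            # flag proposed in this PR: keep adapters on tied layers in sync
)
peft_model = get_peft_model(model, config)
```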
# What does this PR do?

Adds support for `mamba_ssm` and `causal_conv1d` kernels from the kernel-hub in bamba models.

Fixes # (issue) https://github.com/huggingface/transformers/issues/41208

## Before submitting

- [ ] This...
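A minimal sketch of how a kernel is typically fetched from the kernel-hub with the `kernels` package (the repo id below is hypothetical and not necessarily what this PR wires into bamba):

```python
from kernels import get_kernel

# Hypothetical repo id; the actual causal_conv1d / mamba_ssm kernels used by
# this PR may live under a different kernel-hub namespace.
causal_conv1d = get_kernel("kernels-community/causal-conv1d")
# The returned module exposes the kernel's ops, which the model's forward
# pass can call in place of the pure-PyTorch fallback.
```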
### System Info

- `transformers` version: 4.55.4
- Platform: Linux-5.14.0-284.73.1.el9_2.x86_64-x86_64-with-glibc2.39
- Python version: 3.12.3
- Huggingface_hub version: 0.36.0
- Safetensors version: 0.5.2
- Accelerate version: 1.12.0
- Accelerate config: not...