This is based on the current CAME optimizer and the reference code. Paper: https://arxiv.org/html/2501.18427v3 This was requested in issue #746. I've done some quick tests myself, but ideally more people...
### What happened?
Running the latest PyTorch nightly produces a warning in the OneTrainer `AdditionalEmbeddingWrapper.py` file at the start of training. It looks like a new runtime warning in the...
tl;dr: bitsandbytes is quantizing a `nn.NonDynamicallyQuantizableLinear` output in JoyCaption. We revert it to a plain `nn.Linear` (as `nn.NonDynamicallyQuantizableLinear` is not a constructable type). We also set the dtype in a...
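A minimal sketch of the kind of swap described above, assuming a recursive walk over the model's submodules; the helper name `replace_non_quantizable_linear` is illustrative and not part of the actual patch:

```python
from torch import nn
from torch.nn.modules.linear import NonDynamicallyQuantizableLinear


def replace_non_quantizable_linear(module: nn.Module) -> None:
    # Walk the module tree and swap any NonDynamicallyQuantizableLinear
    # for an equivalent plain nn.Linear that reuses the same parameters,
    # so bitsandbytes no longer tries to quantize it.
    for name, child in list(module.named_children()):
        if isinstance(child, NonDynamicallyQuantizableLinear):
            linear = nn.Linear(
                child.in_features,
                child.out_features,
                bias=child.bias is not None,
                dtype=child.weight.dtype,
                device=child.weight.device,
            )
            # Reuse the existing parameters instead of copying them.
            linear.weight = child.weight
            if child.bias is not None:
                linear.bias = child.bias
            setattr(module, name, linear)
        else:
            replace_non_quantizable_linear(child)
```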
Adds basic support for InternVL2 models.
### New
Added support for a minimum character-tag probability. If a character tag's probability is less than this value, it will be discarded. Set to `1.01` to disable character tags...
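A rough illustration of the thresholding behaviour; the function and parameter names below are made up for the example and are not the actual option names:

```python
def filter_character_tags(tags: dict[str, float], min_character_probability: float) -> dict[str, float]:
    # Drop character tags whose predicted probability falls below the
    # configured minimum. A threshold above 1.0 (e.g. 1.01) can never be
    # met, which effectively disables character tags entirely.
    return {
        tag: prob
        for tag, prob in tags.items()
        if prob >= min_character_probability
    }
```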
This makes changes to the settings' `target_resolution` field force the use of different cache keys for `image` latents. It fixes the case where changing the base settings' `target_resolution` (not the per-concept one)...
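A sketch of the general idea, assuming a hash-based cache key; the function and field names here are illustrative and do not reflect OneTrainer's actual caching code:

```python
import hashlib


def latent_cache_key(image_path: str, target_resolution: str) -> str:
    # Fold the effective target resolution into the cache key so that
    # changing it in the base settings invalidates previously cached
    # `image` latents instead of silently reusing stale ones.
    raw = f"{image_path}|target_resolution={target_resolution}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```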
From what I can tell, the parameter state `RMS` is only used by Adafactor's [Adaptive Learning Rate](https://github.com/huggingface/transformers/blob/c6ee0b1da8ff57102548430e18480fa78a106022/src/transformers/optimization.py#L729-L730) feature, which is not supported by `CAME`. Can we remove this?
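For context, a paraphrased sketch of what the linked Adafactor code does with the stored `RMS` (this is an approximation of the transformers implementation, not a verbatim copy):

```python
import torch


def _rms(tensor: torch.Tensor) -> torch.Tensor:
    # Root-mean-square of the parameter tensor.
    return tensor.norm(2) / (tensor.numel() ** 0.5)


def adafactor_param_scaled_lr(param: torch.Tensor, base_lr: float, eps2: float = 1e-3) -> torch.Tensor:
    # Adafactor's adaptive/relative learning rate scales the step size by the
    # parameter RMS (clamped from below by eps2). CAME uses a fixed,
    # user-supplied learning rate, so a stored per-parameter RMS would have
    # no consumer there.
    param_scale = torch.clamp(_rms(param), min=eps2)
    return param_scale * base_lr
```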