Maze
Because amp requires trainable parameters to be `torch.float32`. The LoRA module parameters are `torch.float32`, but the parameters covered by `modules_to_save='embed_tokens,lm_head'` are initialized as `torch.float16` by `from_pretrained` while also receiving amp gradient updates, hence the error.

Solutions:

1. For a llama model, manually cast the `embed_tokens` and `lm_head` layers to `torch.float32` (see the sketch after the code below).
2. For any model, iterate over the parameters and manually cast every parameter with `requires_grad` to `torch.float32`:

```python
model.print_trainable_parameters()
logger.info(f"model.modules_to_save: {model.modules_to_save}")
# Collect trainable parameters that are not float32 (amp would reject them).
trainable_not_float32 = [name for name, param in model.named_parameters()
                         if param.requires_grad and param.dtype != torch.float]
if trainable_not_float32:
    logger.info(f"casting trainable params to float32: {trainable_not_float32}")
    for _, param in model.named_parameters():
        if param.requires_grad and param.dtype != torch.float:
            param.data = param.data.to(torch.float)
```
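A minimal sketch of solution 1, assuming the generic transformers accessors `get_input_embeddings()`/`get_output_embeddings()` resolve to llama's `embed_tokens` and `lm_head` (attribute paths differ across models and PEFT wrappers, so this is illustrative rather than the canonical fix):

```python
import torch

def cast_llama_io_to_float32(model):
    """Upcast only the embedding layer and the output head to float32,
    leaving the rest of the (possibly float16) model untouched."""
    model.get_input_embeddings().to(torch.float32)
    model.get_output_embeddings().to(torch.float32)
    return model
```

Using the accessors instead of hard-coding `model.model.embed_tokens` keeps the helper usable on other architectures with the same layout.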
```python
model = AutoModel.from_pretrained(
    MODEL_PATH,
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map='auto',
)
model = prepare_model_for_int8_training(model)
```

The reason is simple; the example above illustrates it. `torch_dtype=torch.float16` means the model weights are loaded in `torch.float16`,...
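To see how these flags interact in practice, one quick check (a sketch, reusing the `model` from the snippet above) is to tally parameter dtypes after loading: the int8-quantized linear weights, the `torch.float16` weights selected by `torch_dtype`, and any layers upcast to `torch.float32` by `prepare_model_for_int8_training` each show up as their own bucket:

```python
from collections import Counter

# Count how many parameter tensors are stored in each dtype.
dtype_counts = Counter(str(p.dtype) for p in model.parameters())
for dtype, count in sorted(dtype_counts.items()):
    print(f"{dtype}: {count} tensors")
```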
> It is hard to say and may take some time. Please stay tuned ;D

Hello, can you provide the training code for LayoutTransformer? I found that the text bbox...
> Thanks for your attention to TextDiffuser. [x1, y1, x2, y2] denotes the coords of the top-left and bottom-right points, which belong to the minimum horizontal rectangle of the 4 point...
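As a small illustration of that convention (a sketch; the function and variable names are hypothetical, not from the TextDiffuser code), the [x1, y1, x2, y2] box is simply the axis-aligned bounding rectangle of the four polygon points:

```python
def quad_to_box(quad):
    """Convert 4 corner points [(x, y), ...] to the minimum
    horizontal rectangle [x1, y1, x2, y2] enclosing them."""
    xs = [x for x, _ in quad]
    ys = [y for _, y in quad]
    return [min(xs), min(ys), max(xs), max(ys)]

# e.g. a slightly rotated quadrilateral
print(quad_to_box([(10, 5), (50, 8), (48, 30), (8, 27)]))  # [8, 5, 50, 30]
```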
> What OS are you running with the Ascend chips? When I execute `pip install -e .`, the decord installation fails with a message that there is no aarch64 build. Do I need to compile it manually?

The decord package has no prebuilt wheel for the ARM platform; it has to be built from source.
It's deliberate. Pseudo open source: you can't verify how well the algorithm works anyway.
> @Doctor-L-end - thanks for contacting us with your feedback. Based on the issue you submitted, I believe this translates to: "I personally feel that the packaging degree of...
> * The artifact issue was because of a bug in the Diffusers training scripts, which should have been addressed in cog-factory by now

@a-r-r-o-w hi, can you tell us...
> Hey, yes I do! We worked together with Yuxuan from the CogVideoX team here: https://github.com/a-r-r-o-w/cogvideox-factory
>
> * 50+ videos is great for finetuning. I generally use ~200 for...