HelloWorldBeginner
### Describe the bug

```
Traceback (most recent call last):
  File "train_controlnet_sdxl.py", line 1252, in <module>
    main(args)
  File "train_controlnet_sdxl.py", line 1013, in main
    train_dataset = train_dataset.map(compute_embeddings_fn, batched=True, new_fingerprint=new_fingerprint)
  File "/home/miniconda3/envs/mhh_df/lib/python3.8/site-packages/datasets/arrow_dataset.py", line...
```
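For context on the failing call: `Dataset.map` with `batched=True` hands the mapping function a dict of column-name to list-of-values, and `new_fingerprint` only overrides the cache key. A minimal sketch of that batched contract, assuming a `caption` column; the word-count computation is a hypothetical stand-in for the real SDXL text-encoder call:

```python
# Sketch of the batched-mapping contract used by datasets.Dataset.map:
# the function receives a dict mapping column names to lists of values
# and returns a dict with any new columns added.
def compute_embeddings_fn(batch):
    # Stand-in "embedding": word count per caption (the real script
    # would run the SDXL text encoders here).
    batch["prompt_embeds"] = [len(caption.split()) for caption in batch["caption"]]
    return batch

batch = {"caption": ["a photo of a cat", "a dog"]}
out = compute_embeddings_fn(batch)
```

In the real script this function is what `train_dataset.map(...)` calls once per batch, so any exception it raises surfaces inside `arrow_dataset.py` as in the traceback above.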
# What does this PR do? Adds support for SDXL fine-tuning on Ascend NPU and fixes a bug that caused a hang when saving models under the DeepSpeed distributed framework. DeepSpeed...
[NPU] Support Llava training and inference for Ascend NPU. I've modified some code to add NPU support, allowing LLaVA to perform both training and inference on the NPU. It works...
# What does this PR do? 1. Adds flash attention support for NPU, similar to #7816. 2. Fixes a bug related to saving the model when using DeepSpeed, also...
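The flash-attention hooks in diffusers work by swapping a module's pluggable attention "processor" object. The sketch below imitates that pattern with simplified stand-in classes so it runs anywhere; `AttnProcessor` and `AttnProcessorNPU` here are toy versions, not the real diffusers implementations:

```python
# Toy version of the diffusers attention-processor pattern: each
# attention module delegates to a swappable `processor` callable,
# so an NPU flash-attention variant can be plugged in at runtime.
class AttnProcessor:
    """Stand-in for the default (eager) attention path."""
    def __call__(self, hidden_states):
        return f"eager({hidden_states})"

class AttnProcessorNPU:
    """Stand-in for an NPU flash-attention path."""
    def __call__(self, hidden_states):
        return f"npu_fa({hidden_states})"

class Attention:
    def __init__(self):
        self.processor = AttnProcessor()

    def set_processor(self, processor):
        # Swapping the processor changes the attention backend
        # without touching the module's weights or call sites.
        self.processor = processor

    def __call__(self, hidden_states):
        return self.processor(hidden_states)

attn = Attention()
attn.set_processor(AttnProcessorNPU())
result = attn("hidden_states")
```

The design point is that callers of `attn(...)` never change; only the processor object does, which is why backend-specific attention (NPU, xFormers, etc.) can be added without modifying the model definition.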
I changed the train_step parameter in image_finetune.yaml to 2000 steps, which trains for multiple epochs, but the machine stalls for five minutes at the start of each epoch. ...
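One common cause of a stall at each epoch boundary is dataloader workers being torn down and re-spawned between epochs. A hedged config sketch, assuming the training config exposes dataloader options; only `train_step` appears in the original report, and the dataloader keys below are illustrative, not guaranteed to exist in image_finetune.yaml:

```yaml
# Illustrative fragment; dataloader keys are hypothetical.
train_step: 2000
dataloader:
  num_workers: 8
  persistent_workers: true   # keep workers alive across epochs
  pin_memory: true
```

If the underlying loader is a PyTorch `DataLoader`, `persistent_workers=True` avoids paying the worker start-up cost at every epoch boundary.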
I've modified some code to add support for Ascend NPU, allowing AnimateDiff to train and run inference on the NPU. It works fine on the NPU. (Screenshots: NPU training, NPU inference)
### Question LLaVA is great work. I have adapted LLaVA to Ascend NPU hardware, enabling pre-training, inference, and evaluation on the Ascend NPU. I'm wondering if NPU is also...
Adding **actor_rollout_ref.rollout.mode="async"** in recipe/dapo/run_dapo_qwen2.5_32b.sh produces this error:
```
(AsyncvLLMServer pid=361610) instance_id: 6f66dda9-3270-44cf-823e-6bbb7a51c151:Hotrws:1:0 initializes with external actors: ['HotrwsWorkerDict_0:0']
Error executing job with overrides: ['data.train_files=/home//0723/data/dapo-math-17k.parquet', 'data.val_files=/home//0723/data/aime-2024.parquet', 'data.prompt_key=prompt', 'data.truncation=left', 'data.max_prompt_length=2048', 'data.max_response_length=2048', 'data.gen_batch_size=6', 'data.train_batch_size=2', 'actor_rollout_ref.rollout.n=16', ...
```
### What does this PR do? We tested the partial-rollout feature using the DAPO algorithm with the qwen3-0.6B and qwen2.5-7B models. The blue curve represents the scenario...
# Motivation During reinforcement-learning training, as model performance improves, the output response sequences keep lengthening; especially in slow-thinking mode, the...