When debugging in VSCode, the program does not stop at the breakpoints.

Open YasmineXXX opened this issue 1 year ago • 2 comments

Describe the bug

        {
            "name": "Python: debug_cl",
            "type": "debugpy",
            "request": "launch",
            "program": "swift/cli/main.py",
            "console": "integratedTerminal",
            "subProcess": true,
            "justMyCode": false,
            "env": {
                "NPROC_PER_NODE": "4",
                "CUDA_VISIBLE_DEVICES": "0,1,2,3",
                "PYTHONPATH": "./"
            },
            "args": [
                "sft",
                "--sft_type", "lora",
                "--model_type", "internvl2-4b",
                "--custom_train_dataset_path", "/mnt/data/msrvtt/llm_data/msrvtt_cot_3500.jsonl",
                "--resume_from_checkpoint", "/mnt/code/swift_all/output/internvl2-4b/v2-20240827-161031/checkpoint-39827",
                "--resume_only_model", "True",
                "--save_strategy", "epoch",
                "--num_train_epochs", "10",
                "--save_total_limit", "10000",
                "--ddp_find_unused_parameters", "true",
                "--max_length", "4096",
                "--lora_rank", "8",
                "--lora_alpha", "32",
                "--lora_dropout_p", "0.05",
                "--lora_target_modules", "ALL",
                "--gradient_checkpointing", "true",
                "--batch_size", "1",
                "--weight_decay", "0.01",
                "--learning_rate", "5e-5",
                "--save_steps", "3000",
                "--gradient_accumulation_steps", "1",
                "--max_grad_norm", "0.5",
                "--warmup_ratio", "0.03",
                "--dtype", "bf16",
                "--deepspeed", "default-zero2"
            ]
        },

I set a breakpoint inside the llm_sft function in swift/llm/sft.py, but the program never stops there. What could be causing this?

YasmineXXX, Sep 26 '24 08:09

    {
        "name": "torchrun4",
        "type": "python",
        "request": "launch",
        "module": "torch.distributed.run",
        "console": "integratedTerminal",
        "justMyCode": false,
        "args": [
            "--master_port",
            "29510",
            "--nproc_per_node",
            "4",
            "${file}"
        ]
    },
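For reference, a launch entry that uses "module" starts that module with `python -m`, so the configuration above should be roughly equivalent to the command built below. This is only a sketch: the helper name `torchrun_command` and the script name `debug_sft.py` are hypothetical and not part of ms-swift or VS Code.

```python
import shlex
import sys


def torchrun_command(script, nproc_per_node=4, master_port=29510):
    """Build the argv that the launch entry above is expected to run.

    `torchrun_command` is a hypothetical helper, for illustration only.
    """
    return [
        sys.executable, "-m", "torch.distributed.run",
        "--master_port", str(master_port),
        "--nproc_per_node", str(nproc_per_node),
        script,  # VS Code substitutes ${file} with the currently open file
    ]


# Print the command line for a hypothetical script name.
print(shlex.join(torchrun_command("debug_sft.py")))
```

Because `${file}` resolves to the file that is active in the editor, the script to debug has to be the open tab when the configuration is started.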

Jintao-Huang, Sep 26 '24 09:09

You can try running like this.

https://swift.readthedocs.io/zh-cn/latest/Instruction/LLM%E5%BE%AE%E8%B0%83%E6%96%87%E6%A1%A3.html#python

from swift.llm import sft_main, SftArguments

sft_main(SftArguments(...))
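A minimal sketch of what such a script might look like, using a hypothetical file name `debug_sft.py`. The keyword names mirror a few of the CLI flags from the launch.json in the original post. Whether `SftArguments` accepts each of them under that name and with that type (for example, a list for `custom_train_dataset_path`) in your installed version is an assumption to verify.

```python
# debug_sft.py (hypothetical file name)


def build_sft_kwargs():
    """Collect a few settings from the launch.json in the original post.

    Assumption: SftArguments accepts these names as keyword arguments with
    these types in the installed ms-swift version; verify before relying on it.
    """
    return {
        "sft_type": "lora",
        "model_type": "internvl2-4b",
        "custom_train_dataset_path": [
            "/mnt/data/msrvtt/llm_data/msrvtt_cot_3500.jsonl",
        ],
        "max_length": 4096,
        "lora_rank": 8,
    }


def main():
    # Imported lazily so that defining these functions does not load swift.
    from swift.llm import sft_main, SftArguments

    # Called directly from this script rather than through the swift CLI,
    # so breakpoints in swift/llm/sft.py are expected to be hit by the
    # interpreter the debugger started.
    sft_main(SftArguments(**build_sft_kwargs()))
```

Add a call to `main()` at the end of the file, then open the file in the editor and start it with a launch configuration that runs the current file, such as the torchrun4 entry above or a plain single-process one.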

Jintao-Huang, Sep 26 '24 14:09

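First of all, thank you for your excellent work. The documentation link above no longer works. How should I modify the VSCode launch.json file so that I can debug in VSCode?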

Qia98, Dec 25 '24 03:12