[BUG] max_grad_norm has no effect
Describe the bug
In the DeepSpeed config, gradient clipping is set to `auto` and `max_grad_norm` is set to 1.0, but it has no effect. The DeepSpeed version is 0.14.5; after switching to 0.15.3 and 0.15.4, the same problem remains. I use Firefly SFT as the training repo. A sketch of the kind of configuration involved is shown after the reproduction steps below.
To Reproduce
Steps to reproduce the behavior:
- Go to '...'
- Click on '....'
- Scroll down to '....'
- See error
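For context, here is a minimal sketch of the kind of setup being described, assuming the HuggingFace Trainer integration (which is expected to fill the `auto` placeholder in `gradient_clipping` from `max_grad_norm`); the exact keys and values are illustrative, not copied from the actual Firefly config:

```python
# Hypothetical minimal setup, not the actual Firefly configuration.
from transformers import TrainingArguments

ds_config = {
    "gradient_clipping": "auto",   # expected to resolve to max_grad_norm (1.0 here)
    "zero_optimization": {"stage": 2},
    "train_micro_batch_size_per_gpu": "auto",
}

args = TrainingArguments(
    output_dir="out",
    max_grad_norm=1.0,             # the value the "auto" placeholder should pick up
    deepspeed=ds_config,           # a path to a ds_config.json also works
)
```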
Expected behavior
Gradient clipping should take effect, i.e. gradients should be clipped to the configured `max_grad_norm` of 1.0.
ds_report output
Please run ds_report to give us details about your setup.
Screenshots If applicable, add screenshots to help explain your problem.
System info (please complete the following information):
- OS: [e.g. Ubuntu 18.04]
- GPU count and types [e.g. two machines with x8 A100s each]
- Interconnects (if applicable) [e.g., two machines connected with 100 Gbps IB]
- Python version
- Any other relevant info about your setup
Launcher context
Are you launching your experiment with the deepspeed launcher, MPI, or something else?
Docker context Are you using a specific docker image that you can share?
Additional context Add any other context about the problem here.
It seems that gradient clipping is not applied at all.
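One way to narrow this down is to check what value the engine actually resolved for clipping. A minimal sketch, assuming the engine returned by `deepspeed.initialize` exposes the `gradient_clipping()` config accessor (availability may vary by version) and that the script is launched with the `deepspeed` launcher:

```python
import torch
import deepspeed

# Tiny stand-in model; the real run uses the Firefly SFT model.
model = torch.nn.Linear(8, 8)

ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "gradient_clipping": 1.0,  # set explicitly to rule out the "auto" mapping
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
}

engine, _, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

# Should print 1.0; a value of 0.0 means clipping is effectively disabled.
print("resolved gradient_clipping:", engine.gradient_clipping())
```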
Could someone look into this? I'm facing the same issue.
Any progress? Same problem here.
Or is there any other way to clip gradients?
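As a possible workaround, plain PyTorch clipping can be applied between backward and step. This is only a sketch (`model`, `optimizer`, and `dataloader` are placeholders), and it is only meaningful when gradients are not ZeRO-partitioned across ranks; with ZeRO stage 2/3 the per-rank norm computed this way would be incomplete, so the engine's built-in clipping is preferable there:

```python
import torch

max_grad_norm = 1.0  # the norm max_grad_norm was meant to enforce

for batch in dataloader:
    loss = model(**batch).loss
    loss.backward()

    # Clip the global L2 norm of all gradients in place before the optimizer step.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)

    optimizer.step()
    optimizer.zero_grad()
```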
Hello @yiyepiaoling0715 @chengmengli06, have you solved this?
Same problem.