Tongping Liu

Results 10 issues of Tongping Liu

I can't make it work even though I strictly followed the guidelines in the README.

I noticed that ColossalAI provides a few optimizers, such as 'FusedLAMB', 'FusedAdam', 'FusedSGD', 'Lamb', 'Lars', 'CPUAdam', and 'HybridAdam'. These optimizers shard optimizer states based on the size of parameters and gradients. My...

**Describe the bug** When gradient_accumulation_steps is set to 1, __reduce_and_partition_ipg_grads should be invoked at the end of each step to reduce the remaining gradients in...

bug
compression

It just gets stuck in the rollback phase.

Currently, some semaphores are not cleaned up if we have to roll back. We added thread.finalize() to the finalize() function. However, finalize() is not called in the rollback phase....

Currently, pthread_cancel invokes a commit first. Then it removes itself from the alive threads. We need to think more about this.

It fails with a segmentation fault.

I don't know why it can't roll back Dedup. It just gets stuck somewhere in Dedup's rollback phase.

We can't replay the test case for the dining philosophers problem: the program gets stuck without making any progress in the replay phase. The stuck behavior can be changed by...

FILL IN THE PR DESCRIPTION HERE FIX #xxxx (*link existing issues this PR will resolve*) **BEFORE SUBMITTING, PLEASE READ THE CHECKLIST BELOW AND FILL IN THE DESCRIPTION ABOVE** --- PR...