Fix task_setup race condition
Summary
When a large number of threads are created in SMP, malloc in nxtask_assign_pid may cause thread switching and create new threads. This may lead to accessing invalid pointers.
Impact
Testing
@Gary-Hobson is there some way to trigger the race condition before this patch? I think it could be important to create a ostest avoid this issue happening again in the future
@anchao Do you mean leave_critical_section before all malloc/free?
This has been modified, but the functionality of this method seems to have no difference Because of the check, calling malloc/free in the critical section will not have any effect
@anchao Do you mean leave_critical_section before all malloc/free?
This has been modified, but the functionality of this method seems to have no difference Because of the check, calling malloc/free in the critical section will not have any effect
In SMP, holding critical_section and then holding mm_lock may cause ABA or ABBA deadlock. For example, critical_section is used in the lower half of mm implementation:
enter_critical_section(spin_lock)
mm_lock
enter_critical_section(spin_lock) <-- deadlock
Please make sure that there is no habit of repeated locking in the code path, which may cause serious issues.
@anchao Do you mean leave_critical_section before all malloc/free? This has been modified, but the functionality of this method seems to have no difference Because of the check, calling malloc/free in the critical section will not have any effect
In SMP, holding critical_section and then holding mm_lock may cause
ABAorABBAdeadlock. For example, critical_section is used in the lower half of mm implementation:enter_critical_section(spin_lock) mm_lock enter_critical_section(spin_lock) <-- deadlockPlease make sure that there is no habit of repeated locking in the code path, which may cause serious issues.
Thanks for the explanation, I will pay attention
@Gary-Hobson is there some way to trigger the race condition before this patch? I think it could be important to create a ostest avoid this issue happening again in the future
I tried it, and this case is not easy to add to ostest
It only happens occasionally when a large number of threads are created and destroyed I can't think of any way to make it happen every time without hacking the kernel code
We can reproduce it with lots of stress testing, but it takes a long time