nuttx icon indicating copy to clipboard operation
nuttx copied to clipboard

Fix task_setup race condition

Open Gary-Hobson opened this issue 1 year ago • 5 comments

Summary

When a large number of threads are created in SMP, malloc in nxtask_assign_pid may cause thread switching and create new threads. This may lead to accessing invalid pointers.

Impact

Testing

Gary-Hobson avatar Aug 23 '24 08:08 Gary-Hobson

@Gary-Hobson is there some way to trigger the race condition before this patch? I think it could be important to create a ostest avoid this issue happening again in the future

acassis avatar Aug 23 '24 14:08 acassis

@anchao Do you mean leave_critical_section before all malloc/free?

This has been modified, but the functionality of this method seems to have no difference Because of the check, calling malloc/free in the critical section will not have any effect

Gary-Hobson avatar Aug 29 '24 03:08 Gary-Hobson

@anchao Do you mean leave_critical_section before all malloc/free?

This has been modified, but the functionality of this method seems to have no difference Because of the check, calling malloc/free in the critical section will not have any effect

In SMP, holding critical_section and then holding mm_lock may cause ABA or ABBA deadlock. For example, critical_section is used in the lower half of mm implementation:

enter_critical_section(spin_lock)
mm_lock
enter_critical_section(spin_lock) <-- deadlock

Please make sure that there is no habit of repeated locking in the code path, which may cause serious issues.

anchao avatar Aug 29 '24 04:08 anchao

@anchao Do you mean leave_critical_section before all malloc/free? This has been modified, but the functionality of this method seems to have no difference Because of the check, calling malloc/free in the critical section will not have any effect

In SMP, holding critical_section and then holding mm_lock may cause ABA or ABBA deadlock. For example, critical_section is used in the lower half of mm implementation:

enter_critical_section(spin_lock)
mm_lock
enter_critical_section(spin_lock) <-- deadlock

Please make sure that there is no habit of repeated locking in the code path, which may cause serious issues.

Thanks for the explanation, I will pay attention

Gary-Hobson avatar Aug 29 '24 06:08 Gary-Hobson

@Gary-Hobson is there some way to trigger the race condition before this patch? I think it could be important to create a ostest avoid this issue happening again in the future

I tried it, and this case is not easy to add to ostest

It only happens occasionally when a large number of threads are created and destroyed I can't think of any way to make it happen every time without hacking the kernel code

We can reproduce it with lots of stress testing, but it takes a long time

Gary-Hobson avatar Aug 29 '24 07:08 Gary-Hobson