notification for a switch context without verifying the need_switch
Directly after calling the function “tpl_schedule_from_running()” in the function “tpl_activate_task_service()”, we call the macro “SWITCH_CONTEXT” without verifying the need_switch (line 91 in tpl_os_task_kernel.c).
In case of multicore, calling this macro leads towards “tpl_send_intercore_it” function (line 403 in tpl_os_kernel.h). Therefore, we risk sending an interrupt for context switch to another core without necessarily needing to switch the context, since we don’t verify the need_switch boolean before.
Hello !
All the following definitions are located in the os/tpl_os_kernel.h, lines 333 to 413. Here's a short explanation of the process.
Monocore
The macro SWITCH_CONTEXT is expansed to LOCAL_SWITCH_CONTEXT (tpl_os_kernel.h:369), which is then always expansed to :
if(TPL_KERN(core_id).need_switch != NO_NEED_SWITCH)
{
/* Do the switch context */
}
Multicore
It's a bit more tricky but still follows the same logic.
In the SWITCH_CONTEXT macro, we're checking if we want to do the context switch in the calling core or in another core with the following test :
if( a_core_id == tpl_get_core_id)
If it is the calling core, we're calling LOCAL_SWITCH_CONTEXT (same behaviour as in monocore). If not, we're calling the macro REMOTE_SWITCH_CONTEXT which will do the following :
if(TPL_KERN(core_id).need_switch != NO_NEED_SWITCH)
{
/* Send the intercore interrupt */
}
Either way, the need_switch flag is always verified. I don't have the tools to test this here, but you may try this :
Core 0 :
Task0(AutoStart) :
while(1)
{
ActivateTask(Task1);
}
Core1 :
Task1
while(1);
Then by checking the interrupts, the core 0 should not send more than one intercore interrupt as the core 1 is always in the Task1's priority => no need switch should stay at 0. Check ActivateTask's return value and enable OS_EXTENDED mode so you're sure the task activation has been done properly.
Edit : Ah sorry, didn't see the WITH_SYTEM_CALL test !
When the SYSTEM_CALL flag is enabled, the check of the need_switch variable is done at the end of the syscall/interrupt handler. So in fact, the intercore interrupt is always done even if a contex switch is not needed.
Instead of calling tpl_send_intercore_it(a_core_id), we could just call the REMOTE_SWITCH_CONTEXT definition :
400 /* WITH_SYSTEM_CALL == YES */
401
402 #define SWITCH_CONTEXT(a_core_id) \
403 if (a_core_id != tpl_get_core_id()) { \
404 REMOTE_SWITCH_CONTEXT(a_core_id); \
405 }
406 #define SWITCH_CONTEXT_NOSAVE(a_core_id) \
407 if (a_core_id != tpl_get_core_id()) { \
408 REMOTE_SWITCH_CONTEXT(a_core_id); \
409 }
410
Good call :)
Yes, you were right ! I've just updated the answere above.
In fact, when the system calls are enabled, the checking of the need_switch flag is done, but way later, before returning from the interrupt / syscall (In powerpc : file machines/ppc/multicore/tpl_it_handler.s:256 for instance). So there is in fact a performance issue caused by the excessive intecore interrupts in muticore you had pointed out.
The code i put above should resolve the issue, however better launch the multicore tests before pushing (and i don't have the tools to do so here !).
Hello,
Thank you for the detailed explanation. It seems logical now :).
Actually I thought that we dont check the need_switch flag before, because I was examining the definition of SWITCH_CONTEXT macro in case of WITH_SYSTEM_CALL == YES (tpl_os_kernel.h line 400). In fact, I just verified this file, and we call directly tpl_send_intercore_it(a_core_id);
`#if WITH_SYSTEM_CALL == NO
#define SWITCH_CONTEXT(a_core_id)
if (a_core_id == tpl_get_core_id()) {
LOCAL_SWITCH_CONTEXT(a_core_id)
}
else {
REMOTE_SWITCH_CONTEXT(a_core_id);
}
#define SWITCH_CONTEXT_NOSAVE(a_core_id)
if (a_core_id == tpl_get_core_id()) {
LOCAL_SWITCH_CONTEXT(a_core_id)
}
else {
REMOTE_SWITCH_CONTEXT(a_core_id);
}
#else /* WITH_SYSTEM_CALL == YES */
#define SWITCH_CONTEXT(a_core_id)
if (a_core_id != tpl_get_core_id()) {
tpl_send_intercore_it(a_core_id);
}
#define SWITCH_CONTEXT_NOSAVE(a_core_id)
if (a_core_id != tpl_get_core_id()) {
tpl_send_intercore_it(a_core_id);
}
#endif`
Okey, sorry I just saw the update. Thank you for your answear :D
The problem seems not to be a problem ☺️. Il close the issue.