Problems with Type2 interrupts

Open jmbraben opened this issue 4 years ago • 17 comments

We're using Trampoline in a reference design on a Cortex-R7 and are encountering an issue that would seem common to all machines. Our type2 model:

IRQ handler

  • Calls the tpl_central_interrupt_handler(id)
  • Disables the GIC interrupt for this source

Type2 Handler is scheduled

  • Services the hardware (clearing interrupt source)
  • Re-Enables the GIC interrupt for this source
  • terminates

The problem is that if the interrupt source has another event pending when we re-enable the GIC, it immediately interrupts (as it should). But we are still "running" the Type2 handler that tpl_central_interrupt_handler(id) will want to start. When tpl_activate_isr is called, it sees that the ISR is "already running" and just exits. The IRQ returns, the handler is resumed (at its tail) and exits, but is then "done": nothing has asked for it to be scheduled again, and the IRQ handler has disabled this event source in the GIC.

The question is "where to reenable the GIC after type2 ISR finishes?" such that this problem is avoided.

  • The place would seem to be tpl_terminate_isr2_service (after the tpl_terminate(); call)
  • Currently there is no port level "callback" in that (terminate isr2) area (I could obviously make one)
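To make the ordering issue concrete, here is a minimal model in C of the two possible places for the re-enable. It is only a sketch: the sim_t structure and the sim_irq and sim_run functions are hypothetical names made up for illustration and do not exist in Trampoline, and the model assumes a second event arrives while the handler is still running.

```c
#include <assert.h>

/* Hypothetical model of one interrupt source and its Type2 handler. */
typedef struct {
    int source_enabled;  /* GIC enable bit for this source            */
    int source_pending;  /* the hardware has an unserviced event      */
    int isr2_active;     /* handler activated and not yet terminated  */
} sim_t;

/* IRQ entry: taken only when the source is enabled and pending. */
static void sim_irq(sim_t *s)
{
    assert(s != 0);
    if (!s->source_enabled || !s->source_pending) {
        return;
    }
    s->source_enabled = 0;      /* the IRQ handler masks the source */
    if (!s->isr2_active) {
        s->isr2_active = 1;     /* activation accepted */
    }                           /* otherwise the activation is ignored */
}

/* Returns 1 if the second event gets a handler run, 0 if it is stranded. */
static int sim_run(int reenable_after_terminate)
{
    sim_t s = { 1, 1, 0 };

    sim_irq(&s);                /* first event: mask source, activate ISR2 */
    s.source_pending = 0;       /* handler body clears the hardware source */
    s.source_pending = 1;       /* a second event arrives right away */

    if (!reenable_after_terminate) {
        s.source_enabled = 1;   /* re-enable inside the handler */
        sim_irq(&s);            /* fires at once, activation is ignored */
        s.isr2_active = 0;      /* handler terminates */
    } else {
        s.isr2_active = 0;      /* handler terminates first */
        s.source_enabled = 1;   /* then the source is re-enabled */
        sim_irq(&s);            /* fires, activation is accepted */
    }
    return s.isr2_active;
}
```

In this model sim_run(0) leaves the source masked with an event still pending and no handler scheduled, while sim_run(1) ends with the handler activated again.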

Do you see a better solution to avoid this problem?

jmbraben avatar Feb 17 '22 22:02 jmbraben

Ok...I'm seeing that tpl_terminate_isr2_service calls tpl_terminate(); So this means I can use the post_task_hook. We'll move the vector re-enable to the hook and see if that resolves the problem.

jmbraben avatar Feb 18 '22 19:02 jmbraben

Hello,

ISR2s are closer to basic tasks than to ISR1s. They are scheduled like tasks. The interrupt handler shall have the same architecture as the tpl_sc_handler. It must:

  1. Call tpl_central_interrupt_handler with the id of the ISR as argument (r0)
  2. Check the need_switch flag in tpl_kern
  3. If need_switch is 1, save the context of the interrupted task and load the context of the ISR2
  4. Return from the handler

On return the ISR2 starts its execution.
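The four steps can be sketched in C as follows. This is only an illustration of the control flow: a real port does this in assembly, and kern_t, central_interrupt_handler, save_context, load_context and irq_handler here are hypothetical stand-ins, not the actual Trampoline symbols or layouts.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel state; fields are illustrative. */
typedef struct {
    int need_switch;  /* set when a context switch is required      */
    int running;      /* id of the object that currently has the CPU */
    int elected;      /* id of the object chosen to run next         */
    int saved;        /* id whose context was saved last, -1 if none */
} kern_t;

static kern_t kern = { 0, 0, 0, -1 };

/* Stub for step 1: activate the ISR2 and elect it to run. */
static void central_interrupt_handler(int isr_id)
{
    kern.elected = isr_id;
    kern.need_switch = 1;
}

/* Stubs for step 3: a real port saves and restores CPU registers here. */
static void save_context(int id) { kern.saved = id; }
static void load_context(int id) { kern.running = id; }

/* IRQ entry point mirroring the four steps. */
static void irq_handler(int isr_id)
{
    assert(isr_id >= 0);
    kern.need_switch = 0;
    central_interrupt_handler(isr_id);   /* step 1 */
    if (kern.need_switch) {              /* step 2 */
        save_context(kern.running);      /* step 3 */
        load_context(kern.elected);
    }
    /* step 4: return from the handler; the elected ISR2 now runs */
}
```

Calling irq_handler(5) while object 0 is running saves the context of 0 and leaves 5 as the running object.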

Don't hesitate to ask for more information

Best regards

jlbirccyn avatar Feb 18 '22 19:02 jlbirccyn

Thank you. Perhaps you can clarify a point of confusion: This Autosar Document (section 6.2.2 and the figure on page 14) indicates that Category 2 interrupts run in the interrupt context.

The "ISR2"s of Trampoline are basically tasks running in user context (and scheduled by tpl_central_interrupt_handler/tpl_activate_isr), correct?

In the Autosar document, the "ISR/Interrupt Handler" is running in the interrupt context with no real mention of a handler running in user context. Also, they are indicating that the handler runs to completion prior to return from interrupt. The Autosar Cat2 handler would be (maybe) equivalent to the tpl_central_interrupt_handler?

Is there any equivalent in the Autosar documentation to the Trampoline ISR2? Or is ISR2 considered an "implementation detail"? or?

I'm just trying to understand the relationship between the Autosar spec and the Trampoline implementation. Thanks

jmbraben avatar Feb 23 '22 19:02 jmbraben

Hello jlbirccyn,

The problem Jon and I have is this: we define our IRQ handlers as type 2 IRQs in the OIL file, so each handler gets kicked off as a task-like object. When the ISR is kicked off, the corresponding interrupt is disabled at the CPU, but the interrupt is not cleared and is still pending. The IRQ is then cleared during ISR execution. At the end of the IRQ handler, the IRQ is re-enabled at the CPU and the ISR is terminated.

If a new IRQ is issued during execution of the ISR, it is not taken because the interrupt is disabled at the CPU at that moment. Right after the interrupt is re-enabled, it fires again. It tries to schedule the ISR again, but the ISR is not fully terminated yet, because the IRQ is issued very soon after the interrupt is re-enabled. All category 2 ISRs have a max activation count of 1, which means they cannot be activated again while they are running, so the kernel simply ignores the new interrupt and exits.

Setting ACTIVATION = 2 for the ISR would solve the problem, but according to the OIL 2.5 specification it does not seem possible to set this for an ISR. We have to re-enable the IRQ outside of the handler, e.g. in tpl_terminate in CALL_POST_TASK_HOOK(), but it is not clear whether that is late enough, as the task state is not yet suspended at this point.
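The effect of the activation limit can be shown with a small counting model in C. This is a sketch under assumptions: isr2_model, model_activate, model_terminate and model_replay are hypothetical names, and the structure is not Trampoline's real descriptor layout. It replays one IRQ followed by a second IRQ that fires before the handler has terminated.

```c
#include <assert.h>

/* Hypothetical model of an ISR2 descriptor. */
typedef struct {
    int activate_count;      /* activations pending or running      */
    int max_activate_count;  /* 1 today, 2 with ACTIVATION = 2      */
} isr2_model;

/* Returns 1 if the activation was recorded, 0 if it was dropped. */
static int model_activate(isr2_model *isr)
{
    assert(isr != 0);
    if (isr->activate_count >= isr->max_activate_count) {
        return 0;  /* already at the limit: the new IRQ is ignored */
    }
    isr->activate_count++;
    return 1;
}

/* Returns 1 if the ISR2 must run again after terminating. */
static int model_terminate(isr2_model *isr)
{
    assert(isr != 0 && isr->activate_count > 0);
    isr->activate_count--;
    return isr->activate_count > 0;
}

/* Replays the scenario and returns how many times the handler runs. */
static int model_replay(int max_activate_count)
{
    isr2_model isr = { 0, max_activate_count };
    int runs = 0;

    if (model_activate(&isr)) {      /* first IRQ: handler starts */
        runs++;
    }
    (void)model_activate(&isr);      /* second IRQ, handler still running */
    while (model_terminate(&isr)) {  /* a queued activation runs again */
        runs++;
    }
    return runs;
}
```

With a limit of 1 the handler runs once and the second event is lost; with a limit of 2 it runs twice.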

I checked the trampoline manual but it has no proper recommendation about this.

The AUTOSAR specification states that re-enabling the IRQ must be done at OS level and not at ISR level.

Being able to set the ISR max activation count to 2 would solve the problem, but OIL 2.5 does not specify it, so I do not see a clean way to do this. The problem does not occur if IRQs never arrive in very fast succession.

Maybe you have some idea how to properly solve the problem.

BR Sven Grundmann

grundmanns avatar Mar 08 '22 16:03 grundmanns

Hello, I just checked: tpl_terminate has IRQs disabled during the whole function. This means enabling the one IRQ in the post task hook is a clean option.

BR Sven Grundmann

grundmanns avatar Mar 08 '22 19:03 grundmanns

I cannot enable the IRQ in the PostTaskHook() because GetISRID is not working. I saw that it is implemented, but it is not added to the project's system call dispatcher. It seems to be some kind of extended system call. Therefore I cannot determine the current ID in an ISR, and so cannot enable the correct IRQ in the PostTaskHook(). Do I have to set something in the OIL file to enable the extended system call?

grundmanns avatar Mar 11 '22 22:03 grundmanns

I solved the problem by patching the ISR goil template to give the ISR a maximum activation count of 2. Then, when an IRQ arrives during ISR execution, the ISR is rescheduled for processing:

    diff --git a/goil/templates/config/config.oil b/goil/templates/config/config.oil
    index 5d2d057..8f97c4b 100755
    --- a/goil/templates/config/config.oil
    +++ b/goil/templates/config/config.oil
    @@ -261,6 +261,7 @@ IMPLEMENTATION trampoline_common {
         UINT32 [1, 2] CATEGORY;
         UINT32 PRIORITY;
         /* Trampoline extra */
         RESOURCE_TYPE RESOURCE[];
    +    UINT32 ACTIVATION = 2;
         MESSAGE_TYPE MESSAGE[];
       };
    diff --git a/goil/templates/code/isr_descriptor.goilTemplate b/goil/templates/code/isr_descriptor.goilTemplate
    index 586efef..81394e1 100755
    --- a/goil/templates/code/isr_descriptor.goilTemplate
    +++ b/goil/templates/code/isr_descriptor.goilTemplate
    @@ -84,7 +84,7 @@ if OS::NUMBER_OF_CORES > 1 then%
     %
       /* ISR base priority */
       % !isr::PRIORITY %,
    -  /* ISR activation count */ 1,
    +  /* ISR activation count */ % !isr::ACTIVATION %,
       /* ISR type */
       IS_ROUTINE,
     #if WITH_AUTOSAR_TIMING_PROTECTION == YES
     %

grundmanns avatar Mar 11 '22 22:03 grundmanns

Hello,

Sorry for not being more responsive but I'm swamped with work at the moment.

A different way to solve the problem would be to generate a function to acknowledge the interrupt. It would be called in the tpl_terminate_isr service. A table of function pointers, one per ISR2, would be generated too and would be indexed by the ISR id minus TASK_COUNT to retrieve the acknowledge function.

jlbirccyn avatar Mar 14 '22 11:03 jlbirccyn

GetISRID is available in AUTOSAR

jlbirccyn avatar Mar 14 '22 11:03 jlbirccyn

Hello jlbirccyn, thank you for your responses. You mean extending FUNC(tpl_status, OS_CODE) tpl_terminate_isr2_service(void) with a table of ISR2 acknowledge functions. This would solve the problem exactly, as long as tpl_terminate_isr2_service and tpl_terminate cannot be interrupted once the interrupt is enabled: I can see that the pending IRQ is taken 9 clocks after the IRQ is re-enabled. I saw that tpl_terminate has IRQs disabled, I guess because it is part of the kernel, so I guess the same holds in tpl_terminate_isr2_service, as it is the function that calls tpl_terminate and is part of the system call.

I am trying to check how a professional AUTOSAR implementation solves this problem, as it will have the same problem too.

Would you encourage us to modify tpl_terminate_isr2_service for that? I wanted to be as unintrusive as possible, which is why I wanted to use the PostTaskHook.

Unfortunately I could not use GetISRID. When I check tpl_invoque.S in my build directory, it does not contain it. Do I have to activate it with some command in the OIL file?

    egrep -nR define tpl_service_ids.h /dev/null
    tpl_service_ids.h:28:#define TPL_SERVICES_IDS_H
    tpl_service_ids.h:42:#define OSServiceId_ResumeOSInterrupts 0
    tpl_service_ids.h:50:#define OSServiceId_EnableAllInterrupts 1
    tpl_service_ids.h:58:#define OSServiceId_SuspendOSInterrupts 2
    tpl_service_ids.h:66:#define OSServiceId_DisableAllInterrupts 3
    tpl_service_ids.h:74:#define OSServiceId_CallTerminateISR2 4
    tpl_service_ids.h:82:#define OSServiceId_SuspendAllInterrupts 5
    tpl_service_ids.h:90:#define OSServiceId_CallTerminateTask 6
    tpl_service_ids.h:98:#define OSServiceId_ResumeAllInterrupts 7
    tpl_service_ids.h:106:#define OSServiceId_Schedule 8
    tpl_service_ids.h:114:#define OSServiceId_ReleaseResource 9
    tpl_service_ids.h:122:#define OSServiceId_ShutdownOS 10
    tpl_service_ids.h:130:#define OSServiceId_SetEvent 11
    tpl_service_ids.h:138:#define OSServiceId_ActivateTask 12
    tpl_service_ids.h:146:#define OSServiceId_ClearEvent 13
    tpl_service_ids.h:154:#define OSServiceId_ChainTask 14
    tpl_service_ids.h:162:#define OSServiceId_GetEvent 15
    tpl_service_ids.h:170:#define OSServiceId_GetTaskID 16
    tpl_service_ids.h:178:#define OSServiceId_WaitEvent 17
    tpl_service_ids.h:186:#define OSServiceId_StartOS 18
    tpl_service_ids.h:194:#define OSServiceId_GetAlarmBase 19
    tpl_service_ids.h:202:#define OSServiceId_GetActiveApplicationMode 20
    tpl_service_ids.h:210:#define OSServiceId_GetAlarm 21
    tpl_service_ids.h:218:#define OSServiceId_CancelAlarm 22
    tpl_service_ids.h:226:#define OSServiceId_SetRelAlarm 23
    tpl_service_ids.h:234:#define OSServiceId_GetResource 24
    tpl_service_ids.h:242:#define OSServiceId_SetAbsAlarm 25
    tpl_service_ids.h:250:#define OSServiceId_GetTaskState 26
    tpl_service_ids.h:258:#define OSServiceId_TerminateTask 27
    tpl_service_ids.h:265:#define SYSCALL_COUNT 28
    tpl_service_ids.h:266:#define SYSCALL_COUNT_ISR1 8

BR Sven Grundmann

grundmanns avatar Mar 15 '22 12:03 grundmanns

Hello grundmanns

Yes, this is what I mean and I welcome external help 👍 I can do the table generation.

I confirm the *_service functions are executed in kernel mode. At this point the interrupt priority is set at the system call handler priority so ISR2 cannot interrupt the kernel (as long as hardware priorities of ISR2s are ≤)

I will check the problem with GetISRID but not before Thursday

Best regards

Jean-Luc Béchennec

jlbirccyn avatar Mar 15 '22 13:03 jlbirccyn

Hello Jean-Luc,

thank you for this offer. It would be similar to the ClearFlag function, except that the ClearFlag function is called after the first-level IRQ handler, whereas the ISR2 ClearFlag would be called after the ISR2 has finished. This would clearly solve the problem.

Right now I have solved it by modifying config.oil and the ISR template to accept a max activation count of 2 for an ISR. Another ISR2 execution is then queued if an IRQ happens during ISR2 execution. Unfortunately OIL 2.5 does not cover this, so I guess it is not clean. I prefer a clean solution, which makes your proposal attractive to me.

This means you would generate a table containing the end-of-ISR2 functions, and I would just have to populate it by defining the functions. This would create an incompatibility for people not using this feature, right? So it would have to be optional and deactivated by default, right? If a dedicated function is called for every ISR, then I do not need access to the GetISRID function, as the function would already know which thread it belongs to.

I am checking in parallel how this is handled by other AUTOSAR solutions.

BR Sven Grundmann

grundmanns avatar Mar 15 '22 15:03 grundmanns

Hello Sven.

This morning I did the following:

  1. Created a new branch called isr_ack. It is dedicated to your modifications, so make them on this branch and open a pull request at the end.
  2. Added a boolean attribute to the OIL OS object. This attribute is ACK_ISR2_ON_TERMINATE and is FALSE by default. On the C side, there is now a define, WITH_ACK_ISR2_ON_TERMINATE, which can be YES or NO. Writing ACK_ISR2_ON_TERMINATE = TRUE; in the OS object of the OIL file will generate a #define WITH_ACK_ISR2_ON_TERMINATE YES.
  3. If ACK_ISR2_ON_TERMINATE is TRUE, a table called tpl_isr_ack_table is generated in tpl_app_config.c file. To access this table it is necessary to use an index which is the identifier of the ISR minus TASK_COUNT. Each element of the table is a function pointer to the ack function that shall be generated or written (I can help too). The ack function of an ISR2 named dummy is named dummy_ack_function.
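As a rough sketch of how such a table could be used, the C fragment below indexes it with the ISR id minus TASK_COUNT and calls the ack function. Several things here are assumptions made for illustration: the values of TASK_COUNT and ISR_COUNT, the isr_ack_fn type, the second ISR2 named other, the counters, and the ack_isr2 helper. The table in the generated tpl_app_config.c may be declared differently.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values; the generated configuration defines the real ones. */
#define TASK_COUNT 3
#define ISR_COUNT  2

/* Assumed element type of the table. */
typedef void (*isr_ack_fn)(void);

/* Counters used only to observe the calls in this sketch. */
static int dummy_acked = 0;
static int other_acked = 0;

/* Ack functions named <isr name>_ack_function. A real one would
   re-enable the interrupt source, for example in the GIC. */
static void dummy_ack_function(void) { dummy_acked++; }
static void other_ack_function(void) { other_acked++; }

/* Stand-in for the generated table. */
static const isr_ack_fn tpl_isr_ack_table[ISR_COUNT] = {
    dummy_ack_function,
    other_ack_function
};

/* Hypothetical helper that the terminate service could call. */
static void ack_isr2(int isr_id)
{
    int index = isr_id - TASK_COUNT;  /* ISR ids follow the task ids */

    assert(index >= 0 && index < ISR_COUNT);
    if (tpl_isr_ack_table[index] != NULL) {
        tpl_isr_ack_table[index]();
    }
}
```

With TASK_COUNT set to 3, ack_isr2(3) calls dummy_ack_function and ack_isr2(4) calls other_ack_function.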

Feel free to ask for more information

jlbirccyn avatar Mar 17 '22 11:03 jlbirccyn

Hello Jean-Luc, we mirrored your branch onto our GitLab server. I will check out your solution during the week. Right now I am trying to find out how this ISR2 problem is handled in a commercial AUTOSAR solution, as I want to have a similar solution. We ported our software to Trampoline to be able to support customers that use a commercial AUTOSAR solution and to give them an easy way to port it.

We use the cortex-r machine as a basis, and I modified it to be able to use OS-aware debugging and tracing with ORTI file generation using the Lauterbach debugger. It is not 100% done yet and needs some more work from me.

If the legal aspect is OK, I would contribute the changes to support Lauterbach using the ORTI file back to you. I also patched the kernel to support the OTM trace protocol. This means the process id is written to the context id register and passed to the debugger without using data trace.

BR Sven Grundmann

grundmanns avatar Mar 23 '22 11:03 grundmanns

Hello Sven

Thanks for the contribution. Do a pull request when everything is ok on your side.

Best regards

Jean-Luc Béchennec

jlbirccyn avatar Mar 29 '22 07:03 jlbirccyn

Hello Jean-Luc, I had a discussion with a professional AUTOSAR vendor. In their system the ISR2 is not a task scheduled by the OS; the CPU IRQ is disabled by the OS before the ISR2 handler runs and re-enabled by the OS afterwards. This means that if our software is ported to this professional system there would be no problem. Therefore I have postponed the PostISR2 solution for the time being and simply set the activation count to 2 for all ISR2s in the goil template.

I still have one question regarding this topic, which I asked before. Why is GetISRID not imported into our project? It seems to be there in the source code, but it is not in the auto-generated system call table of our project. Do I have to activate it somehow with some OIL instruction?

BR Sven Grundmann

grundmanns avatar Apr 21 '22 12:04 grundmanns

Hello Jean-Luc, I found the answer myself. I have to set OIL_VERSION="4.0" instead of 2.5 and set SCALABILITYCLASS=SC4. Unfortunately SC3 and SC4 did not compile for me at first; I had to introduce an application containing all tasks, and then it worked. I still have to work out how to enable the other APIs, such as the message API.

grundmanns avatar Apr 21 '22 17:04 grundmanns