Crash when using `FunctionalInterrupt.h` handler
Basic Infos
- [x] This issue complies with the issue POLICY doc.
- [x] I have read the documentation at readthedocs and the issue is not addressed there.
- [ ] I have tested that the issue is present in current master branch (aka latest git).
- [x] I have searched the issue tracker for a similar issue.
- [x] If there is a stack dump, I have decoded it.
- [x] I have filled out all fields below.
Platform
- Hardware: ESP-01, but it can also be reproduced on other devices (e.g. ESP-12)
- Core Version: 3.1.1
- Development Env: Sloeber IDE, but it can also be reproduced with the Arduino IDE
- Operating System: [Windows|Ubuntu|MacOS]
Settings in IDE
- Module: Generic ESP8266 Module
- Flash Mode: DOUT
- Flash Size: 1 MB
- lwip Variant: v2 Lower Memory
- Reset Method: dtr
- Flash Frequency: 40 MHz
- CPU Frequency: 80 MHz
- Upload Using: SERIAL
- Upload Speed: 921600
Problem Description
If I use an interrupt handler that triggers frequently, the program crashes after some time. After the ESP resets following the exception, it can no longer establish a Wi-Fi connection.
To reproduce, I connected an oscillator to the pin where the interrupt is attached. I use GPIO 2 in this example, which is available on an ESP-01, but I could reproduce it with different pins, for example GPIO 3 (AKA serial RXD), or on an ESP-12 with GPIO 5. I run the oscillator at about 20 Hz. The problem also occurs at lower frequencies, but it may take longer to appear.
MCVE Sketch
#include <Arduino.h>
#include <ESP8266WiFi.h>
#include <FunctionalInterrupt.h>

unsigned long lastPrint = 0;
volatile unsigned counter = 0;

void setup() {
    Serial.begin(115200);
    WiFi.begin("<ssid>", "<password>"); // fill with real values
    attachInterrupt(2, [&]() { ++counter; }, RISING);
}

void loop() {
    auto now = millis();
    if (now - lastPrint > 1000) {
        Serial.println(now);
        Serial.println(WiFi.status());
        Serial.println(counter);
        lastPrint = now;
    }
    delay(1);
}
Stack Trace
Exception 0: Illegal instruction
PC: 0x40201020
EXCVADDR: 0x00000000
Decoding stack results
0x40100630: interrupt_handler(void*, void*) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/core_esp8266_wiring_digital.cpp line 167
0x40100f06: check_poison_block(umm_block*) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/umm_malloc/umm_poison.c line 86
0x401014d6: umm_poison_calloc(size_t, size_t) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/umm_malloc/umm_poison.c line 189
0x4010056c: interrupt_handler(void*, void*) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/core_esp8266_wiring_digital.cpp line 138
0x40100f06: check_poison_block(umm_block*) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/umm_malloc/umm_poison.c line 86
0x40100f06: check_poison_block(umm_block*) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/umm_malloc/umm_poison.c line 86
0x40101160: umm_malloc_core(umm_heap_context_t*, size_t) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/umm_malloc/umm_local.c line 47
0x401003d0: ets_post(uint8, ETSSignal, ETSParam) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/core_esp8266_main.cpp line 238
0x40101490: umm_malloc(size_t) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/umm_malloc/umm_malloc.cpp line 912
0x40212b9d: sys_timeout_abs at core/timeouts.c line 189
0x401003d0: ets_post(uint8, ETSSignal, ETSParam) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/core_esp8266_main.cpp line 238
0x4020346a: loop_task(ETSEvent*) at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/core_esp8266_main.cpp line 273
0x40100094: app_entry() at /home/petersohn/eclipse/Sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/3.1.1/cores/esp8266/core_esp8266_main.cpp line 392
Debug Messages
The program runs fine for about 10 minutes in my scenario (the timing is random, and the crash occurs earlier if the oscillator runs at a higher frequency). Then I get an exception (stack trace above).
Fatal exception 0(IllegalInstructionCause):
epc1=0x40201020, epc2=0x00000000, epc3=0x00000000, excvaddr=0x00000000, depc=0x00000000
Then the ESP reboots, but the Wi-Fi connection cannot be established.
scandone
no <ssid> found, reconnect after 1s
wifi evt: 1
STA disconnect: 201
reconnect
Can you try with your handler specifically declared in IRAM:
IRAM_ATTR void handler()
{
    counter++;
}
The `&` is superfluous, and it also prevents the lambda from having the right type to be placed in IRAM by the linker.
But without the `&` there is an ambiguity that we apparently should resolve. With `[](){counter++;}` instead, I get:
intr/intr.ino:13:49: error: call of overloaded 'attachInterrupt(int, setup()::<lambda()>, int)' is ambiguous
cores/esp8266/Arduino.h:186:6: note: candidate: 'void attachInterrupt(uint8_t, void (*)(), int)'
cores/esp8266/FunctionalInterrupt.h:31:6: note: candidate: 'void attachInterrupt(uint8_t, std::function<void()>, int)'
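To illustrate the ambiguity reported above, here is a minimal host-side sketch. It assumes two hypothetical stand-in functions named `attachInterruptDemo` that mirror the two candidate signatures from the error message; they are not the core's real `attachInterrupt`, and `Picked` and `lastPicked` are likewise invented for illustration. A lambda with no capture converts equally well to `void (*)()` and to `std::function<void()>`, so the call is ambiguous unless the conversion is spelled out. A lambda written with `[&]` has no function-pointer conversion at all, which is presumably why the original sketch compiled and selected the `std::function` overload.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-ins for the two overloads named in the compiler error.
// They only record which overload was selected.
enum class Picked { None, FunctionPointer, StdFunction };
Picked lastPicked = Picked::None;

void attachInterruptDemo(uint8_t pin, void (*handler)(), int mode) {
    (void)pin; (void)handler; (void)mode;
    lastPicked = Picked::FunctionPointer;
}

void attachInterruptDemo(uint8_t pin, std::function<void()> handler, int mode) {
    (void)pin; (void)handler; (void)mode;
    lastPicked = Picked::StdFunction;
}

volatile unsigned counter = 0;

Picked attachCapturelessLambda() {
    // attachInterruptDemo(2, [] { counter = counter + 1; }, 1);
    //   ^ ambiguous: both overloads need a user-defined conversion.
    // Casting first makes the function-pointer overload an exact match.
    attachInterruptDemo(
        2, static_cast<void (*)()>([] { counter = counter + 1; }), 1);
    return lastPicked;
}

Picked attachCapturingLambda() {
    unsigned local = 0;
    // A lambda with a capture cannot convert to a function pointer,
    // so only the std::function overload is viable.
    attachInterruptDemo(2, [&local] { ++local; }, 1);
    return lastPicked;
}
```

This only shows how overload resolution behaves. It says nothing about where the lambda body ends up in memory, so the cast alone should not be read as a fix for the IRAM placement question.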
@d-a-v Not pitching my PR - OK, just a little bit - but I remembered that I had done work on FunctionalInterrupt etc and I've verified that #6047 fixed the ambiguity error you found. If you consider it, I would be grateful for a careful review that I haven't cluelessly introduced any out-of-IRAM issues.
I could not reproduce the issue with the native handler. However, in the application where the problem originally appeared, I tried getting rid of FunctionalInterrupt, and the problem still persists. I'm still trying to come up with a minimal reproduction, but it looks like the problem is not in FunctionalInterrupt itself; it just makes the problem manifest.
@petersohn Not saying it's related to your crash, but you should call pinMode(2, INPUT);
Oscillator at high(er) frequency - are you perhaps triggering more interrupts than the MCU can handle without important internal functions beginning to fail?
Yes, I missed the pinMode() call, but putting it in doesn't help.
I could reproduce the issue without FunctionalInterrupt, by using a simple function call.
void interruptHandlerImpl() {
    ++counter;
}

IRAM_ATTR void interruptHandler() {
    interruptHandlerImpl();
}
@petersohn Every bit of code that may run inside the interrupt service routine must be in IRAM.
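As a rough sketch of what that means for the snippet above, the helper called from the handler would carry `IRAM_ATTR` too. The `#if defined(ARDUINO)` guard and the empty fallback definition of `IRAM_ATTR` are assumptions added here only so the same file also compiles on a desktop compiler; on the ESP8266 the macro comes from the core. The pin number and `RISING` mode are taken from the original sketch.

```cpp
#if defined(ARDUINO)
#include <Arduino.h>
#else
// Host build (assumption for illustration): there is no IRAM,
// so the attribute expands to nothing.
#define IRAM_ATTR
#endif

volatile unsigned counter = 0;

// Called from the ISR, so it is placed in IRAM as well.
IRAM_ATTR void interruptHandlerImpl() {
    counter = counter + 1;
}

// The ISR itself.
IRAM_ATTR void interruptHandler() {
    interruptHandlerImpl();
}

#if defined(ARDUINO)
void setup() {
    pinMode(2, INPUT);
    attachInterrupt(digitalPinToInterrupt(2), interruptHandler, RISING);
}

void loop() {}
#endif
```

Whether this alone removes the crash in the real application is not established in this thread; it only applies the rule stated above to the posted example.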
If that's true, the bug in FunctionalInterrupt is obvious. That is weird at the very least: FunctionalInterrupt has been part of the core for a long time, so it is strange that nobody has ever noticed that it should not work.
@petersohn
> If that's true

It is a well-documented fact.
What about the question of whether your interrupt frequency is just too high to handle?
> It is a well-documented fact.

Then why is FunctionalInterrupt there? It should never have been able to work.

> What about the question of whether your interrupt frequency is just too high to handle?

In the real application, the problem comes up regardless of interrupt frequency. It just takes more time to reproduce. I used the frequency I did because it can reproduce the problem relatively fast.