NG: Unable to communicate over SPI
Checklist
- [X] Checked the issue tracker for similar issues to ensure this is not a duplicate
- [X] Read the documentation to confirm the issue is not addressed there and your configuration is set correctly
- [X] Tested with the latest version to ensure the issue hasn't been fixed
How often does this bug occurs?
always
Expected behavior
I expected to have SPI communication work as before.
Actual behavior (suspected bug)
I started experiencing SPI communication issues since I have updated my codebase from the current master branch. The problem is that the NG host driver can't get bootup event anymore. I investigated the issue with logic analyzer and what seems strange to me is that ESP does not rise DATA-READY event before sending bootup packet. This is most likely the reason why IRS isn't called and the packet doesn't arrive. I verified all physical connections again and they all look fine. Could you get an advice what I should try further, please.
Error logs or terminal output
root@nanopi-a64-bookworm:~# dmesg | grep esp
[ 6.655317] esp32_spi: esp_spi_probe: Probing ESP32 SPI-driver...
[ 6.669734] esp32_spi: esp_spi_probe: raw-tp-mode was not specified
[ 6.683096] esp32_spi: esp_spi_probe: ESP32 peripheral is registered to SPI bus [0], chip select [0], SPI Clock [1], SPI mode [0x02]
Steps to reproduce the behavior
NG host driver with ESP32
Project release version
latest
System architecture
other (details in Additional context)
Operating system
Linux
Operating system version
Debian Bookworm
Shell
Bash
Additional context
Allwinner A64 (sun50i) / NanoPi A64
- First transaction is host driven always. It is not dependent upon data ready. Data ready comes in picture for second onwards transaction.
- Is is code commented yet: https://github.com/espressif/esp-hosted/pull/425/files#diff-376c997542ed8caae30ab0a0d30cbff66a894804f1e92e8dcee9b1cc6aa3fcfdL963 ?
@mantriyogesh Do you mean hardware reset ?
https://github.com/sergey-suloev/esp-hosted/blob/8331c2ee5ae5fa3436beff39bb2bc7bfff957859/esp_hosted_ng/host/spi/esp_spi.c#L526
The hardware reset happens as usual in my hardware , I can see it from the led blinking. Again, I haven't changed my spi driver code , I only took your updates made in other host files + firmware.
I am not sure what you mean by "host driven". According to the code logic, the incoming data are only processed when queue_work is called and it is only called from ISRs. Another thing is that 2 attempts are made by ESP to send bootup message(you can see this from capture). And the second attempt is ignored too while data ready interrupt is triggered this time .
I think it would be better to try this with out modifying latest master code.
The sequence of the steps matter a lot
Did you get a chance to try the vanilla latest master?
Did you get a chance to try the vanilla latest master?
It is hard to integrate your driver in my system , because it is written old-style without device tree support. Do you see something wrong in my adaptation ?
In theory , I can try your driver but I need to find out GPIO numbers . I don't know how to do that at the moment.
root@nanopi-a64-bookworm:~# gpioinfo gpiochip0 - 256 lines: .... line 234: unnamed "handshake" input active-high [used] line 235: unnamed "dataready" input active-high [used]
Are the numbers 234 and 235 those you need for your driver ?
As you correctly pointed, we need to move to newer way to support device tree. however, we also need to ensure that we also support these using module params or hardcoding, which is currently supported in FG driver.
Worth to check what raspberry pi os you use : https://github.com/espressif/esp-hosted/blob/master/esp_hosted_fg/docs/Linux_based_host/Troubleshoot.md#42-reset-pin-and-spi-gpio-numbering
For newer Raspberry Pi OS : https://github.com/espressif/esp-hosted/blob/ce00ca53189ffd52455dd5426d54f9be6fae1871/esp_hosted_fg/host/linux/host_control/rpi_init.sh#L13-L28
For older Raspberry Pi OS: https://github.com/espressif/esp-hosted/blob/ce00ca53189ffd52455dd5426d54f9be6fae1871/esp_hosted_fg/host/linux/host_control/rpi_init.sh#L33-L36
Are the numbers 234 and 235 those you need for your driver?
No. But then I am not sure which GPIOs do you use to connect. ideally, same GPIO numbers you can use.
As you are already aware, the Slave Reset, Data Ready and Handshake GPIOs could be freely chosen by user. just make sure they are not already used.
I am not using rpi but allwinner platform . There is another napping scheme from platform specific numbers to universal gpio numbers . The numbers I sent you are probably not correct because they look like indexes within a specific gpio controller
understood.. The numbers are linear. abstraction to the actual controller.
There's probably a way to get the correct numbers using API and log them , I'll see if I can do it
I also don't wish to alarm you unnecessarily, but in past, in some specific allwinner board (t507) , we had to introduce some extra logic transition.
We observed that the spi driver was not processing every alternate message due to issue in spi linux driver.
[ ! IMPORTANT ]
This is based on some old commit. You cannot directly apply this change on your repo.
This is internal MR that we had, it has huge unnecessary changes for you. You can only focus on one file, network_adapter/main/spi_slave_api.c
check symbols, //reset_handshake_gpio();, wait_cs_deassert_sem and gpio_set_intr_type(gpio_num, GPIO_INTR_ANYEDGE);
ignore pkt_num change. if you add pkt_num, please ensure you use at both sides, esp and host. as it is changing header.
Please disregard this, if you do not face any packet drop as such.
Some changes made recently causes my issue . It was working well for a long time before merging with latest master
can you please only point me to the latest commit that caused this issue? it might be worth to see the specific change, instead of complete driver.
@mantriyogesh could you provide some help on how to compile your driver, please ?
Actually, your code much resembles to FG, you have many FG code inherited.
https://github.com/sergey-suloev/esp-hosted/blob/8331c2ee5ae5fa3436beff39bb2bc7bfff957859/esp_hosted_fg/host/linux/host_control/rpi_init.sh#L431
Just change the rpi_init.sh for correct GPIOs and run it without args, it would print usage.
I was able to build and run the driver, here is my log
modprobe esp32_spi resetpin=101 clockspeed=10
dmesg
[ 243.462435] esp32_spi: loading out-of-tree module taints kernel. [ 243.677432] esp32_spi: spi_dev_init: Failed to obtain SPI master handle [ 244.096684] WARNING: Flushing system-wide workqueues will be prohibited in near future. [ 244.104784] CPU: 1 PID: 267 Comm: modprobe Tainted: G O 6.10.10-sunxi #stable [ 244.113330] Hardware name: FriendlyARM NanoPi A64 (DT) [ 244.118484] Call trace: [ 244.120941] dump_backtrace+0x94/0x114 [ 244.124715] show_stack+0x18/0x24 [ 244.128046] dump_stack_lvl+0x74/0x8c [ 244.131727] dump_stack+0x18/0x24 [ 244.135056] __warn_flushing_systemwide_wq+0x1c/0x28 [ 244.140041] spi_exit+0x80/0x250 [esp32_spi] [ 244.144370] esp_init_interface_layer+0x398/0x648 [esp32_spi] [ 244.150163] esp_init+0x14c/0x1000 [esp32_spi] [ 244.154655] do_one_initcall+0x44/0x28c [ 244.158508] do_init_module+0x60/0x21c [ 244.162279] load_module+0x1e98/0x2088 [ 244.166048] init_module_from_file+0x88/0xcc [ 244.170338] __arm64_sys_finit_module+0x244/0x33c [ 244.175064] invoke_syscall+0x48/0x114 [ 244.178835] el0_svc_common.constprop.0+0x40/0xe0 [ 244.183561] do_el0_svc+0x1c/0x28 [ 244.186895] el0_svc+0x34/0xd8 [ 244.189968] el0t_64_sync_handler+0x120/0x12c [ 244.194342] el0t_64_sync+0x190/0x194 [ 244.199802] esp32_spi: spi_init: Failed Init SPI device
I think this happens because SPI driver (spi-sun6i) isn't loaded . In my system SPI driver is loaded by device tree.
Ok, I loaded SPI driver and now your driver works. But still nothing happens. I can't receive bootup event. Hardware reset works well, I can see led blinking.
root@nanopi-a64-bookworm:~# dmesg | grep esp [ 45.923024] esp32_spi: loading out-of-tree module taints kernel. [ 46.139543] esp32_spi: spi_dev_init: ESP32 peripheral is registered to SPI bus [0],chip select [0], SPI Clock [10]
Here is how I found out the gpio numbers using my driver : esp_info("Acquired GPIO RESET: %d\n", desc_to_gpio(spi_ctx->reset)); esp_info("Acquired GPIO Handshake: %d\n", desc_to_gpio(spi_ctx->handshake)); esp_info("Acquired GPIO Data-Ready: %d\n", desc_to_gpio(spi_ctx->data_ready));
[ 6.829650] esp32_spi: esp_spi_probe: Acquired GPIO RESET: 101 [ 6.844312] esp32_spi: esp_spi_probe: Acquired GPIO Handshake: 234 [ 6.860157] esp32_spi: esp_spi_probe: Acquired GPIO Data-Ready: 235
And modified your driver :
#define HANDSHAKE_PIN 234 //22 #define SPI_IRQ gpio_to_irq(HANDSHAKE_PIN) #define SPI_DATA_READY_PIN 235 //27 #define SPI_DATA_READY_IRQ gpio_to_irq(SPI_DATA_READY_PIN) #define SPI_BUF_SIZE 1600
And loaded the driver:
modprobe esp32_spi resetpin=101 clockspeed=10
Firmware loaded with same variant? On same commit?
Firmware loaded with same variant? On same commit?
Yes, sure
Let me know if you face issue on FG also.
Btw you claim that NG works with Raspberry Pi, I tried and found that it works a little better - it can receive bootup event, but bluetooth doesn't work either.
root@rpi3-aplus-arm64-bookworm:~# dmesg | grep esp [ 7.838095] esp32_spi: esp_init: esp_init: begin [ 7.845239] esp32_spi: esp_spi_probe: Probing ESP32 SPI-driver... [ 7.859765] esp32_spi: esp_spi_probe: raw-tp-mode was not specified [ 7.861923] esp32_spi: esp_spi_probe: ESP32 peripheral is registered to SPI bus [0], chip select [0], SPI Clock [1], SPI mode [0x02] [ 8.300393] esp32_spi: esp_spi_probe: Probe success. [ 10.762062] esp32_spi: esp_init: esp_init: end 0 [ 10.774853] esp32_spi: process_esp_bootup_event: Received ESP bootup event [ 10.780822] esp32_spi: process_event_esp_bootup: Bootup Event tag: 3 [ 10.783628] esp32_spi: esp_validate_chipset: Chipset=ESP32 ID=00 detected over SPI [ 10.783658] esp32_spi: process_event_esp_bootup: Bootup Event tag: 2 [ 10.783664] esp32_spi: process_event_esp_bootup: Bootup Event tag: 0 [ 10.783669] esp32_spi: process_event_esp_bootup: Bootup Event tag: 1 [ 10.783673] esp32_spi: process_fw_data: ESP chipset's last reset cause: [ 10.783678] esp32_spi: print_reset_reason: POWERON_RESET [ 10.783687] esp32_spi: check_esp_version: ESP-Hosted Version: NG-1.0.3.0.0 [ 10.784718] esp32_spi: esp_reg_notifier: Driver init is ongoing [ 11.827393] esp32_spi: process_rx_buf: offset_rcv[0] != exp[12], drop [ 16.862570] esp32_spi: wait_and_decode_cmd_resp: Command[0x3] timed out [ 16.870931] esp32_spi: cmd_get_mac: wait_and_decode_cmd_resp(priv, cmd_node) failure, ret: -22 [ 16.882774] esp32_spi: esp_add_card: esp_add_network_ifaces(adapter) failure, ret: -1 [ 16.892394] esp32_spi: process_event_esp_bootup: network iterface init failed [ 24.572884] esp32_spi: esp_reg_notifier: Driver init is ongoing
root@rpi3-aplus-arm64-bookworm:~# dmesg | grep Bluetooth [ 7.787995] Bluetooth: Core ver 2.22 [ 7.803250] Bluetooth: HCI device and connection manager initialized [ 7.813743] Bluetooth: HCI socket layer initialized [ 7.826112] Bluetooth: L2CAP socket layer initialized [ 7.844690] Bluetooth: SCO socket layer initialized [ 12.966545] esp32_spi: init_bt: ESP Bluetooth init [ 14.567976] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 [ 14.576024] Bluetooth: BNEP filters: protocol multicast [ 14.576061] Bluetooth: BNEP socket layer initialized [ 15.010549] Bluetooth: hci0: Opcode 0x0c03 failed: -110
root@rpi3-aplus-arm64-bookworm:~# hciconfig hci0: Type: Primary Bus: SPI BD Address: 00:00:00:00:00:00 ACL MTU: 0:0 SCO MTU: 0:0 DOWN RX bytes:0 acl:0 sco:0 events:0 errors:0 TX bytes:3 acl:0 sco:0 commands:0 errors:0
The SPI exchange doesn't look good to me, I am not sure what's wrong.
@kapilkedawat , can you please check this?
This is why bootup event isn't handled on the sunxi platform:
static int process_rx_buf(struct esp_spi_context *spi_ctx, struct sk_buff *skb)
{
struct esp_payload_header *header;
u16 len = 0;
u16 offset = 0;
if (!skb)
return -EINVAL;
header = (struct esp_payload_header *) skb->data;
esp_dbg("SPI RX buf: IF type = %d\n", header->if_type);
esp_hex_dump_verbose("SPI RX buf: ", skb->data , min(skb->len, 32));
if (header->if_type >= ESP_MAX_IF) {
return -EINVAL;
}
root@nanopi-a64-bookworm:~# dmesg [ 7.267605] esp32_spi: esp_init: esp_init: begin [ 7.277872] esp32_spi: esp_spi_probe: Probing ESP32 SPI-driver... [ 7.295706] esp32_spi: esp_spi_probe: raw-tp-mode was not specified [ 7.311987] esp32_spi: esp_spi_work: SPI work: handshake 1, data_ready 0, data_path 0 [ 7.319861] esp32_spi: esp_spi_probe: ESP32 peripheral is registered to SPI bus [0], chip select [0], SPI Clock [1], SPI mode [0x02] [ 7.333002] esp32_spi: esp_spi_work: SPI work: handshake 0, data_ready 0, data_path 0 [ 7.740060] esp32_spi: esp_spi_probe: Probe success. [ 7.746290] esp32_spi: esp_init: esp_init: end 0 [ 9.551373] esp32_spi: esp_spi_work: SPI work: handshake 0, data_ready 1, data_path 1 [ 9.559472] esp32_spi: esp_spi_work: SPI work: handshake 1, data_ready 1, data_path 1 [ 9.583694] esp32_spi: process_rx_buf: SPI RX buf: IF type = 6 [ 9.589589] SPI RX buf: 00000000b373ad7a: 06 00 00 00 3e 00 18 00 1e 04 00 00 02 00 30 00 ....>.........0. [ 9.599352] SPI RX buf: 00000000f89f8fc1: 2e 00 00 00 06 02 00 04 02 14 00 03 f0 02 18 9c ................ [ 9.611249] esp32_spi: esp_spi_work: SPI work: handshake 1, data_ready 0, data_path 1
The bootup event must be started from ESP_INTERNAL_IF which is 3.
It seems like the data received is shifted by 1 bit to the left, which makes 3 become 6, because MISO data comes a little earlier than needed, i.e. ESP chip sends MISO data out of sync with the clock. You can see it on the picture - MISO data must come a little later in SPI mode 2.
- Are you using jumper cables?
- What is the ESP chipset you use? Is it ESP32?
- Can you also share ESP side log?
- Does spi mode 1 or spi mode 3 (both sides same), also faces this issue?
- Does this issue arise only on module reload or even fresh load first time after system bootup?
On sunxi platform it seems to work well in SPI mode 1 only. On RaspberryPi I couldn't make it work in any of 3 modes. Will try to take ESP log later.
Can you please attach success and failed log for sunxi? I meant the logic analyser output
ets Jul 29 2019 12:21:46
rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:7120
load:0x40078000,len:15616
load:0x40080400,len:4
load:0x40080404,len:3888
entry 0x40080650
I (29) boot: ESP-IDF e7771c75-dirty 2nd stage bootloader
I (29) boot: compile time Sep 23 2024 16:24:04
I (29) boot: Multicore bootloader
I (34) boot: chip revision: v3.1
I (38) boot.esp32: SPI Speed : 40MHz
I (43) boot.esp32: SPI Mode : DIO
I (47) boot.esp32: SPI Flash Size : 4MB
I (52) boot: Enabling RNG early entropy source...
I (57) boot: Partition Table:
I (61) boot: ## Label Usage Type ST Offset Length
I (68) boot: 0 nvs WiFi data 01 02 00009000 00004000
I (76) boot: 1 otadata OTA data 01 00 0000d000 00002000
I (83) boot: 2 phy_init RF data 01 01 0000f000 00001000
I (90) boot: 3 factory factory app 00 00 00010000 00100000
I (98) boot: 4 ota_0 OTA app 00 10 00110000 00100000
I (105) boot: 5 ota_1 OTA app 00 11 00210000 00100000
I (113) boot: End of partition table
I (117) boot: Defaulting to factory image
I (122) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=205a4h (132516) map
I (178) esp_image: segment 1: paddr=000305cc vaddr=3ffbdb60 size=05884h ( 22660) load
I (188) esp_image: segment 2: paddr=00035e58 vaddr=40080000 size=0a1c0h ( 41408) load
I (205) esp_image: segment 3: paddr=00040020 vaddr=400d0020 size=8ad44h (568644) map
I (411) esp_image: segment 4: paddr=000cad6c vaddr=4008a1c0 size=15a20h ( 88608) load
I (464) boot: Loaded app from partition at offset 0x10000
I (464) boot: Disabling RNG early entropy source...
I (475) cpu_start: Multicore app
I (476) cpu_start: Pro cpu up.
I (476) cpu_start: Starting app cpu, entry point is 0x40081448
I (0) cpu_start: App cpu up.
I (493) cpu_start: Pro cpu start user code
I (493) cpu_start: cpu freq: 240000000 Hz
I (494) cpu_start: Application information:
I (498) cpu_start: Project name: network_adapter
I (504) cpu_start: App version: NG-1.0.3.0.0
I (509) cpu_start: Compile time: Sep 23 2024 16:23:57
I (515) cpu_start: ELF file SHA256: 304f5b83d2a319cc...
I (521) cpu_start: ESP-IDF: e7771c75-dirty
I (527) cpu_start: Min chip rev: v0.0
I (532) cpu_start: Max chip rev: v3.99
I (536) cpu_start: Chip rev: v3.1
I (541) heap_init: Initializing. RAM available for dynamic allocation:
I (549) heap_init: At 3FFAFF10 len 000000F0 (0 KiB): DRAM
I (554) heap_init: At 3FFB6388 len 00001C78 (7 KiB): DRAM
I (560) heap_init: At 3FFB9A20 len 00004108 (16 KiB): DRAM
I (567) heap_init: At 3FFC7D18 len 000182E8 (96 KiB): DRAM
I (573) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (579) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (586) heap_init: At 4009FBE0 len 00000420 (1 KiB): IRAM
I (593) spi_flash: detected chip: generic
I (596) spi_flash: flash io: dio
I (601) coexist: coex firmware version: 77cd7f8
I (606) app_start: Starting scheduler on CPU0
I (611) app_start: Starting scheduler on CPU1
mI (611) main_task: Started on CPU0
I (620) main_task: Calling app_main()
I (624) stats: *********************************************************************
I (632) stats: ESP-Hosted Firmware version :: NG-1.0.3.0.0
I (642) stats: Transport used :: SPI only
I (650) stats: *********************************************************************
I (658) FW_MAIN: Supported features are:
I (663) FW_MAIN: - WLAN over SPI
I (667) FW_BT: - BT/BLE
I (670) FW_BT: - HCI Over SPI
I (674) FW_BT: - BT/BLE dual mode
I (678) FW_MAIN: Capabilities: 0xf8
I (693) wifi:wifi driver task: 3ffbd694, prio:23, stack:6656, core=1
I (695) wifi:wifi firmware version: d23b633
I (696) wifi:wifi certification version: v7.0
I (696) wifi:config NVS flash: disabled
I (700) wifi:config nano formating: disabled
I (704) wifi:Init data frame dynamic rx buffer num: 40
I (709) wifi:Init static rx mgmt buffer num: 5
I (713) wifi:Init management short buffer num: 32
I (718) wifi:Init dynamic tx buffer num: 64
I (722) wifi:Init static rx buffer size: 1600
I (726) wifi:Init static rx buffer num: 16
I (729) wifi:Init dynamic rx buffer num: 40
I (734) wifi_init: rx ba win: 32
I (737) wifi_init: tcpip mbox: 32
I (741) wifi_init: udp mbox: 6
I (745) wifi_init: tcp mbox: 6
I (749) wifi_init: tcp tx win: 5760
I (753) wifi_init: tcp rx win: 5760
I (757) wifi_init: tcp mss: 1440
I (761) wifi_init: WiFi IRAM OP enabled
I (765) wifi_init: WiFi RX IRAM OP enabled
I (771) BTDM_INIT: BT controller compile version [a38dc5c]
I (777) BTDM_INIT: Bluetooth MAC: e4:65:b8:53:f9:0a
I (784) phy_init: phy_version 4791,2c4672b,Dec 20 2023,16:06:06
I (1183) FW_MAIN: ESP Bluetooth MAC addr: e4-65-b8-53-f9- a
I (1184) FW_SPI: Using SPI interface I (1184) gpio: GPIO[2]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0 I (1192) gpio: GPIO[4]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0 I (1202) gpio: GPIO[15]| InputEn: 0| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0 I (1211) gpio: GPIO[15]| InputEn: 1| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0 I (2222) FW_MAIN: Initial set up done I (2222) main_task: Returned from app_main() I (3541) FW_MAIN: INIT Interface command
I (3545) wifi:mode : sta (e4:65:b8:53:f9:08) I (3546) wifi:enable tsf I (3547) FW_CMD: station started and disabled softap mode
In this, clearly is not falling under correct timing. I used mode 2 for decode here.
I am trying to understand why the miso should delay. Can you please show the photo how you had connected this setup?
Are both esp and host using same spi mode 2?
From the log, the current chipset in use is classic esp32.
Can you also attach the esp32 side sdkconfig is use? This would help us know the configurations used.
The ESP32 SPI timing considerations could be referred from https://github.com/espressif/esp-hosted/blob/master/esp_hosted_ng/docs/porting_guide.md#adjust-spi-timings . Also note SPI timing considerations:
If DMA is enabled, a Device's launch edge is half of an SPI clock cycle ahead of the normal time, shifting to the Master's actual latch edge. In this case, if the GPIO matrix is bypassed, the hold time for data sampling is 68.75 ns and no longer a half of an SPI clock cycle. If the GPIO matrix is used, the hold time will increase to 93.75 ns. The Host should sample the data immediately at the latch edge or communicate in SPI modes 1 or 3.
Question is, can you prefer MODE 1 /3 when esp32 is used? Are the GPIOs used from IO_MUX or used over GPIO_Matrix?