Picotool coprodis unusably slow with large program (W11)
MMBasic creates an image which is just under a megabyte (RP2040). elf2uf2 used to create the UF2 in a couple of seconds. picotool takes several minutes, with a processor core running at maximum during this period (Windows 11 PC with an i7-12700). It does eventually complete and appears to create a valid UF2. I believe the problem is in the creation of the .dis file, which is 11.4 MB. If I kill picotool using Task Manager, the UF2 is created but the .dis creation fails.
See the C/C++ SDK book: you can pass PICO_NO_COPRO_DIS=1 to cmake, or set it in your CMakeLists.txt, though it should still not be this slow.
@UKTailwind Can you provide your elf file, so I can run picotool myself to see where the bottleneck is?
"you can pass PICO_NO_COPRO_DIS=1 to cmake"
That fixes it, so the problem is definitely in the creation of the .dis file.
This appears to be both a compiler issue and a Windows vs Linux issue - on my 13900H laptop it takes:
- ~5 min when using picotool compiled with MSVC 2022 on Windows
- 35s using the precompiled binary at pico-sdk-tools (compiled using gcc in MSYS2) on Windows
- 8s when compiled with gcc in WSL
So further investigation is definitely warranted, but I'd recommend switching to the precompiled binaries if you can (you can point pico-sdk at these by passing -Dpicotool_DIR=/path/to/picotool to cmake), or even switching to WSL.
This is only necessary if you want the extra coprocessor disassembly functionality. If you're not planning on thoroughly reading the disassembly, then just setting PICO_NO_COPRO_DIS=1 is probably the best option. The extra coprocessor disassembly just turns some mcr and similar instructions, which send to or receive from the coprocessors, into more meaningful rcp, dcp or gpio instructions for readability.
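To make the post-processing step described above more concrete, here is a rough Python sketch of the general idea: scan an objdump-style listing with a regex and tag coprocessor access lines with a readable name. This is not picotool's actual code. The regex, the function names, the coprocessor-number mapping (0 = gpio, 4 = dcp, 7 = rcp) and the comment format are all assumptions made for illustration; picotool rewrites the instructions into real rcp/dcp/gpio mnemonics, which this sketch does not attempt.

```python
import re

# Assumed mapping from coprocessor number to a readable name (illustrative only).
COPRO_NAMES = {0: "gpio", 4: "dcp", 7: "rcp"}

# Hypothetical pattern: a coprocessor-access mnemonic followed by the
# coprocessor number as first operand, written either as "7," or "p7,".
COPRO_RE = re.compile(r"\b(mcr2?|mrc2?|mcrr2?|mrrc2?|cdp2?)\s+p?(\d+)\s*,")


def annotate_line(line: str) -> str:
    """Append a comment naming the coprocessor, or return the line unchanged."""
    m = COPRO_RE.search(line)
    if m is None:
        return line
    name = COPRO_NAMES.get(int(m.group(2)))
    if name is None:
        # Unknown coprocessor number: leave the line alone.
        return line
    return f"{line.rstrip()}\t; {name} coprocessor access"


def annotate_listing(text: str) -> str:
    """Apply annotate_line to every line of a disassembly listing."""
    return "\n".join(annotate_line(line) for line in text.splitlines())
```

Because every line of the listing goes through a regex search, the cost grows with the size of the .dis file, which is why a slow regex implementation hurts so much on an 11.4 MB listing.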
Thanks for looking at it. The strange thing is that elf2uf2 used to create the disassembly listing almost instantaneously. picotool appears to be a win32 application. Is the Linux version also only 32-bit, or could that be the issue?
elf2uf2 didn’t do any disassembly, the disassembly was performed by the objdump in your toolchain and still is - picotool just modifies the coprocessor lines in that disassembly file.
The picotool I compiled was 64-bit on both Windows and Linux, so I don’t think that’s the issue. From a quick investigation, it looks like the regex search is just taking far longer on MSVC Windows, taking up 75% of the execution time, whereas on Linux it only takes 8% of the execution time.
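For anyone wanting to make a similar measurement on their own tooling, the following Python sketch shows one simple way to estimate what share of a line-processing loop is spent in regex search. It is only an illustration of the measurement approach, not how the figures above were obtained. The pattern, the function name and the stand-in "other work" are hypothetical, and Python's regex engine will not reproduce the MSVC behaviour.

```python
import re
import time

# Hypothetical stand-in pattern; the real picotool expressions are not shown here.
PATTERN = re.compile(r"\b(?:mcr|mrc)\b")


def regex_time_share(lines, pattern=PATTERN):
    """Return (match_count, fraction of loop time spent inside pattern.search)."""
    matches = 0
    regex_seconds = 0.0
    start = time.perf_counter()
    for line in lines:
        t0 = time.perf_counter()
        found = pattern.search(line)
        regex_seconds += time.perf_counter() - t0
        if found:
            matches += 1
        _ = line.upper()  # stand-in for the rest of the per-line work
    total = time.perf_counter() - start
    share = regex_seconds / total if total > 0 else 0.0
    return matches, share
```

Calling `regex_time_share(open("program.dis").read().splitlines())` on a listing (the file name is a placeholder) gives a rough share. A proper profiler will give more trustworthy numbers, since the timer calls themselves add overhead on very short lines.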
From looking into this, it seems that the MSVC regex library is just ridiculously slow on Windows, so for larger programs I think the recommendation would be to use a picotool compiled with a different regex library, such as the pre-compiled one in pico-sdk-tools. You could also use Chocolatey to install gcc and use that to compile picotool, which is what the GitHub Actions workflows use to test pico-sdk and pico-examples - see this file for the shell commands.
Yes, picotool.exe is absurdly slow! It takes 5 min to make the UF2.
I hear reports that this is really slow on the Pi 4 too. Let's take another look at performance, as it shouldn't take minutes to do this on anything!