Segmentation fault when stopping playback

Open srs4511351 opened this issue 2 years ago • 44 comments

Computer: Raspberry Pi 4, Debian GNU/Linux 12 (bookworm), SDRplay RSP1A

develop branch:
SigDigger 0.3.0 custom build May 23 2023 (12.2.0)
Using suscan version 0.3.0 (custom build on May 23 2023 at 01:25:58 (12.2.0))
Using sigutils version 0.3.0 (custom build on May 23 2023 at 20:42:30 (12.2.0))

When I press Run, it takes a few minutes to start reception. A few minutes after that, audio preview starts. I assume that this is due to not having FFTW wisdom saved. SigDigger seems to work normally.

When I press the Run button to stop reception, SigDigger closes and Segmentation fault appears on the terminal.

This does not happen with the release version and early develop branch installations like the version detailed below.
SigDigger 0.3.0 custom build Nov 24 2022 (10.2.1)
Using suscan version 0.3.0 (custom build on Nov 24 2022 at 23:12:12 (10.2.1 20210110))
Using sigutils version 0.3.0 (custom build on Nov 24 2022 at 23:09:37 (10.2.1 20210110))

I prefer the develop branch because the startup delay is quite long on previous versions.

With the develop branch of sigutils, CMake 3.20.0 or higher is required. The version on the Raspberry Pi distribution is cmake version 3.18.4. Can you make sigutils build with cmake version 3.18.4?

-----Steve

srs4511351 avatar May 24 '23 02:05 srs4511351

Hi Steve,

Could you run SigDigger from gdb and from there, copy here the backtrace of the error? You can do this by running SigDigger from terminal like this:

$ gdb `which SigDigger`

Then, from the gdb prompt, type run:

(gdb) run

Wait for SigDigger to start, make it crash, and then print a backtrace with this:

(gdb) bt
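
The same steps can also be scripted, which keeps the interactive pager prompt out of the pasted output. A minimal sketch, assuming gdb is installed and SigDigger is on the PATH; `capture_bt` and `backtrace.log` are hypothetical names chosen for illustration:

```shell
# Hypothetical helper: run a program under gdb and save the backtrace.
capture_bt() {
  # -batch runs the -ex commands in order and then exits; it also disables
  # the pager, so long frames are not interrupted by paging prompts.
  # "run" returns control to gdb when the program stops (e.g. on SIGSEGV),
  # and "bt" then prints the stack of the thread that stopped.
  gdb -batch -ex run -ex bt --args "$@" 2>&1 | tee backtrace.log
}

# Usage (not executed here):
#   capture_bt "$(command -v SigDigger)"
```

If the program exits normally instead of crashing, gdb has no stack to print, so the log will not contain any frames.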

Regarding your question on Cmake, I am afraid we need 3.20 or higher. We introduced extensive build system restructuring some time ago, and 3.18 is no longer enough :(
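
To check whether a system meets this requirement before configuring, a small version comparison can help. This is only a sketch: it assumes GNU `sort` with `-V` (version sort) is available, as it is on Debian-based systems, and `version_ge` is a hypothetical helper name:

```shell
# version_ge A B: succeeds when version A is greater than or equal to B.
version_ge() {
  # Sort both versions; if the smaller one is B, then A >= B.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

required=3.20.0
if command -v cmake >/dev/null 2>&1; then
  # The first line of "cmake --version" looks like "cmake version 3.18.4".
  installed=$(cmake --version | head -n1 | awk '{print $3}')
  if version_ge "$installed" "$required"; then
    echo "cmake $installed is new enough"
  else
    echo "cmake $installed is older than $required"
  fi
fi
```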

BatchDrake avatar May 24 '23 04:05 BatchDrake

Thread 1 "SigDigger" received signal SIGSEGV, Segmentation fault.
__GI___libc_free (mem=) at ./malloc/malloc.c:3362
3362 ./malloc/malloc.c: No such file or directory.
(gdb) bt
#0 __GI___libc_free (mem=) at ./malloc/malloc.c:3362
#1 0x0000fffff7cabd50 in SoapySDRStrings_clear () at /usr/local/lib/libSoapySDR.so.0.8-2
#2 0x0000fffff7cac294 in SoapySDRArgInfo_clear () at /usr/local/lib/libSoapySDR.so.0.8-2
#3 0x0000fffff7e87c78 in suscan_source_destroy (source=0xaaaaac6d8a60) at /home/pi/suscan/analyzer/source.c:1635
#4 0x0000fffff7e6a6b4 in suscan_local_analyzer_dtor (ptr=0xaaaaac6d85b0) at /home/pi/suscan/analyzer/impl/local.c:904
#5 0x0000fffff7e61e98 in suscan_analyzer_destroy (self=0xaaaaac27b9f0) at /home/pi/suscan/analyzer/analyzer.c:671
#6 0x0000aaaaaac51ff8 in Suscan::Analyzer::~Analyzer() ()
#7 0x0000aaaaaac52028 in Suscan::Analyzer::~Analyzer() ()
#8 0x0000aaaaaab58608 in SigDigger::Application::orderedHalt() ()
#9 0x0000aaaaaab5b084 in SigDigger::Application::onAnalyzerHalted() ()
#10 0x0000fffff681a608 in ?? () at /lib/aarch64-linux-gnu/libQt5Core.so.5
#11 0x0000fffff680e6cc in QObject::event(QEvent*) () at /lib/aarch64-linux-gnu/libQt5Core.so.5
#12 0x0000fffff74ac0a0 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /lib/aarch64-linux-gnu/libQt5Widgets.so.5
#13 0x0000fffff67dcd40 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /lib/aarch64-linux-gnu/libQt5Core.so.5
#14 0x0000fffff67e00a8 in QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) () at /lib/aarch64-linux-gnu/libQt5Core.so.5
#15 0x0000fffff683f4e8 in ?? () at /lib/aarch64-linux-gnu/libQt5Core.so.5
#16 0x0000fffff4f5774c in g_main_context_dispatch () at /lib/aarch64-linux-gnu/libglib-2.0.so.0
#17 0x0000fffff4f579e0 in ?? () at /lib/aarch64-linux-gnu/libglib-2.0.so.0
#18 0x0000fffff4f57a84 in g_main_context_iteration () at /lib/aarch64-linux-gnu/libglib-2.0.so.0
#19 0x0000fffff683eaa8 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /lib/aarch64-linux-gnu/libQt5Core.so.5
#20 0x0000fffff67db258 in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /lib/aarch64-linux-gnu/libQt5Core.so.5
#21 0x0000fffff67e42dc in QCoreApplication::exec() () at /lib/aarch64-linux-gnu/libQt5Core.so.5
#22 0x0000aaaaaab534f0 in main ()
(gdb)

srs4511351 avatar May 24 '23 04:05 srs4511351

Hi,

I'll try to fix this during the day. In the meantime, inside of Suscan, go to analyzer/source.c and in the surroundings of line 1635, remove the following lines:

  if (source->settings != NULL) {
    for (i = 0; i < source->settings_count; ++i)
      SoapySDRArgInfo_clear(source->settings + i);
    free(source->settings);
  }

Rebuild, and tell me if it keeps crashing

Cheers,

BatchDrake avatar May 24 '23 05:05 BatchDrake

SigDigger no longer crashes when I stop reception. I will try it again with your fix. Don't hurry on my account.

----Steve

srs4511351 avatar May 24 '23 05:05 srs4511351

Hi again,

I've been trying to reproduce this crash, but no luck so far. A few things I notice in your setup:

  • SDRPlay modules are well known to be problematic. Does this crash occur with the RSP1A only, or with other receivers as well?
  • I see SoapySDR is installed in /usr/local. Is this a custom build? When did you clone / build it?

Cheers,

BatchDrake avatar May 24 '23 06:05 BatchDrake

I have to boot to another system to see if the crash occurs on my other SDRs.

My SoapySDR was installed from source on 1/22/2023.
Lib Version: v0.8.1-g9c4fa324
API Version: v0.8.200
ABI Version: v0.8-2

srs4511351 avatar May 24 '23 14:05 srs4511351

I have nearly the same setup, except I’m running 22.04 aarch64, but can’t test till I return next week. If it’s okay to ask, since it’s a different issue, could you try and stop the sdrplay service? Then start SigDigger again on the same Pu and same profile you used the RSP1A with, try to start SigDigger from command line and observe. I find that if there’s a profile for sdrplay and the service has gotten messed up (happens under certain situations) SigDigger will take forever to load as it just keeps trying to reach the radio. I’m only asking here since you happen to have the same hardware.

alphafox02 avatar May 24 '23 15:05 alphafox02

I booted a Raspberry Pi OS bullseye system that has the same crash. The Segmentation fault only occurs on the SDRplay device. The RTL-SDR and HackRF do not have the Segmentation fault.

----Steve

srs4511351 avatar May 24 '23 15:05 srs4511351

@alphafox02 Who are you asking to do this? What do you mean by "the same Pu"?

Simply stopping the SDRplay service causes many errors like:
[ERROR] sdrplay_api_Open() Error: sdrplay_api_Fail
[ERROR] Please check the sdrplay_api service to make sure it is up. If it is up, please restart it.

The device is not in the list of devices.

----Steve

srs4511351 avatar May 24 '23 15:05 srs4511351

Sorry I was asking you and I meant Pi. I realize it causes errors, I was trying to provide a way to simulate the service failed.

What I was saying is that once there’s a profile in SigDigger for the sdrplay device it ends up causing issues the next time SigDigger tries to start if something has happened to the sdrplay service, fixable by resetting the service of course.

I can make a separate ticket with details and I’ll test what you’re seeing as soon as I can.

alphafox02 avatar May 24 '23 16:05 alphafox02

It sounds like my test is what you wanted. SigDigger does look for the SDRplay and reports errors if it isn't found. Stopping and starting the API will sometimes get the API working again.

I have seen the API crash when certain applications open the SDRplay for the first time after a boot/reboot. I haven't confirmed it, but this seems to have happened with SigDigger. Running SoapySDRUtil --probe="driver=sdrplay" before running the application stops this from happening.

Edit: I ran SigDigger as the first application after reboot to access the SDRplay SDR. It did not crash.

----Steve

srs4511351 avatar May 24 '23 16:05 srs4511351

Okay, in this case I am going to mark this as 3rd-party (this is actually one of the multiple SDRPlay library/module bugs) and close this issue. You should report this here: https://github.com/pothosware/SoapySDRPlay3

BatchDrake avatar Jun 07 '23 15:06 BatchDrake

Since this developed at some point during work on the develop branch, I suggest that you may be able to do something about it.

This does not happen with the release version and early develop branch installations like the version detailed below.
SigDigger 0.3.0 custom build Nov 24 2022 (10.2.1)
Using suscan version 0.3.0 (custom build on Nov 24 2022 at 23:12:12 (10.2.1 20210110))
Using sigutils version 0.3.0 (custom build on Nov 24 2022 at 23:09:37 (10.2.1 20210110))

I don't know what to say to the SoapySDR developer about this. The problem only appears in SigDigger. Could you add enough information for the developers to work with?

srs4511351 avatar Jun 07 '23 19:06 srs4511351

I received a reply from @fventuri about this problem.

In the issue that I opened, https://github.com/pothosware/SoapySDRPlay3/issues/71, he said:

"Looking at that output from gdb it seems to me that this issue might be similar to what was reported here: https://github.com/pothosware/SoapySDR/issues/361 - the comments there by @guruofquality and @zuckschwerdt might be helpful in understanding this problem."

If you could look this over, I hope it will help you.

----Steve

srs4511351 avatar Jun 08 '23 06:06 srs4511351

Hi,

I've just changed the way we free the argument info list. Now it is done exactly as @guruofquality describes. Can you build again from develop and see if it keeps crashing?

Cheers,

BatchDrake avatar Jun 08 '23 06:06 BatchDrake

Do I only need to rebuild suscan?

I can do it later in the morning.

----Steve

srs4511351 avatar Jun 08 '23 06:06 srs4511351

Yes, I just changed the implementation of something, so it should be fine.

BatchDrake avatar Jun 08 '23 07:06 BatchDrake

I tried to install suscan, but make failed with errors:

[ 22%] Building C object CMakeFiles/suscan.dir/analyzer/inspector/interface.c.o
/home/dietpi/suscan/analyzer/inspector/inspector.c:66:5: error: unknown type name ‘su_specttuner_channel_data_func_t’; did you mean ‘su_specttuner_channel_t’?
   66 |     su_specttuner_channel_data_func_t on_data,
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |     su_specttuner_channel_t
/home/dietpi/suscan/analyzer/inspector/inspector.c:67:5: error: unknown type name ‘su_specttuner_channel_new_freq_func_t’
   67 |     su_specttuner_channel_new_freq_func_t on_new_freq,
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[ 23%] Building C object CMakeFiles/suscan.dir/analyzer/inspector/overridable.c.o
In file included from /usr/local/include/sigutils/sigutils/block.h:28,
                 from /usr/local/include/sigutils/sigutils/sigutils.h:23,
                 from /home/dietpi/suscan/analyzer/inspector/inspector.c:28:
/home/dietpi/suscan/analyzer/inspector/inspector.c: In function ‘suscan_sc_inspector_factory_open’:
/home/dietpi/suscan/analyzer/inspector/inspector.c:210:13: warning: implicit declaration of function ‘suscan_inspector_open_sc_channel_ex’; did you mean ‘suscan_inspector_open_sc_close_channel’? [-Wimplicit-function-declaration]
  210 |     schan = suscan_inspector_open_sc_channel_ex(
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/local/include/sigutils/sigutils/defs.h:110:9: note: in definition of macro ‘SU_TRYCATCH’
  110 |   if (!(expr)) {                         \
      |         ^~~~
/home/dietpi/suscan/analyzer/inspector/inspector.c:210:11: warning: assignment to ‘su_specttuner_channel_t *’ {aka ‘struct sigutils_specttuner_channel *’} from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
  210 |     schan = suscan_inspector_open_sc_channel_ex(
      |           ^
/usr/local/include/sigutils/sigutils/defs.h:110:9: note: in definition of macro ‘SU_TRYCATCH’
  110 |   if (!(expr)) {                         \
      |         ^~~~
make[2]: *** [CMakeFiles/suscan.dir/build.make:398: CMakeFiles/suscan.dir/analyzer/inspector/inspector.c.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:87: CMakeFiles/suscan.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

Guessing that I need to reinstall sigutils, I got more errors:

cmake ..
CMake Error at CMakeLists.txt:28 (include):
  include could not find requested file:

    GitVersionDetect


CMake Error at CMakeLists.txt:35 (project):
  VERSION ".." format invalid.


-- Configuring incomplete, errors occurred!

What do I need to install to get GitVersionDetect? How do I install it? I did install gitversion, but that must not be what I need.

srs4511351 avatar Jun 08 '23 17:06 srs4511351

You have to clone sigutils with recurse submodules:

$ git clone -b develop --recurse-submodules [email protected]:BatchDrake/sigutils.git
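
For a checkout that was already cloned without submodules, the missing pieces can usually be fetched in place instead of cloning again. A sketch, assuming the missing CMake modules come from the repository's submodules; `init_submodules` is an illustrative helper name and the path in the usage line is only an example:

```shell
# Hypothetical helper: fetch all submodules of an existing checkout.
init_submodules() {
  # -C runs git inside the given directory; --init registers the submodules
  # listed in .gitmodules, and --recursive also handles nested ones.
  git -C "$1" submodule update --init --recursive
}

# Usage (not executed here):
#   init_submodules "$HOME/sigutils"
```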

BatchDrake avatar Jun 08 '23 18:06 BatchDrake

From https://github.com/antoniovazquezblanco I placed GitVersionDetect.cmake, RelativeFileMacro.cmake and PcFileGenerator.cmake in ~/sigutils/cmake/modules. It compiled and suscan also compiled.

After I did that, I saw your note about recurse submodules. Although SigDigger runs and receives signals, I still get the Segmentation fault. Should I start the installations over with recurse submodules? Do I need to go ahead and reinstall suwidgets and sigdigger?

I can get a gdb trace, but I have to install a lot of things to get gdb installed on the system I am using.

srs4511351 avatar Jun 08 '23 18:06 srs4511351

Yes, I thought you had a more recent version, my bad. The ABI changed a lot since you last built SigDigger, so you need to rebuild everything from Sigutils to SigDigger.

BatchDrake avatar Jun 08 '23 18:06 BatchDrake

I reinstalled everything for SigDigger and modules. I only used recurse submodules on sigutils.

It runs normally, but faults when I stop reception.

Thread 1 "SigDigger" received signal SIGSEGV, Segmentation fault.
0x0000007ff60ff610 in free () from /lib/aarch64-linux-gnu/libc.so.6
(gdb) bt
#0  0x0000007ff60ff610 in free () from /lib/aarch64-linux-gnu/libc.so.6
#1  0x0000007ff7cabd50 in SoapySDRStrings_clear ()
   from /usr/local/lib/libSoapySDR.so.0.8-2
#2  0x0000007ff7cac294 in SoapySDRArgInfo_clear ()
   from /usr/local/lib/libSoapySDR.so.0.8-2
#3  0x0000007ff7cac2e8 in SoapySDRArgInfoList_clear ()
   from /usr/local/lib/libSoapySDR.so.0.8-2
#4  0x0000007ff7e87c78 in suscan_source_destroy (source=0x5556ef76b0)
    at /home/dietpi/suscan/analyzer/source.c:1632
#5  0x0000007ff7e6a6c4 in suscan_local_analyzer_dtor (ptr=0x5556ab1e30)
    at /home/dietpi/suscan/analyzer/impl/local.c:904
#6  0x0000007ff7e61ea8 in suscan_analyzer_destroy (self=0x5556ef75e0)
    at /home/dietpi/suscan/analyzer/analyzer.c:671
#7  0x00000055557022a8 in Suscan::Analyzer::~Analyzer() ()
#8  0x00000055557022d8 in Suscan::Analyzer::~Analyzer() ()
#9  0x0000005555608608 in SigDigger::Application::orderedHalt() ()
#10 0x000000555560b084 in SigDigger::Application::onAnalyzerHalted() ()
#11 0x0000007ff681a608 in ?? () from /lib/aarch64-linux-gnu/libQt5Core.so.5
#12 0x0000007ff680e6cc in QObject::event(QEvent*) ()
   from /lib/aarch64-linux-gnu/libQt5Core.so.5
#13 0x0000007ff74ac0a0 in QApplicationPrivate::notify_helper(QObject*, QEvent*)
    () from /lib/aarch64-linux-gnu/libQt5Widgets.so.5
#14 0x0000007ff67dcd40 in QCoreApplication::notifyInternal2(QObject*, QEvent*)
    () from /lib/aarch64-linux-gnu/libQt5Core.so.5
#15 0x0000007ff67e00a8 in QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) () from /lib/aarch64-linux-gnu/libQt5Core.so.5
#16 0x0000007ff683f4e8 in ?? () from /lib/aarch64-linux-gnu/libQt5Core.so.5
#17 0x0000007ff4f5774c in g_main_context_dispatch ()
   from /lib/aarch64-linux-gnu/libglib-2.0.so.0
#18 0x0000007ff4f579e0 in ?? () from /lib/aarch64-linux-gnu/libglib-2.0.so.0
#19 0x0000007ff4f57a84 in g_main_context_iteration ()
   from /lib/aarch64-linux-gnu/libglib-2.0.so.0
#20 0x0000007ff683eaa8 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /lib/aarch64-linux-gnu/libQt5Core.so.5
#21 0x0000007ff67db258 in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /lib/aarch64-linux-gnu/libQt5Core.so.5
#22 0x0000007ff67e42dc in QCoreApplication::exec() ()
   from /lib/aarch64-linux-gnu/libQt5Core.so.5
#23 0x00000055556034f0 in main ()
(gdb) 

----Steve

srs4511351 avatar Jun 08 '23 19:06 srs4511351

Well, I did it how @guruofquality instructed. This only fails with SDRPlay. It fails because now I use this API to clear the argument info I previously retrieved.

BatchDrake avatar Jun 08 '23 19:06 BatchDrake

Look, let us try this (it is going to make SigDigger run extremely slowly and take a few minutes to start up, but it will generate an interesting log).

Install valgrind from the repos, and run:

$ valgrind --track-origins=yes SigDigger 2> errors.log

Repeat the process that makes SigDigger crash (even if it does not crash after all), and send the errors.log back to me.

BatchDrake avatar Jun 08 '23 19:06 BatchDrake

Perhaps the SoapySDR people can help. I mentioned that only the SDRPlay device does this, but that didn't seem to matter.

My other applications work.

----Steve

srs4511351 avatar Jun 08 '23 19:06 srs4511351

I will install the package: valgrind/testing 1:3.19.0-1 arm64 then try your instructions.

srs4511351 avatar Jun 08 '23 20:06 srs4511351

If this is a double-free error then ASAN might be able to track that and give a detailed output. A debug build should only need

    set(CMAKE_C_FLAGS_DEBUG "-ggdb -fsanitize=undefined -fsanitize=address -fno-omit-frame-pointer")

then e.g. cmake -DCMAKE_BUILD_TYPE=Debug ..

zuckschwerdt avatar Jun 08 '23 20:06 zuckschwerdt

SigDigger did not run. I got a Segmentation fault. errors.log

Correction: SigDigger did start, it displayed the splash screen before faulting.

srs4511351 avatar Jun 08 '23 20:06 srs4511351

The interesting part is right at the start

==2340== Source and destination overlap in memcpy(0xc49f570, 0x487e010, 88887341568)
==2340==    by 0x1E517C6F: memcpy (string3.h:53)
==2340==    by 0x1E517C6F: sdrplay_api_GetDevices (sdrplay_api.cpp:321)
==2340==    by 0x1E4CE097: findSDRPlay(...)

findSDRPlay() calls sdrplay_api_GetDevices() which then breaks (the len is not plausible) trying to copy a string. This is not the same error as before.
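
To see why the length is not plausible, it helps to convert it to a more familiar unit. A quick sketch using shell integer arithmetic (64-bit arithmetic is assumed, which should hold for shells on aarch64 Linux):

```shell
len=88887341568                      # byte count reported by valgrind
gib=$(( len / 1024 / 1024 / 1024 ))  # whole GiB, rounded down
echo "memcpy length is at least ${gib} GiB"
```

That is far more memory than a Raspberry Pi 4 (8 GB at most) could hold, so the length is very likely a garbage value rather than a real string size.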

zuckschwerdt avatar Jun 08 '23 20:06 zuckschwerdt

This time, the application crashed on startup. Previously, it crashed when playback was stopped, after working for a few minutes.

srs4511351 avatar Jun 08 '23 20:06 srs4511351