Player crashes during transitions when color depth is forced to 16bpp
Name of the game:
Wadanohara. I also got the crash with Pom Gets Wifi, and I believe any game using screen transitions is affected.
Player platform:
Linux : MIPS32 mipsel (little endian) and ARMel (Armv5). SDL 1.2, sound support was disabled. (Doesn't matter as i can confirm sound support does not affect the crash in any way and works fine)
Attached files
- The easyrpg_log.txt log file: easy_log.txt (Note that the log got corrupted. I had to extract its content with a hex editor)
- GDB output for crash: GDB_log.txt
- Makefile i use for MIPS32 Makefile.txt
Describe the issue in detail and how to reproduce it:
Every time a game fades in (I've tested it with Wadanohara and Pom Gets Wifi), the EasyRPG Player crashes. It is reproducible and it always crashes at the exact same place. (In the case of Wadanohara, the game fades to black in the opening. It's supposed to show the picture for the 1st chapter at the very beginning, but it crashes before it even gets there.)
Note that i could only reproduce this issue on MIPS32r1 (little endian) and ARMv5 (little endian) SoCs, operating system is Linux (2.6.31, 3.14 for MIPS32, 4.14 for ARMv5) and i'm using SDL 1.2. The same configuration, Makefile on a PC (x86_64) works just fine, including with SDL 1.2. On those platforms though, it crashes. It also crashes on QEMU-MIPS the same way, which allowed me to get a gdb output, i have attached the log as GDB_output.txt and the Makefile as Makefile.txt. (see above in Attached files)
If you need my patched buildroot toolchain for compiling it yourselves on MIPS, you can get the source here https://github.com/rs-97-cfw/buildroot
Let me know if you are interested in running the binary yourself (or need extra info, just provide me the instructions). Note that i am unable to give the SDL 2 backend a try on those platforms as it is unavailable there. (Only framebuffer is available, no hardware acceleration and SDL2 does not support fbcon)
EDIT: It could be a buildroot issue or a GCC issue or an issue with your code. (it's not exactly clear what's wrong with it)
Thanks for the report.
How much RAM does the system have? Usually everything with less than 64 MB is problematic for almost any game. Does the crash go away when you enable swap?
If that doesn't help: could you rerun with valgrind? Will run extremely slow but maybe helps finding a memory corruption.
Hello Ghabry, the platform I'm running this on is the RS-97. It's a MIPS32r1 Ingenic SoC with a hardware FPU and 128MB of DDR2 RAM. It is configured with 256MB of swap (but I suspect that swap is not working properly...) and is stuck on Linux 2.6.31.3. The ARMv5 platform, the New bittboy, is an ARMv5 SoC clocked at 533MHz (no FPU) with only 32MB of RAM, configured with 256MB of swap. (However, swap on this handheld is 1-bit only and extremely sluggish; thankfully EasyRPG does not seem to be too punishing.) The MIPS emulator has 512MB of RAM allocated and is based on the rootfs of the GCW0 and Linux 3.14. (My binary, however, is statically built with GCC 8.2 and musl, so it does not care about the system libraries.)
Also, the crash seemingly happens with and without swap. (The bittboy though, requires swap due to its RAM limitations) The RS-97 is not starving for memory when running EasyRPG.
I should have valgrind available and i will give it a try asap, but i'm not holding my hopes for it...
Will also do a run with my PSP (also MIPS, also LE, also SDL1.2). Last time I tried it, it worked, so we might have broken things in the meanwhile.
Unfortunately I can't use Valgrind at the moment on those devices. But I wanted to add that I updated to GCC 8.3 (from 8.2) and it still crashes, so that doesn't address our issue. I think my settings when I built the toolchain were too aggressive and could have broken libpixman, which EasyRPG relies on. I'mma try compiling pixman with -O0 (especially since GDB says it's the culprit) and I will let you know. EDIT: This was sadly not successful, so what I did instead was to use uclibc, and the results are a bit different. Instead of crashing in malloc like it would with musl-libc, it seemingly freezes. Honestly though that's not much better...
@gameblabla
Instead of using valgrind you can try to build with address sanitizer. This runs at full speed and detects memory corruption bugs.
You need to rebuild from scratch with -fsanitize=address in your build and link flags.
In cmake you can add -DCMAKE_CXX_FLAGS="-fsanitize=address" -DCMAKE_EXE_LINKER_FLAGS="-fsanitize=address". In your makefile you should be able to just add it to CFLAGS and LDFLAGS. I do all my local testing on linux with this enabled.
Yo i'm back. Sadly Address sanitizer doesn't seem to be available for the MIPS target... : ( So i had to compile a working build of Valgrind, which i managed to do after a few months or so.
The executable was so huge i had to cut on the dependencies so i could run it somewhat decently (it still crashes as usual though).
My first run with Valgrind gave me this log.txt
Not too useful sadly, so I reran it with --track-origins=yes: log_run_2.txt. This time, however, it got stuck on the unbin/malloc bit and I had to turn off my console.
Doesn't seem like it's very useful sadly...
This was tested against Player commit 96c5e721725dea641ffda7e18d48a127df2448ab and liblcf commit b23556194fef2cdbe3962631b708e58e10072f4c
EDIT Ran it a 3rd time with -s and it did not crash this time, here's the log log_run_3.txt
I also redid it with --leak-check=full and --show-leak-kinds=all. Unfortunately, it froze before the crash again this time: log_run_4.txt
This is a bit more comprehensible this time, hopefully it helps you guys. Mind you that i had to compile the executable with -Os -g1.
Are you sure you're not running out of memory? The original crash is in malloc() so it looks like it could be this.
You could be right in your assumption because i just tried compiling a new SDL 1.2 build on my RG-350 and it ran just fine ! (that machine has 512MB of DDR2 RAM) But if true, it would mean that transitions would consume more than 128MB of RAM... I did switch back to GCC 7.5 though because GCC 8+ had a lot of issues in my experience. I guess i should try it for the RS-97 again and report back.
Very unlikely transitions using that much memory alone.
However to verify, I think you could add a return statement to the beginning of Transition::Init() to completely disable transitions.
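A minimal sketch of that experiment, using a hypothetical stand-in class rather than the Player's real Transition (whose members and Init() signature are not reproduced here). The names `TransitionSketch`, `frames_left` and `IsActive` are made up for illustration; the only point is where the early return goes.

```cpp
#include <cassert>

// Hypothetical stand-in for the Player's Transition class. The real class
// has a different interface; only the early-return idea is illustrated.
struct TransitionSketch {
    int frames_left = 0;  // > 0 while a fade would be running

    void Init(int duration_frames) {
        assert(duration_frames >= 0);  // sanity check on the argument

        return;  // experiment: bail out first so no transition is ever set up

        // Unreachable while the experiment is active.
        frames_left = duration_frames;
    }

    bool IsActive() const { return frames_left > 0; }
};
```

If the crash disappears with the early return in place, the transition code (or something it allocates) is a likely suspect; if it still crashes, the problem is elsewhere.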
I should try your hack later for the RS-97/Bittboy. Just wanted to say that EasyRPG works fine on the GKD350H and that thing only has 128MB of DDR2 RAM. (although granted, it has 256MB of swap memory but it runs much better than on the RG-350 so i don't think we are even hitting the barrier there...)
We just recently merged a bunch of improvements to transitions which also may have reduced memory usage. You could try again with the continuous build. Let us know how it goes.
So i just tried it with GCC 7.5 and latest master commit (i had to use my own Makefile as i have noticed you have recently removed it) and it still crashes on the LDK with its JZ4760B. Same issue, same exact place where it crashes and yadada.
Based on the valgrind reports, there seems to be a problem with the musl support in valgrind. Maybe the malloc is not properly supported, so the output is not really helpful. It shows uninitialized values caused by a heap expansion...
What would be worth a check is using a distribution that uses musl C-Runtime (Alpine Linux) and testing if the Player works there to exclude some musl incompatibility.
I haven't tried on x86_64 musl (something like Void linux) but i did try it on the GKD350H as i said earlier and that was with using musl too and it worked there (the architecture there was mips32r2, and not mips32r1. That's basically the only difference). It's possible that there's a regression with GCC 7.x that was fixed on a later GCC release that i had used for the GKD350H (it was GCC 9.2 or 9.3 i believe). Once GCC 10.2 is released, i'll give that a try and see if it fixes stuff or not.
(I realized that i should have said which commit it was when i did the testing... oh well)
@gameblabla can you point me to the toolchain you are using? There are currently a lot of opendingux forks that target different devices unfortunately.
@carstene1ns It's this one i use for MIPS32r1 JZ4740/JZ4760 dingux devices : https://github.com/rs-97-cfw/buildroot
Note that my toolchain is fully statically linked, which means no support for SDL2 (which runs too slowly on these devices anyway), so you have to use SDL 1.2. There's also a bunch of hacks to compile everything with -mno-abicalls and no PIC/PIE.
There's also an issue with buildroot & mpg123 : it's not compiled properly.
I have to compile it manually with
./configure --host=mipsel-linux --target=mipsel-linux --disable-shared --enable-static --prefix=/opt/rs97-toolchain/mipsel-buildroot-linux-musl/sysroot/usr
(while exporting the bin folder of my toolchain, something like export PATH=/opt/rs97-toolchain/usr/bin:$PATH)
Btw, this still works with a custom Makefile and a toolchain that has GCC 7.5 or higher (like mine with SDL 1.2 only : https://github.com/gcw0/buildroot-static or this one https://github.com/WerWolv/rg350_buildroot_gcc10). EasyRPG compiled with such a toolchain works on the GKD350H. (ignoring other regressions unrelated to it, of course)
However, GCC 8+ is known to be troublesome when targeting MIPS32r1 instead, not MIPS32r2. (i encountered multiple crashes in other apps) So i guess this is something that i should try next on my LDK with a JZ4760/MIPS32r1 but at least it still works on MIPS32r2 devices.
Ok so some progress !
- The opendingux team released a new MIPS32r2 toolchain along with a new beta opendingux firmware. Both can be found here : https://app.box.com/s/v63a9ao8ppm2cc453q7iqyna04c62arn https://app.box.com/s/0yistneti2vewi9aa6ttwog3v4pfx6kv
I know you guys complained about the lack of a recent toolchain for the GCW0/RG-350 so this should fix it.
- I have updated my statically linked toolchain for the RS-97 with GCC 10.2 and musl libc 1.2.1. I also enabled a workaround due to an FPU bug on the JZ4760. Now it can get past the transition, although the transition itself is very... glitchy looking. https://youtu.be/Pxq1r2c8_rA
I should probably run it via valgrind as well but yeah, still crashing : ( I hope the OD team can also port the firmware to the LDK/RS-97 so i can rule out a kernel issue. I should try it on the bittboy again too.
I compiled it again for the bittboy/pocket and same exact issue as on the RS-97... It now transitions but it looks glitchy and then it crashes, just like on the LDK/RS-97. The bittboy uses an ARMv5 soc with no FPU while the LDK/RS-97 is a MIPS32r1 SoC with a hardware FPU (that has some bugs but i enabled a compiler workaround to avoid it). Obviously there's something with both in common that causes both to crash as both are using the same versions for the libraries and everything, down to the compiler and libc... I guess that means though, that i will have to use valgrind on it. The pain
But at least it works fine on MIPS32r2 now.
Hm maybe the FPU gives a hint? Can you try different softfp options
Slowly sounds like some CPU/compiler bug we hit here
There aren't any other softfp options for ARM, sadly. We might be hitting a compiler bug indeed (I encountered plenty of those already). Other than running valgrind or GDB again, I'm not sure how to proceed from here.
I was able to reproduce the issue on my GCW0 by sheer chance... https://github.com/EasyRPG/Player/blob/master/src/sdl_ui.cpp#L310
This causes the issue. Why ? On the GCW0 with a certain SDL version (the upstream version provided with the toolchain also wouldn't work anymore so took me a while to figure it out and i had to use my statically linked toolchain with GCC 7.5/musl), this would i believe return 32bpp instead of 16bpp. When it is forced to 16bpp, the same glitches that happened on the LDK/Bittboy, also happen on that platform. So setting bpp to 16 will also trigger it.
The RetroFW platforms only support 16bpp for performance reasons and return 16bpp as a result. Bittboy is similar except that it only supports that mode so it also returns 16bpp.
I would need to try it with the newer toolchain just to make sure and force 32bpp on that one but so far it seems to be the culprit, some kind of a buffer overflow.
Yup, forcing 32bpp fixes EasyRPG... If you force 16bpp instead, the issue will happen. I just tried it with the upstream toolchain and was able to make it work on that too. However forcing 32bpp on the LDK or Bittboy isn't desirable nor possible on the latter without converting between buffers, which would be a huge waste of performance.
Great find, could you confirm if this happens when using 16 bpp in SDL 2.0 if selected ports support it?
This doesn't happen with SDL2. Looking at the code, it's not hard to see why : it seems that most of the code assumes a bitdepth of 32 and SDL2 does, by default, init a 32bpp window and there's no easy way to change from that unlike SDL 1.2.
I also attempted to use the 8bpp mode in the SDL 1.2 backend given that the new beta OpenDingux firmware supports 8bpp paletted surfaces (SDL_HWPALETTE | SDL_HWSURFACE) but it seems to just crash for now.
I suspect that the culprit might be Bitmap::Create in SdlUi::RefreshDisplayMode(). At least we now know that it is not MIPS32 or ARMv5 specific but 16bpp specific.
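To make the suspected overflow concrete, here is a minimal sketch of the arithmetic involved. It is not the Player's actual code: the helper names are hypothetical, and it assumes a 320x240 screen with tightly packed rows. It only shows that code writing 4 bytes per pixel needs twice the bytes a 16bpp surface provides.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for a tightly packed width x height buffer at the given depth.
// Hypothetical helper for illustration; not part of SDL or the Player.
inline std::size_t BufferBytes(std::size_t width, std::size_t height,
                               unsigned bits_per_pixel) {
    assert(bits_per_pixel % 8 == 0);  // only whole-byte pixel formats
    return width * height * (bits_per_pixel / 8);
}

// Bytes that end up past the end of a buffer allocated for `actual_bpp`
// when every pixel is written as if it were `assumed_bpp` wide.
inline std::size_t OverrunBytes(std::size_t width, std::size_t height,
                                unsigned actual_bpp, unsigned assumed_bpp) {
    const std::size_t allocated = BufferBytes(width, height, actual_bpp);
    const std::size_t written = BufferBytes(width, height, assumed_bpp);
    return written > allocated ? written - allocated : 0;
}
```

Under these assumptions, a 320x240 surface at 16bpp holds 153600 bytes while 32bpp writes cover 307200 bytes, so 153600 bytes would land outside the buffer. That kind of heap damage would be consistent with a later crash inside malloc, though this is only a guess at the mechanism.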
Ooooh, a 16bit platform. Didn't think of this! Great find.
Our 16bit code paths are bit-rotting. Actually you MUST use 32bit for our Player and convert to 16bit for the output buffer. Hard to believe but it is actually FASTER than doing 16bit in Player because 16bit blits are much slower than 32bit blits.
Yeah, because I would assume that pixman would still do calculations in 32bpp internally before converting to RGB565. So I guess that doing it in one go with SDL would be faster than that. That said though, I looked at bitmap.cpp and to me it seems like a lot of it could just use SDL directly, as it even has a stretching function (undocumented: SDL_SoftStretch. It doesn't work if the incoming bpp isn't the same as the output, but otherwise it works faster for 16bpp than SDL_gfx, as SDL_gfx lacks a 16bpp path).
Perhaps in the future it would be worth considering targets to use their own bitmap.cpp replacement ? Because rendering is a huge bottleneck on low end platforms like the JZ4760 or JZ4770.... : /
That said though, this feature request doesn't apply to this bug, so in the meantime we need to create a virtual screen surface in case the bpp is not 32:
SDL_CreateRGBSurface(SDL_HWSURFACE, width, height, sdl_screen->format->BitsPerPixel, rmask, gmask, bmask, amask);
Or something like that and then do a
SDL_BlitSurface(virtual_scr, NULL, sdl_screen, NULL);
The issue of course, is that the RS-97/LDK toolchain use a hack to force 16bpp seemingly. The hack can be removed but we could always do it manually with something like
static uint16_t rgba888Torgb565(uint32_t s)
{
    unsigned alpha = s >> 27; /* downscale alpha to 5 bits */
    /* Opaque pixel: pack the 8-bit R, G and B channels into 5, 6 and 5 bits */
    if (alpha == (SDL_ALPHA_OPAQUE >> 3)) return (uint16_t) ((s >> 8 & 0xf800) + (s >> 5 & 0x7e0) + (s >> 3 & 0x1f));
    return s = ((s & 0xfc00) << 11) + (s >> 8 & 0xf800) + (s >> 3 & 0x1f);
}

if (SDL_LockSurface(sdl_screen) == 0)
{
    /* Only touch the pixels while the surface is locked */
    uint16_t* output = (uint16_t*) sdl_screen->pixels;
    for (uint32_t i = 0; i < (window_width * window_height); i += 1) *output++ = rgba888Torgb565(gfx_output[i]);
    SDL_UnlockSurface(sdl_screen);
}
Or we could use both, but use this loop for RGB565 and SDL_BlitSurface for platforms that aren't broken. I might look into removing the 16bpp hack and see if it fixes things.
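One detail the loop above glosses over is that a surface's rows may be padded, so the pitch (bytes per row) is not necessarily width x 2. Below is a minimal pitch-aware sketch, kept independent of SDL so it can be tried on a PC. The function names and the 0xAARRGGBB source layout are assumptions for illustration, and alpha is simply dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pack one 0xAARRGGBB pixel into RGB565, discarding alpha.
inline std::uint16_t Argb8888ToRgb565(std::uint32_t p) {
    return static_cast<std::uint16_t>(
        ((p >> 8) & 0xF800) | ((p >> 5) & 0x07E0) | ((p >> 3) & 0x001F));
}

// Convert a width x height ARGB8888 image to RGB565, row by row.
// Both pitches are in bytes (the same unit SDL_Surface::pitch uses).
inline void ConvertToRgb565(const std::uint32_t* src, std::size_t src_pitch,
                            std::uint16_t* dst, std::size_t dst_pitch,
                            std::size_t width, std::size_t height) {
    assert(src != nullptr && dst != nullptr);
    assert(src_pitch >= width * sizeof(std::uint32_t));
    assert(dst_pitch >= width * sizeof(std::uint16_t));

    for (std::size_t y = 0; y < height; ++y) {
        // Step by pitch, not by width, so row padding is skipped.
        const auto* s = reinterpret_cast<const std::uint32_t*>(
            reinterpret_cast<const std::uint8_t*>(src) + y * src_pitch);
        auto* d = reinterpret_cast<std::uint16_t*>(
            reinterpret_cast<std::uint8_t*>(dst) + y * dst_pitch);
        for (std::size_t x = 0; x < width; ++x) {
            d[x] = Argb8888ToRgb565(s[x]);
        }
    }
}
```

With SDL 1.2, the destination would presumably be `sdl_screen->pixels` and `sdl_screen->pitch` between SDL_LockSurface and SDL_UnlockSurface, but that wiring is untested here.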
EDIT: I'm starting to think it might be a better idea to just let pixman render at 32bpp and convert directly the pixman ARGB surface to RGB565... Though unfortunately it doesn't seem that the code assumes as much. (or maybe it does but it's just slow at doing it ?)
Sooo I retried it on the LDK (where it also crashes) and this time I created a 32bpp buffer internally (thanks to SDL_CreateRGBSurface) and converted it to RGB565 with a simple function. Unfortunately, it still crashes in the same way as it did before, unlike the GCW0 port, which only crashed if forced to 16bpp...
Btw, i had also reverted the patch to my toolchain that forced it to 16bpp so it couldn't be that either.
Maybe it's something else ? I don't know but it's certainly within the functions that use pixman again... ;/
Well, the way to go would be an OpenGL bitmap. Legacy GL in particular would be useful because the PS Vita supports it, but I'm not sure what your devices can do. Only GL ES?
The devices that we are talking about (LDK/RG-300/RS-97, Bittboy/PocketGo v1) do not have a dedicated or integrated GPU, and therefore don't support 3D acceleration. You just have access to the screen's framebuffer (RGB565) and that's it.