Problems with hot reload
Godot version
master, b6223c0df0300ba2db17b5742c349f13c33f8884
godot-cpp version
master, d47758910428242169ebe59329b449edf16036e0
System information
macOS Sonoma 14.6.1, M3 Max
Issue description
I had to add this in my cmake to both my extension and godot-cpp to enable reload support. Without this, it was spamming that it detected extension was changed on disk, but was asking if reloading was supported and yes, GODOT_ENABLE_HOT_RELOAD was ON before including godot-cpp.
target_compile_definitions(godot-cpp
PRIVATE
HOT_RELOAD_ENABLED
)
Now that godot reloaded my extension, my custom node in the editor lost its properties. It is derived from Node3D and the correct type is shown when I hover that node in the scene tree, however custom properties are lost, and scripts are screaming that they can't find base methods from custom node.
Node in inspector after reload:
Any idea if is it me or something is broken, maybe it's apple specific problem?
Steps to reproduce
N/A
Minimal reproduction project
N/A
Hi, I was just attempting to update to Godot 4.3 (from 4.2.2), and I'm encountering a similar issue.
System info is:
Godot v4.3.stable - macOS 14.6.1 - Vulkan (Mobile) - integrated Apple M3 Max - Apple M3 Max (16 Threads)
And godot-cpp version is godot-4.3-stable.
On initial load of my project, everything seems to work fine. However, when I recompile the C++ library, going back to Godot elicits thousands of the same errors:
Unimplemented _get_importer_name in add-on.
Unimplemented _get_recognized_extensions in add-on.
Like the original author, I also find that all my custom derived Node or Node3D subclasses stop appearing in the inspector, and I get GDScript errors about my bound methods no longer being present. The only solution is to reload the editor (and be sure not to save anything) to resolve.
There seems to be some general problem with hot reload:
- It seems to be losing registered types (they all stop appearing in the editor)
- Overridden importer functions are lost. I do have _get_importer_name and _get_recognized_extensions implemented for my classes. Looking at the source, it appears a call to GDVIRTUAL_CALL fails after hot reload.
- Bound functions are no longer accessible in GDScript, so it seems the bind data from
_bind_methodsgets lost.
Possibly some change between 4.2.2 and 4.3 has broken hot reloading, at least on Mac? I'd expect a lot of people to be complaining if it was a more widespread issue.
Just want to confirm that I tested the update from 4.2.2 to 4.3 on Windows 10, and it seems to work OK. I am not seeing these hot reload issues on Windows.
This could be related to https://github.com/godotengine/godot/issues/90108, which does seem to be trickier on MacOS than Linux.
I am very out of my depth here, but I was looking into the problem, and here's some info that may or may not be helpful.
The issue you linked mentions dlclose not actually closing the library as likely related to "thread-local destructors that are inserted by the compiler." And elsewhere, I see that dlclose may not actually unload a library on macOS for a few reasons, such as using a macOS symbol that causes unloading to not actually unload.
Some places (like https://stackoverflow.com/a/13403745) recommended using nm to see if any symbols are being used that might cause the problem.
Using nm to compare my dylib between 4.2 and 4.3, there are A LOT of symbols of course, but most of them related to Godot. One that did jump out to me was that the 4.3 symbols list contained __tlv_bootstrap whereas the 4.2 list notably did not.
Searching for that, it is a symbol that gets linked into your library when "thread local variables" are being used - see this link.
Anyway, I don't understand if this is actually relevant or even how to test it, but I am at least seeing some concern that thread-local stuff might break unloading of dynamic libraries, and that for some reason the 4.3 version of my library is including this thread-local symbol when the 4.2 version did not.
May be completely irrelevant, but there it is!
Just to add on to my comment above, I did notice that godot-cpp appears to have added some thread_local variables in 4.3 that were not present in 4.2.x. I see them in wrapped.cpp.
I found a discussion on the Apple Developer Forums where an Apple Engineer says that using thread_local will stop a dylib from being unloaded: https://forums.developer.apple.com/forums/thread/664480
In addition to what is stated in the man page there are several other features that prevent a dylib from being unloaded. Those include declaring any Objective C or Swift classes, as well as declaring any variables to have thread_local storage in C++.
I'm curious what was the impetus for introducing the thread_local variables in godot-cpp, and is there any other way to solve that problem without using thread_local variables?
I found a discussion on the Apple Developer Forums where an Apple Engineer says that using
thread_localwill stop a dylib from being unloaded: https://forums.developer.apple.com/forums/thread/664480
Oof, that's unfortunate. If you remove the thread_local keyword from those variables, does it fix the issue?
I'm curious what was the impetus for introducing the
thread_localvariables in godot-cpp, and is there any other way to solve that problem without usingthread_localvariables?
In order to get the API we want, we have to set these global variables used in the Wrapped() constructor, rather than passing the values into the constructor as would normally be done. In a single-threaded context, this is fine - we set the variables, and then immediately call the constructor where they are used. However, in a multi-threaded context, if multiple threads were doing this at the same time, there could be some messy race conditions where the value is set by one thread, but then used by another, for example. Setting the variables to thread_local means that we actually have a different variable per thread.
It definitely is possible to do without thread_local, just more complicated. We could keep using a single set of globals, but use a mutex to prevent multiple threads from creating objects at the same time - this would be fairly simple, but could really slow things down in a multi-threaded context. Or, we could attempt to implement our own version of thread_local, which would be much more complicated, but potentially without the performance issues.
Anyway, let me know if removing the thread_local fixes it for you - if so, we'll need to come up with something!
Good point - I probably should have just tried that!
I regret to inform that removing the thread_local keywords from wrapped.hpp and wrapped.cpp does appear to fix the issue. 💀
My repro steps:
- Ran Godot 4.3 per usual, did a hot reload, got a lot of errors in the Output tab, things were broken.
- Closed Godot.
- Modified the godot-cpp Wrapped files to remove thread_local.
- Did a clean and full rebuild of the library.
- Reopened Godot 4.3.
- Made a minor code change to force hot reload.
- Going back to Godot, no errors, everything seems functional.
Speaking of reimplementing thread_local, a thought that crossed my mind was maybe having maps to hold this data, guarded by a mutex, and perhaps keying the map using a "thread id" of some sort, if such a unique ID is readily accessible.
Also, since this seems to only be an issue with hot reload in editor, perhaps the "slow" approach you mention could be used for debug builds only. Using thread_local on Mac in release builds is probably not a problem, since I don't think those libraries need hot reload support?
I regret to inform that removing the
thread_localkeywords fromwrapped.hppandwrapped.cppdoes appear to fix the issue. 💀
Thanks! I'll put together a PR that replaces thread_local on MacOS.
Also, since this seems to only be an issue with hot reload in editor, perhaps the "slow" approach you mention could be used for debug builds only. Using
thread_localon Mac in release builds is probably not a problem, since I don't think those libraries need hot reload support?
Ah, good idea! I think I will go for the "slow" approach and make it only for editor and debug builds.
I posted PR https://github.com/godotengine/godot-cpp/pull/1594 which swaps out thread_local for a mutex. Please let me know if it fixes the issue for you!
@dsnopek, sorry for the delay, but I did just try this out - and it appears to work! I can recompile the lib on Mac, go back to editor, and no more errors output - appears to work how it used to do.
Thank you for taking the time to put that together!