[Regression] Access violation when implementing native interface by managed class
Brief Description
After updating from CppSharp 0.10.2 to 1.0.0, we noticed that the semantics of virtual table substitution in the generated C# changed slightly, causing an access violation (reads of random memory) upon instantiation of a managed implementation of a native interface (this is a mouthful, so rather see the code below :).

The issue seems to originate at 09222174c1a71445e1ca2d (Made the original virtual tables static too), where the initialization timing of the VTable changed from lazy-based-on-destructorOnly to eager-regardless-of-destructorOnly:
https://github.com/mono/CppSharp/commit/09222174c1a71445e1ca2debc654f984afae47c0#diff-b771f00937690119aa0c90ea5fece77f0215bf232a7b3273d7729de184083ed8R1570
Fast-forwarding through all patches to the current version, this ultimately changes the generated code so that the full VTable is read from the instance regardless of whether it is actually present.

Should the first call to SetupVTables come from the argument-less (default) managed ctor of a managed implementation of a native interface, a crash occurs.
Used headers and generated code
//Native header (Simple pure virtual class/interface)
class EXPORT MethodDiagnosticListener
{
public:
NO_COPY_MOVE(MethodDiagnosticListener)
MethodDiagnosticListener() = default;
virtual ~MethodDiagnosticListener() = default;
virtual void HandleDiagnostics(int severity, InteropString diagnostics, Loc location) = 0;
};
//C# class implementing the native interface
public class DiagnosticsListener : MethodDiagnosticListener
{
public override void HandleDiagnostics(int severity, InteropString diagnostics, Loc location)
{ }
}
//CppSharp generated C# (stripped to relevant parts)
public unsafe abstract partial class MethodDiagnosticListener : IDisposable
{
[StructLayout(LayoutKind.Sequential, Size = 8)]
public partial struct __Internal
{
internal __IntPtr vfptr_MethodDiagnosticListener;
}
public __IntPtr __Instance { get; protected set; }
internal static readonly global::System.Collections.Concurrent.ConcurrentDictionary<IntPtr, global::TandemSharp.TandemCompiler.Debug.MethodDiagnosticListener> NativeToManagedMap = new global::System.Collections.Concurrent.ConcurrentDictionary<IntPtr, global::TandemSharp.TandemCompiler.Debug.MethodDiagnosticListener>();
protected bool __ownsNativeInstance;
// DEBUG: MethodDiagnosticListener() = default
protected MethodDiagnosticListener()
{
__Instance = Marshal.AllocHGlobal(sizeof(global::TandemSharp.TandemCompiler.Debug.MethodDiagnosticListener.__Internal));
__ownsNativeInstance = true;
NativeToManagedMap[__Instance] = this;
SetupVTables(GetType().FullName == "TandemSharp.TandemCompiler.Debug.MethodDiagnosticListener");
}
// DEBUG: virtual void HandleDiagnostics(int severity, InteropString diagnostics, Loc location) = 0
public abstract void HandleDiagnostics(int severity, global::TandemSharp.TandemCompiler.InteropString diagnostics, global::TandemSharp.TandemCompiler.Debug.MethodDiagnosticListener.Loc location);
#region Virtual table interop
// virtual ~MethodDiagnosticListener() = default
private static global::TandemSharp.Delegates.Action___IntPtr_int _dtorDelegateInstance;
private static void _dtorDelegateHook(__IntPtr __instance, int delete)
{
var __target = global::TandemSharp.TandemCompiler.Debug.MethodDiagnosticListener.__GetInstance(__instance);
__target.Dispose(disposing: true, callNativeDtor: true);
}
// void HandleDiagnostics(int severity, InteropString diagnostics, Loc location) = 0
private static global::TandemSharp.Delegates.Action___IntPtr_int_TandemSharp_TandemCompiler_InteropString___Internal___IntPtr _HandleDiagnosticsDelegateInstance;
private static void _HandleDiagnosticsDelegateHook(__IntPtr __instance, int severity, global::TandemSharp.TandemCompiler.InteropString.__Internal diagnostics, __IntPtr location)
{
var __target = global::TandemSharp.TandemCompiler.Debug.MethodDiagnosticListener.__GetInstance(__instance);
var __result2 = location != IntPtr.Zero ? global::TandemSharp.TandemCompiler.Debug.MethodDiagnosticListener.Loc.__CreateInstance(location) : default;
__target.HandleDiagnostics(severity, global::TandemSharp.TandemCompiler.InteropString.__CreateInstance(diagnostics), __result2);
}
internal static class VTableLoader
{
private static volatile bool initialized;
private static readonly IntPtr*[] ManagedVTables = new IntPtr*[1];
private static readonly IntPtr*[] ManagedVTablesDtorOnly = new IntPtr*[1];
private static readonly IntPtr[] Thunks = new IntPtr[2];
private static CppSharp.Runtime.VTables VTables;
private static readonly global::System.Collections.Generic.List<CppSharp.Runtime.SafeUnmanagedMemoryHandle>
SafeHandles = new global::System.Collections.Generic.List<CppSharp.Runtime.SafeUnmanagedMemoryHandle>();
static VTableLoader()
{
_dtorDelegateInstance += _dtorDelegateHook;
_HandleDiagnosticsDelegateInstance += _HandleDiagnosticsDelegateHook;
Thunks[0] = Marshal.GetFunctionPointerForDelegate(_dtorDelegateInstance);
Thunks[1] = Marshal.GetFunctionPointerForDelegate(_HandleDiagnosticsDelegateInstance);
}
public static CppSharp.Runtime.VTables SetupVTables(IntPtr instance, bool destructorOnly = false)
{
if (!initialized)
{
lock (ManagedVTables)
{
if (!initialized)
{
initialized = true;
VTables.Tables = new IntPtr[] { *(IntPtr*)(instance + 0) };
VTables.Methods = new Delegate[1][];
ManagedVTablesDtorOnly[0] = CppSharp.Runtime.VTables.CloneTable(SafeHandles, instance, 0, 2);
ManagedVTablesDtorOnly[0][0] = Thunks[0];
ManagedVTables[0] = CppSharp.Runtime.VTables.CloneTable(SafeHandles, instance, 0, 2);
ManagedVTables[0][0] = Thunks[0];
ManagedVTables[0][1] = Thunks[1];
VTables.Methods[0] = new Delegate[2];
}
}
}
if (destructorOnly)
{
*(IntPtr**)(instance + 0) = ManagedVTablesDtorOnly[0];
}
else
{
*(IntPtr**)(instance + 0) = ManagedVTables[0];
}
return VTables;
}
}
protected CppSharp.Runtime.VTables __vtables;
internal virtual CppSharp.Runtime.VTables __VTables
{
get {
if (__vtables.IsEmpty)
__vtables.Tables = new IntPtr[] { *(IntPtr*)(__Instance + 0) };
return __vtables;
}
set {
__vtables = value;
}
}
internal virtual void SetupVTables(bool destructorOnly = false)
{
if (__VTables.IsTransient)
__VTables = VTableLoader.SetupVTables(__Instance, destructorOnly);
}
#endregion
}
Crash occurs upon DiagnosticsListener instantiation
new DiagnosticsListener();
Unless there is some fundamental misconfiguration of the generator on our side, to my understanding this is a supported scenario broken by an unfortunate regression. I suggest the following update to the VTableLoader to address the issue:
internal static class VTableLoader
{
private static volatile IntPtr*[] ManagedVTables;
private static volatile IntPtr*[] ManagedVTablesDtorOnly;
private static readonly IntPtr[] Thunks = new IntPtr[2];
private static CppSharp.Runtime.VTables VTables;
private static readonly global::System.Collections.Generic.List<CppSharp.Runtime.SafeUnmanagedMemoryHandle>
SafeHandles = new global::System.Collections.Generic.List<CppSharp.Runtime.SafeUnmanagedMemoryHandle>();
static VTableLoader()
{
_dtorDelegateInstance += _dtorDelegateHook;
_HandleDiagnosticsDelegateInstance += _HandleDiagnosticsDelegateHook;
Thunks[0] = Marshal.GetFunctionPointerForDelegate(_dtorDelegateInstance);
Thunks[1] = Marshal.GetFunctionPointerForDelegate(_HandleDiagnosticsDelegateInstance);
}
public static CppSharp.Runtime.VTables SetupVTables(IntPtr instance, bool destructorOnly = false)
{
if (destructorOnly)
{
if(ManagedVTablesDtorOnly is null)
{
lock (SafeHandles)
{
if(ManagedVTablesDtorOnly is null)
{
IntPtr*[] vTables = CppSharp.Runtime.VTables.AllocateTable(SafeHandles, 2);
CppSharp.Runtime.VTables.CloneTable(vTables, instance, 0, 2);
vTables[0][0] = Thunks[0];
//Proper release barrier to fix race condition that's present in current code
ManagedVTablesDtorOnly = vTables;
}
}
}
*(IntPtr**)(instance + 0) = ManagedVTablesDtorOnly[0];
}
else
{
if(ManagedVTables is null)
{
lock (SafeHandles)
{
if(ManagedVTables is null)
{
IntPtr*[] vTables = CppSharp.Runtime.VTables.AllocateTable(SafeHandles, 2);
vTables[0][0] = Thunks[0];
vTables[0][1] = Thunks[1];
//Proper release barrier to fix race condition that's present in current code
ManagedVTables = vTables;
}
}
}
*(IntPtr**)(instance + 0) = ManagedVTables[0];
}
//TODO: Handle `VTables.Tables` and `VTables.Methods`, should be initialized only in `destructorOnly`, but I'm not sure about that
return VTables;
}
}
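The idea behind the proposal above can also be modelled outside of C#. The following is a minimal C++ sketch, not CppSharp code: LazyVTableLoader, dtor_thunk and handle_diagnostics_thunk are hypothetical names, and plain function pointers returning int stand in for the generated thunks. It assumes the table pointer lives in the first pointer-sized slot of the instance. Each table is built on first use and published with a release store only after it is fully populated, and only the destructorOnly path reads the table already stored in the instance.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <mutex>

// Hypothetical stand-ins for the generated thunks (Thunks[0] and Thunks[1]).
// They return distinct values so the installed slots can be told apart.
inline int dtor_thunk() { return 0; }
inline int handle_diagnostics_thunk() { return 1; }

using Slot = int (*)();
constexpr std::size_t kSlots = 2;

struct LazyVTableLoader {
    std::atomic<Slot*> managed{nullptr};
    std::atomic<Slot*> managed_dtor_only{nullptr};
    std::mutex guard;
    Slot storage[kSlots]{};
    Slot storage_dtor_only[kSlots]{};

    // Installs the matching table into the first pointer-sized slot of `instance`.
    void setup(void* instance, bool destructor_only) {
        std::atomic<Slot*>& target = destructor_only ? managed_dtor_only : managed;
        Slot* table = target.load(std::memory_order_acquire);
        if (table == nullptr) {
            std::lock_guard<std::mutex> lock(guard);
            table = target.load(std::memory_order_relaxed);
            if (table == nullptr) {
                if (destructor_only) {
                    // Only this path reads the table stored in the instance, so
                    // only this path needs the native constructor to have run.
                    Slot* native = nullptr;
                    std::memcpy(&native, instance, sizeof native);
                    std::memcpy(storage_dtor_only, native, sizeof storage_dtor_only);
                    storage_dtor_only[0] = &dtor_thunk;
                    table = storage_dtor_only;
                } else {
                    // Every slot is overridden, so the instance is never read.
                    storage[0] = &dtor_thunk;
                    storage[1] = &handle_diagnostics_thunk;
                    table = storage;
                }
                // Publish only after the table is fully populated.
                target.store(table, std::memory_order_release);
            }
        }
        std::memcpy(instance, &table, sizeof table);
    }
};
```

In this model, calling setup(instance, false) on storage whose first slot was never initialized is safe, because that path builds the table purely from the thunks.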
Used settings
Target: MSVC
OS: Windows (will reproduce on any platform)
This is our setup function:
public void Setup(Driver driver)
{
driver.ParserOptions.TargetTriple = "x86_64-pc-windows-msvc";
driver.ParserOptions.LanguageVersion = LanguageVersion.CPP17;
driver.ParserOptions.SetupMSVC();
driver.ParserOptions.EnableRTTI = true;
var options = driver.Options;
options.GeneratorKind = GeneratorKind.CSharp;
}
Stack trace
mscorlib.dll!System.Buffer.Memmove(byte* dest, byte* src, ulong len) Unknown No symbols loaded.
> TandemSharp.dll!CppSharp.Runtime.VTables.CloneTable(System.Collections.Generic.List<CppSharp.Runtime.SafeUnmanagedMemoryHandle> cache, System.IntPtr instance, int offset, int size) Line 55 C# Symbols loaded.
TandemSharp.dll!TandemSharp.TandemCompiler.Debug.MethodDiagnosticListener.VTableLoader.SetupVTables(System.IntPtr instance, bool destructorOnly) Line 570 C# Symbols loaded.
TandemSharp.dll!TandemSharp.TandemCompiler.Debug.MethodDiagnosticListener.SetupVTables(bool destructorOnly) Line 609 C# Symbols loaded.
TandemSharp.dll!TandemSharp.TandemCompiler.Debug.MethodDiagnosticListener.MethodDiagnosticListener() Line 495 C# Symbols loaded.
This report is... breathtaking. We usually suggest to users to contact our support for a quick resolution but the effort you've put in is too impressive. I have an ongoing issue to solve in a day or two and I promise yours is next.
Meanwhile: how much would you say the effort is to isolate the C++ which causes such code?
I report the bugs the same way I want them to be served to me :)
Meanwhile: how much would you say the effort is to isolate the C++ which causes such code?
I'm not sure what you're asking for. I minimized the repro to a single C++ interface, the MethodDiagnosticListener in the snippet above (the EXPORT and NO_COPY_MOVE are just a disguised dllExport and a macro that deletes the copy and move ctors/operators; those can be omitted, any virtual class will do). So, going to the absolute minimum, this should do it:
class MethodDiagnosticListener
{
public:
MethodDiagnosticListener() = default;
virtual ~MethodDiagnosticListener() = default;
virtual void HandleDiagnostics() = 0;
};
So, let me tell you what I did. I copied your C++ to our tests and added a managed subclass:
class MyClass : MethodDiagnosticListener
{
public override void HandleDiagnostics()
{
}
}
Then I called it in a test:
[Test]
public void TestVTableCrash()
{
using (var myClass = new MyClass())
{
myClass.HandleDiagnostics();
}
}
I was expecting a crash to no avail. Apparently I'm missing something so could you tell me what it is?
A second test, with RTTI enabled, failed to crash as well; more precisely, it passed too.
Can you share generated code as well, especially the VTableLoader?
Of course, here you go: https://pastebin.com/69bFtweU. This is with our master rather than 1.0.0 but I don't think we've changed the v-tables since.
Quickly diffed, it all makes sense now, the difference is here:
protected MethodDiagnosticListener()
{
__Instance = Marshal.AllocHGlobal(sizeof(global::CSharp.MethodDiagnosticListener.__Internal));
__ownsNativeInstance = true;
NativeToManagedMap[__Instance] = this;
>>>>>>>>>>>>>> __Internal.ctor(__Instance); <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
SetupVTables(GetType().FullName == "CSharp.MethodDiagnosticListener");
}
The "vanilla" generator first initializes the native VTable by calling native ctor. It will be mostly garbage (HandleDiagnostics is pure), but still valid 16B to read by VTableLoader and test passes. 1:0 for CppSharp :)
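To illustrate why that one native ctor call matters, here is a small C++ sketch under stated assumptions: Listener is a hypothetical concrete stand-in for the interface (an abstract class cannot be constructed on its own), and the check relies on the common Itanium and MSVC layouts, where the vtable pointer occupies the first pointer-sized word of such an object. It compares that word before and after the constructor runs on zeroed storage.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <new>

// Hypothetical concrete stand-in for the native interface.
struct Listener {
    virtual ~Listener() = default;
    virtual void HandleDiagnostics() {}
};

// Reads the first pointer-sized word of a block of storage.
inline std::uintptr_t first_word(const void* storage) {
    std::uintptr_t word = 0;
    std::memcpy(&word, storage, sizeof word);
    return word;
}

// Returns true when running the constructor turned a zeroed first word
// into a non-zero one, i.e. when the constructor stored a vtable pointer.
inline bool ctor_writes_vptr() {
    alignas(Listener) unsigned char storage[sizeof(Listener)] = {};
    const std::uintptr_t before = first_word(storage);
    Listener* obj = new (storage) Listener();
    const std::uintptr_t after = first_word(storage);
    obj->~Listener();
    return before == 0 && after != 0;
}
```

Without the constructor call, the first word keeps whatever the allocator left there, which is the situation the eager VTableLoader then reads from.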
Now in our use case tho, certain interfaces may be missing in the native DLL based on build configuration, while the managed side may still instantiate the managed interface implementations and happily use them on the managed side.
This leads to an EntryPointNotFoundException upon the call to the non-existent native ctor, so we figured that marking the pure virtual class as HasNonTrivialDefaultConstructor = false skips precisely this one line without any complaints on the generator side. It effectively resolves the EntryPointNotFoundException and makes us happy devs, as there is nothing we really gained from calling the native ctor anyway, until now 🤔
The pickle is, how to resolve this.
On one side, the originally proposed adjustment to SetupVTables still applies, preserves the previously working use-case, doesn't read partially initialized native VTable aaaaand fixes the lurking race condition ^^
On the other side, it's a change that may introduce a different bug 6 months down the line, and there are only so many obscure use-cases you can support and only so much you can do to protect users from shooting themselves in the foot. So I'll totally understand if you choose to stick to the current "eager" solution and we'll have to deal with it otherwise.
You've changed HasNonTrivialDefaultConstructor yourself? This can lead to any behaviour, this value is supplied by Clang. We're probably at fault too for not making the property read-only but still, it isn't free to do whatever. So I think your real problem is the EntryPointNotFoundException regardless of whether our previous lazy loading is more optimal.
Could you please try our option of CheckSymbols? It generates and builds additional C++ for such implicit symbols (among others) and then removes any members it finds no symbols for.
I can certainly try that, tho looking at the code, I'm really not sure how it should help with this use-case.
When I just do driver.Options.CheckSymbols = true, the generator reports majority of native exported symbols as missing (some reports are valid, some not, I guess I could fix this, prolly some incorrect configuration),
including the MethodDiagnosticListener ctor (so far so good)
Symbol not found: c__N_TandemCompiler_N_Debug_S_MethodDiagnosticListener_MethodDiagnosticListener
and removes it from __Internal DllImport list (good so far), tho at the same time it also removes respective user-facing facade of all removed DllImports, including the default ctor of generated MethodDiagnosticListener (not good), so instead of EntryPointNotFoundException-throwing default ctor, there is now no default ctor and tested syntax
class MyClass : MethodDiagnosticListener
{
public override void HandleDiagnostics()
{ }
}
doesn't compile anymore.
As far as I can tell, it behaves as intended, no objections there. It's just not in line with our use-case, where native interfaces (pure virtual classes) should convert to managed again as pure "interface-like" classes and behave the same regardless of whether their native counterparts are actually present in the native image shipped along with the managed binary (e.g. they should not call native methods).
Given that we're drifting away from the original regression bug report to what seems to be a new use-case/feature request at this point, I'll leave it to your own discretion whether you want to acknowledge this as a valid feature/use-case or reject it. Naturally we would prefer the first one, but your help was already tremendously useful for my understanding of how the generated code is intended to work, and with this knowledge we should be able to work around the issue, so please don't let me block you any more than you consider appropriate. I really don't want to be the one asshole that keeps you from doing some useful work today 😄
Well, in order to either acknowledge or disregard anything I have to know what it really is. So far this hasn't been an argument but rather an investigation.
I forgot to clarify that for checking of symbols to work, you need to add your original C++ libraries in addition to headers (Module.LibraryDirs/Libraries), as if linking them. In addition, please ensure you get the extra symbols as <your-lib>-symbols.<native-lib-extension> in your output dir. If you don't, we might have a bug elsewhere.
Those interface-only C++ classes that you want: well, unfortunately it can't work this way with C++, it's just not the same as C# interfaces. You actually know this yourself because you agree we need to read the entries for destructors in the v-table. So unless I've misunderstood, you just can't make it work without proper C++ support beneath, whether it's your mistake or ours.
can't make it work without proper C++ support beneath
we need to read the entries for destructors in the v-table
Do we tho? When I dissect pure virtual class (interface), as far as I can tell, unmanaged image doesn't really contain anything useful as far as managed side should be concerned, does it?
class MethodDiagnosticListener
{
public:
/*
Assigns V-Table (that CppSharp reads, but never uses for pure interfaces and overrides immediately)
=> In theory we don't need to call this, as long as VTableLoader doesn't assume VTable presence in the critical code path
*/
MethodDiagnosticListener()
{
// mov [rcx] __CompilerSpace.VTables.MethodDiagnosticListener
// ret
}
/*
No-op, if we call this during object destruction or not doesn't matter,
and in fact, it doesn't seem to be called during managed object destruction (pastebin.com/69bFtweU Line 92)
*/
virtual ~MethodDiagnosticListener()
{
// ret
}
/*
Technically doesn't even have to be in MethodDiagnosticListener VTable, well defined program can't access it anyways
=> In theory sizeof(VTable MethodDiagnosticListener) can actually be just 1x sizeof(void*) and CppSharp imho risks buffer overrun when assuming 16B (Ofc with current compilers not an issue)
*/
virtual void HandleDiagnostics() = 0;
// int 3
};
With these assumptions, we simply skipped the native ctor/dtor in our project which ensures that single managed binary runs correctly along any native image, regardless if it ships with or without native interfaces compiled in.
I forgot to clarify that for checking of symbols to work, you need to add your original C++ libraries in addition to headers (Module.LibraryDirs/Libraries), as if linking them. In addition, please ensure you get the extra symbols as <your-lib>-symbols.<native-lib-extension> in your output dir.
Yes, I do have TS-symbols.cpp present, tho if I understand the CheckSymbols correctly (and focus only on the MethodDiagnosticListener), the generator correctly fails to find MethodDiagnosticListener.ctor and removes the respective DllImport (that's good), but takes with it also MethodDiagnosticListener() (default ctor) (this is bad, tested code doesn't compile), so even when I fix the generator configuration, it won't help for this use case, right?
It's starting to sound to me we can only do what you want if we know the C++ is only used as, as you say, interfaces. However, I don't want to make conclusions yet because it's quite late here and I'm tired. Let's continue with the v-tables tomorrow.
About the symbols: as I said, we generate the symbols TS-symbols.cpp but we also auto-build them for you (TS-symbols.dll - assuming windows). If you don't, for any reason, have the latter, of course no symbols can be read and will be considered missing therefore preventing managed wrappers from generating. In case we failed to compile TS-symbols.cpp, it's our bug for which you should have an error log in your console.
Yes, the assumptions work only as long as the native class is state-less. The moment there are any fields, the native ctor becomes "non no-op" and needs to be called to initialize them.
The HasNonTrivialDefaultConstructor = false is kind of good (and already working) indicator for this. Whenever Clang or user claim "it's safe to skip the native ctor" and take the responsibility, the native call can be simply skipped.
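In standard C++ terms, the closest analogue to that flag is the std::is_trivially_default_constructible type trait. The sketch below is only an analogy and not how CppSharp or Clang computes HasNonTrivialDefaultConstructor; PlainData, WithInitializer, Polymorphic and can_skip_default_ctor are hypothetical names. It shows that both a field initializer and a virtual function make the default ctor non-trivial, which is presumably why the flag comes out as true for a pure virtual class and has to be overridden by the user who takes the responsibility.

```cpp
#include <cassert>
#include <type_traits>

struct PlainData { int value; };             // nothing to do in the default ctor
struct WithInitializer { int value = 42; };  // the default ctor must store 42
struct Polymorphic {                         // the default ctor must store the vtable pointer
    virtual void HandleDiagnostics() {}
};

// True when the language itself guarantees that default construction
// performs no work, so skipping the constructor call loses nothing.
template <typename T>
constexpr bool can_skip_default_ctor() {
    return std::is_trivially_default_constructible<T>::value;
}
```

For a stateless interface, the only work the ctor does is storing the vtable pointer, which the managed side overwrites anyway; the trait cannot express that distinction, so the decision stays with whoever flips the flag.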
Second essential part of the use-case is, that the VTableLoader must not assume presence of valid VTable for the destructorOnly = false codepath (In code it's the SetupVTables(GetType().FullName == "Nmspc.MethodDiagnosticListener") in ctor).
This was the case up until the aforementioned 0922217, but does not hold true today and is precisely the change required for this use-case to work again.
I think you mean another commit, the one to introduce the separate VTableLoader is much later. Am I correct in saying your crash is because of CloneTable?
In general, I still think it's a hack to change HasNonTrivialDefaultConstructor but it might be little effort to restore the old lazy loading with a separate branch for the destructor only.
I didn't test it, but I believe that the 0922217 is the first commit that broke the laziness, as described in the opening post. VTableLoader later just inherited and extended upon existing behavior, but that's ofc not important.
Am I correct in saying your crash is because of CloneTable?
Yes. The managed default ctor allocates a block of memory. Because the call to the native ctor is skipped, this block remains uninitialized (this includes the slot for the VTable pointer). CloneTable then tries to read the original VTable through this uninitialized slot and ends up reading whatever random 16B the slot happens to point to.
All in all, I don't think we have a bug. You just applied a hack by directly contradicting vital information about the original AST coming from Clang, which hack happened to work for some time. However, as I said, it looks like we can restore the lazy loading which would make your code work again and cost us nothing – except the time and effort to implement. We usually spare these at our own discretion unless the user requests paid support. Since your project is open source too and such an option might be unsuitable to you, you could also try fixing it yourself. I can see you already have quite a grasp of our v-table generating code so I'd be happy to guide you into preparing a PR.
I agree. When we bend the generator pipeline and shoot ourselves in the foot during the process, it's not the generator's fault :) I'll see when I manage to submit the PR for this.
It's starting to sound to me we can only do what you want if we know the C++ is only used as, as you say, interfaces. However, I don't want to make conclusions yet because it's quite late here and I'm tired. Let's continue with the v-tables tomorrow. About the symbols: as I said, we generate the symbols TS-symbols.cpp but we also auto-build them for you (TS-symbols.dll - assuming windows). If you don't, for any reason, have the latter, of course no symbols can be read and will be considered missing therefore preventing managed wrappers from generating. In case we failed to compile TS-symbols.cpp, it's our bug for which you should have an error log in your console.
QQ: Is my understanding correct that "TS-symbols.cpp" is auto-built only for Windows, and we need to build it manually for Linux?
Yes, atm that is correct.