ODR problem with boost iostreams
We are about to use boost iostreams for the first time in our code base, and we are running into a nasty ODR problem with ni-media. The issue is that ni-media redefines BOOST_IOSTREAMS_BASIC_STREAMBUF; this causes the boost::iostreams::detail::linked_streambuf class to have a different size for compilation units that include ni-media headers vs those that don't. This is an ODR violation. When creating a Windows build with Link Time Code Generation turned on (/LTCG), this causes the following linker warning:
LINK : warning C4743: 'boost::iostreams::detail::linked_streambuf<char,struct std::char_traits<char> >::`RTTI Base Class Array'' has different size in '<one-of-our-non-ni-media-cpp-files>' and 'f:[...]\ni-media\dist\audiostream\src\ni\media\audio\ifstream.cpp': 20 and 28 bytes
We might be able to work around this by patching boost, but it would be preferable to fix it in ni-media in a way that no redefinition of BOOST_IOSTREAMS_BASIC_STREAMBUF is necessary. The reason why it is redefined is only for the friend class istream declaration, which in turn is only needed so that istream::operator>> can use protected methods such as underflow, gptr, and egptr in its implementation. This was introduced in 124bbf07d1; unfortunately, it's hard to tell whether that change was made to fix a defect, or to improve performance, or both.
We briefly looked into reimplementing the code without access to those protected methods, but failed because we don't understand the code well enough. For example, we didn't understand why this assertion is guaranteed to hold after a call to underflow.
@marcrambo Any thoughts?
We might be able to work around this by patching boost, but it would be preferable to fix it in ni-media in a way that no redefinition of BOOST_IOSTREAMS_BASIC_STREAMBUF is necessary.
Yeah this redefinition is pretty ugly. It was meant as a temporary workaround and was added in the hot phase of ni-media's initial release.
This was introduced in https://github.com/NativeInstruments/ni-media/commit/124bbf07d1db3a8753386c3844942f80877d1c37; unfortunately, it's hard to tell whether that change was made to fix a defect, or to improve performance, or both.
The change was a pcm conversion performance optimization and IIRC also fixed an EOF handling problem. Previously the pcm::iterator wrapped an std::istreambuf_iterator which is a single pass input iterator. The pcm::iterator specialization for that iterator category performs poorly (by nature). By accessing protected underflow, gptr, and egptr we can tap into the streambuf's internal buffer, and use a random access iterator.
We briefly looked into reimplementing the code without access to those protected methods, but failed because we don't understand the code well enough. For example, we didn't understand why this assertion is guaranteed to hold after a call to underflow.
This is because we always create a streambuf with a frame aligned buffer ( 4096 frames by default) https://github.com/NativeInstruments/ni-media/blob/e3074cac962ad320495cdeaf1ddd36aa6e8f3500/audiostream/src/ni/media/iostreams/stream_buffer.h#L97
@marcrambo Any thoughts?
Unfortunately I don't see a simple solution to get rid of the redefinition. I think this would require some deeper changes.
Perhaps it's worth investigating why the linker complains exactly, because we only add the friend declaration so it seems the layout shouldn't change. https://github.com/NativeInstruments/ni-media/blob/e3074cac962ad320495cdeaf1ddd36aa6e8f3500/audiostream/inc/ni/media/audio/streambuf.h#L33
Perhaps it's worth investigating why the linker complains exactly, because we only add the friend declaration so it seems the layout shouldn't change.
Yes, the layout of the class itself doesn't change, but some of its associated RTTI data does, specifically the "RTTI Base Class Array". Which makes sense, because you are introducing an extra inheritance step.
I don't know what this RTTI Base Class Array is used for (I can only guess, things like dynamic downcasting probably), and I suspect it may not actually have any bad consequences in practice. But you never know (we have learned from experience to take any kind of ODR problem very seriously). And it prevents us from making LTCG builds on Windows.
Yes, the layout of the class itself doesn't change, but some of its associated RTTI data does, specifically the "RTTI Base Class Array". Which makes sense, because you are introducing an extra inheritance step.
Ah I see, the linker only complains about the RTTI discrepancy. Yeah, that makes sense.
Have you tried disabling LTCG for ni-media's ifstream.cpp only? Perhaps something like this could work:
set_source_files_properties(ifstream.cpp PROPERTIES COMPILE_FLAGS /LTCG:OFF)
Have you tried disabling LTCG for ni-media's ifstream.cpp only?
No, we haven't tried that. We aren't using CMake, and I don't think the build system we are using allows per-file compiler flags. But also, I don't think this would help, because unless I'm mistaken the problem not only exists with ifstream.cpp, but with any translation unit that includes an ni-media header.
But also, I don't think this would help, because unless I'm mistaken the problem not only exists with ifstream.cpp, but with any translation unit that includes an ni-media header.
stream_buffer.h where the redefinition happens is not part of ni-media's public interface. We don't export that header during installation so external projects cannot include it. So including any public ni-media header should be fine as the redefined BOOST_IOSTREAMS_BASIC_STREAMBUF is never exposed.
From my understanding the problem occurs when linking an executable (or library) with ni-media. When LTCG is enabled in both targets the linker will see the same symbol, and complain about the discrepancy. Disabling LTCG in ni-media as a whole or only in the affected source files should fix it (i.e. https://github.com/search?q=repo%3ANativeInstruments%2Fni-media+make_stream_buffer&type=code) This way you could still continue using LTCG in the rest of your project.
stream_buffer.hwhere the redefinition happens is not part of ni-media's public interface. We don't export that header during installation so external projects cannot include it. So including any public ni-media header should be fine as the redefined BOOST_IOSTREAMS_BASIC_STREAMBUF is never exposed.
You are right, I misread the code. But still:
From my understanding the problem occurs when linking an executable (or library) with ni-media. When LTCG is enabled in both targets the linker will see the same symbol, and complain about the discrepancy. Disabling LTCG in ni-media as a whole or only in the affected source files should fix it
No, it only fixes the warning, but not the ODR problem itself; in other words, it would only be a solution if we assume that the ODR problem is harmless, which isn't entirely clear. Without LTCG, the linker will arbitrarily pick one or the other instantiation, which will be the wrong one for half of the translation units. With LTCG, the same thing happens, but now the linker is able to warn about it. At least that's my understanding of the issue.
I suppose another way to solve it would be to build ni-media as a DLL, but we'd like to avoid that if possible.
I don't think there is an ODR violation in ni-media. By redefining BOOST_IOSTREAMS_BASIC_STREAMBUF we inject our own streambuf class. While ugly I believe this is perfectly legal as the class is only defined once and we use it consistently throughout ni-media. Without this trick we would have to patch boost (which would be uglier).
We don't expose any of this to the outside world. We actually don't even expose any of boost to the outside world, as we link boost privately: https://github.com/NativeInstruments/ni-media/blob/e3074cac962ad320495cdeaf1ddd36aa6e8f3500/audiostream/CMakeLists.txt#L285.
I don't know the details of your build system but it seems that you are building your internal modules and 3rd party modules like ni-media using LTCG. From my understanding whole program optimization will treat your codebase as if there was only one module. So if two different modules define a class with the same name but with different types, there is a problem. This is something you can easily prevent by putting everything in namespaces in your internal codebase, but you have no control over 3rd party libraries. By linking external libraries (especially C libraries) with LTCG you increase the risk of name collisions, which is exactly what happens here.
I don't think there is an ODR violation in ni-media.
Not within ni-media itself, on its own, that's true. But you are making it impossible for an application to link ni-media statically AND use boost iostreams in other parts of the application; this will cause ODR violations, and there's nothing the application can do about it except patch boost, or link ni-media as a DLL.
We don't expose any of this to the outside world. We actually don't even expose any of boost to the outside world, as we link boost privately
This is where you are mistaken. Marking the boost dependency as PRIVATE only affects compilation (i.e. I don't have to make the boost header search paths available to my translation units in order to compile code that uses ni-media), but when linking statically the dependency is still very much treated as public. Please read this explanation, it explains this quite well.
By linking external libraries (especially C libraries) with LTCG you increase the risk of name collisions,
No, we are not talking about a name collision here, and using namespaces has nothing to do with it. We are talking about the same symbol in the same namespaces being compiled in two different ways in different translation units; this IS an ODR violation, and LTCG has nothing to do with it. The only reason that LTCG comes up in this context is that turning on LTCG enables the linker to detect the problem and warn about it. It makes no difference to the actual consequences of the ODR problem itself though, which is that the linker arbitrarily picks one or the other of the two conflicting instantiations in a static build.
You are right. Seeing no ODR violation in ni-media itself, I jumped to conclusions too quickly thinking that it must be ni-media built with /LTCG that is leaking that (supposedly) internal definition.
So, even though BOOST_IOSTREAMS_BASIC_STREAMBUF is a template class and is defined in the header only part of boost::iostreams it is not OK to modify it. As soon as that class gets instantiated in a translation unit in ni-media (ifstream.cpp etc) it will end up in the static library. Even if we never expose that modified definition in a public ni-media header, another target will see it when linking ni-media statically.
So if a translation unit uses ni-media :
-
but doesn’t include
boost/iostreams/stream_buffer.hthe ni-media definition will be used. - and includes the
boost/iostreams/stream_buffer.hthe boost definition will be used. The class is already fully defined by including the boost header, so the linker will not go look elsewhere.
And that is an ODR violation.