Channel tests failing for variant=release with GCC
Following addition of variant=debug,release to Travis CI in #50, the GCC 5.4.1 build job is failing due to failing channel tests, see https://travis-ci.org/boostorg/gil/jobs/355009262
testing.capture-output bin.v2/libs/gil/test/channel.test/gcc-5.4.1/release/channel.run
====== BEGIN OUTPUT ======
std::exception
EXIT STATUS: 1
====== END OUTPUT ======
Interestingly, the same tests built with clang 4.2.1 are passing without any failures, see https://travis-ci.org/boostorg/gil/jobs/355009263
I confirmed this issue locally, on Ubuntu with GCC 5.4.0
b2 variant=debug,release cxxflags="-Wno-unused-local-typedefs" test
...
testing.capture-output ../../bin.v2/libs/gil/test/channel.test/gcc-5.4.0/debug/channel.run
**passed** ../../bin.v2/libs/gil/test/channel.test/gcc-5.4.0/debug/channel.test
...
testing.capture-output ../../bin.v2/libs/gil/test/channel.test/gcc-5.4.0/release/channel.run
====== BEGIN OUTPUT ======
std::exception
EXIT STATUS: 1
====== END OUTPUT ======
References
Yet another issue revealed by variant=release build configuration.
Previous one was the image checksum tests, see #49
More info related to this bug in https://github.com/boostorg/gil/pull/50#issuecomment-376719681
The error occurs when compiling the channel.cpp test with g++ -O3.
A noteworthy observation: Using -Wall yields these warnings, which may be related:
../../boost/gil/channel.hpp: In function ‘void test_packed_channel_reference()’:
../../boost/gil/channel.hpp:441:79: warning: ‘*((void*)&<anonymous> +16)’ is used uninitialized in this function [-Wuninitialized]
void set_unsafe(integer_t value) const { this->set_data((this->get_data() & ~channel_mask) | (( static_cast< BitField >( value )<<FirstBit))); }
~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
../../boost/gil/channel.hpp:441:79: warning: ‘*((void*)&<anonymous> +16)’ may be used uninitialized in this function [-Wmaybe-uninitialized]
void set_unsafe(integer_t value) const { this->set_data((this->get_data() & ~channel_mask) | (( static_cast< BitField >( value )<<FirstBit))); }
~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
../../boost/gil/channel.hpp:441:79: warning: ‘*((void*)&<anonymous> +16)’ may be used uninitialized in this function [-Wmaybe-uninitialized]
void set_unsafe(integer_t value) const { this->set_data((this->get_data() & ~channel_mask) | (( static_cast< BitField >( value )<<FirstBit))); }
~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
../../boost/gil/channel.hpp: In function ‘void test_packed_dynamic_channel_reference()’:
../../boost/gil/channel.hpp:563:42: warning: ‘*((void*)&<anonymous> +32)’ is used uninitialized in this function [-Wuninitialized]
this->set_data((this->get_data() & ~channel_mask) | value<<_first_bit);
~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
which confuses me. I don't even understand which of the referenced variables / values the compiler believes to be uninitialized...
So, I made similar observation while compiling variant=release (aka -O3) for the test/pixel.cpp, but this time the very same channel.hpp:441:79 warning is giving up more details:
$ toolset=gcc-7 cxxflags="-std=c++11" variant=release test//pixel
../../boost/gil/channel.hpp:441:79: warning: ‘v2’ is used uninitialized in this function [-Wuninitialized]
void set_unsafe(integer_t value) const { this->set_data((this->get_data() & ~channel_mask) | (( static_cast< BitField >( value )<<FirstBit))); }
~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
test/pixel.cpp:91:24: note: ‘v2’ was declared here
pixel2_value_t v2(C2::_pixel);
^~
In file included from ../../boost/gil/planar_pixel_reference.hpp:29:0,
from test/pixel.cpp:20:
../../boost/gil/channel.hpp:441:79: warning: ‘pv2’ is used uninitialized in this function [-Wuninitialized]
void set_unsafe(integer_t value) const { this->set_data((this->get_data() & ~channel_mask) | (( static_cast< BitField >( value )<<FirstBit))); }
~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
../../boost/gil/channel.hpp: In function ‘void test_packed_pixel()’:
../../boost/gil/channel.hpp:410:63: warning: ‘p121’ is used uninitialized in this function [-Wuninitialized]
integer_t get() const { return integer_t((this->get_data()&channel_mask) >> FirstBit); }
~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
../../boost/gil/channel.hpp:441:79: warning: ‘<anonymous>’ is used uninitialized in this function [-Wuninitialized]
void set_unsafe(integer_t value) const { this->set_data((this->get_data() & ~channel_mask) | (( static_cast< BitField >( value )<<FirstBit))); }
~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
../../boost/gil/channel.hpp:441:79: warning: ‘<anonymous>’ is used uninitialized in this function [-Wuninitialized]
void set_unsafe(integer_t value) const { this->set_data((this->get_data() & ~channel_mask) | (( static_cast< BitField >( value )<<FirstBit))); }
~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
The p121 from the warning above, for example, is rgb121_pixel_t p121; in test/pixel.cpp. Possibly, a red herring.
Now, regarding the *((void*)&<anonymous> +16) in your warnings, interestingly there is similar report https://github.com/boostorg/python/issues/180 where the boost/python/extract.hpp:185 is
return *(T*)(
// Only do the stage2 conversion once
m_data.stage1.convertible == m_data.storage.bytes
? m_data.storage.bytes
: (rvalue_from_python_stage2)(m_source, m_data.stage1, registered<T>::converters)
);
This *(T*) rings a bell from packed_channel_reference_base member:
const bitfield_t& get_data() const { return *static_cast<const bitfield_t*>(_data_ptr); }
Hmm, aren't we having an aliasing situation here?
Finally, I don't think we can consider the channel bug as killed. See these test/channel builds for variant=release and variant=debug
- GCC 7

- clang 5.0

I'll remind that the original report https://github.com/boostorg/gil/issues/51#issue-306247078 is due to the test/channel failing with GCC 5.4.1 and passing with clang 4.2.1 :-)
We seem to be chasing some undefined behaviour, so that theory about aliasing issues seem plausible.
Indeed. It might be worthwhile trying to compile with g++ -fno-strict-aliasing -O3 to see whether that works around the failure seen with g++ -O3.
Can someone please provide a minimal reproducible code sample?
@chhenning I've been trying smaller extracts from the failing test but none seem to reproduce the problem. Seems like it's the kind of added cout and works mystery.
@stefanseefeld @chhenning I've just realised that merging of #83 (the bitsN -> <cstdint> refactoring) seems to have fixed the channel tests failures!
Looking at the history of our CircleCI workflows:
-
last CircleCI build where the test is failing for the GCC 5.01: https://circleci.com/workflow-run/42b255d9-5f6b-4023-9965-c52d605a811e
-
first CircleCI build where the test is passing on very compiler: https://circleci.com/workflow-run/0c2440f9-3367-4f96-a206-2f467df40ba9
How about that?!
Great!
Are you using any fancy compiler switches like the ones @stefanseefeld mentioned?
You call '-O3' fancy ??? :-)
On Fri, Apr 13, 2018, 18:49 chhenning, [email protected] wrote:
Great!
Are you using any fancy compiler switches like the ones @stefanseefeld https://github.com/stefanseefeld mentioned?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/boostorg/gil/issues/51#issuecomment-381278596, or mute the thread https://github.com/notifications/unsubscribe-auth/ABL411CN7mDM_ss89k9LNG71ulrCVY0sks5toSt2gaJpZM4SvOrc .
@chhenning during local testing I tried all sorts of switches, optimisation levels, with and without strict aliasing etc.
On CircleCI all builds are using the same switches as defined in our root Jamfile https://github.com/boostorg/gil/blob/develop/Jamfile
And there are two build variants: debug (those run first) and release with optimisation=speed (those run if corresponding debug pass, hence you see two walls of builds).
However, it's not quite calming to me what we're observing. If we assume there is no GCC 5.x bug, and we take into account my #83 changes did just rename the same types/aliases w/o correcting any casts, or other issues, then there still might be a bug hiding :)
:-)
This one I call fancy: -fno-strict-aliasing
@mloskot you are correct. What about the other bug?
@chhenning No good news yet on the checksum bug from me, it still is failing on VS2017 64bit release build, see the AppVeyor. Got busy with the bitsN refactoring and other things.
Not sure how relevant this issue is today, but the warning seems to indicate exactly what it says: a pixel value is used uninitialized. The tests seems to have been reorganized though so it's not easy to track now.