Unit tests for some env option functions
Closes T493.
Build failure is just fedora rawhide RPM
The related PRs have been merged, this can be rebased and updated
@clumens , could you review this one when you get a chance? Thanks
Updated to address review.
- Check
snprintf()for negative return value and respond as follows (https://github.com/ClusterLabs/pacemaker/pull/2801#discussion_r949286427):- Log an error message.
- Return
NULLfrompcmk__env_option(). - Return from
pcmk__set_env_option(). This function should probably be updated to return a standard Pacemaker return code, and its callers (at least some) should check the return code. That's outside the scope of this PR.
- Mock
snprintf()and include tests for the above cases. - Simplify tests
disabled_null()test forpcmk__env_option_enabled(). Only use one input condition that causespcmk__env_option()to returnNULL. (https://github.com/ClusterLabs/pacemaker/pull/2801#discussion_r949292623) - Use
PCMK__NELEM()in loop control. (https://github.com/ClusterLabs/pacemaker/pull/2801#discussion_r949339751) - Use separate mock booleans for
getenv(),setenv(),unsetenv(). (https://github.com/ClusterLabs/pacemaker/pull/2801#discussion_r949354714) - Allow mocked
setenv()andunsetenv()to seterrnobased onwill_return(). (https://github.com/ClusterLabs/pacemaker/pull/2801#discussion_r949356975)
Really strange test errors... in both Ubuntus and in F36 Power9:
-
snprintf_error()inpcmk__env_option_test.cresults in a call togetenv(), even thoughpcmk__env_option()returns before callinggetenv()whensnprintf()returns a negative value.
19:59:04 not ok 2 - snprintf_error
19:59:04 FAIL: pcmk__env_option_test 2 - snprintf_error
19:59:04 # No entries for symbol __wrap_getenv.
19:59:04 # mock.c:90: error: Could not get value to check parameter name of function __wrap_getenv
19:59:04 # There were no previously declared parameter values for this test.
-
snprintf_error()inpcmk__set_env_option_test.cdoes not result in any call tosnprintf(), even thoughpcmk__set_env_option()explicitly callssnprintf().
19:59:04 FAIL: pcmk__set_env_option_test 2 - snprintf_error
19:59:04 # %s() has remaining non-returned values.
19:59:04 # pcmk__set_env_option_test.c:49: note: remaining item was declared here
19:59:04 # __wrap_snprintf: '%s' parameter still has values that haven't been checked.
19:59:04 # pcmk__set_env_option_test.c:46: note: remaining item was declared here
19:59:04 # '%s' parameter still has values that haven't been checked.
19:59:04 # pcmk__set_env_option_test.c:47: note: remaining item was declared here
19:59:04 # '%s' parameter still has values that haven't been checked.
19:59:04 # pcmk__set_env_option_test.c:48: note: remaining item was declared here
All the other distros work fine. It's not an endianness issue, since the affected systems are x86_64. Things I've tried:
- Removing
restrictfrom the parameters in the__wrap_snprintf()signature, to make sure it doesn't mess with the calls tovsnprintf(). - Set up an Ubuntu 22.04 LTS VM and confirmed that
__wrap_snprintf()never gets called. Apparently it's ignoring the wrapper completely. - I found this, but the solution doesn't help; I'm already passing arguments after the format string: https://stackoverflow.com/a/39189436/7660197
Edit: It is a compiler optimization issue... setting -O0 in CFLAGS makes it go away. It looks like on Ubuntu, gcc is replacing snprint() with __builtin___snprintf_chk(). That's fine and dandy except that -fno-builtin is not making it go away like it should. I haven't found any answers yet. Ubuntu uses GCC v11 and Fedora 36 uses GCC v12. I don't know what versions our other test machines use but I'm sure the EL7 machines are on older versions. I reached out to the gcc-help mailing list. We could always skip testing that code path for now, but we need to test it eventually, which will require mocking snprintf().
Edit: Adding -U_FORTIFY_SOURCE to CFLAGS fixed the Ubuntu issue. Added a commit to this PR to do that. F36 Power9 LE is still failing with the same errors listed above, and I don't have a reproducer system for that currently.
I got my hands on F36 ppc64le. This change to add 128-bit floating point is what breaks the "snprintf_error" unit tests there:
https://fedoraproject.org/wiki/Changes/PPC64LE_Float128_Transition
Running gdb on a minimal reproducer:
Breakpoint 1, func (s=s@entry=0x10020020 "hello world") at test.c:5
5 snprintf(buf, 16, "%s", s);
(gdb) s
0x00007ffff7d54a98 in ___ieee128_snprintf (s=0x7fffffffeac0 "\360\352\377\377\377\177", maxlen=16, format=0x10020010 "%s") at ../sysdeps/ieee754/ldbl-128ibm-compat/ieee128-snprintf.c:24
Special cases and hidden overrides like this may keep popping up for snprintf() in the future. Maybe use a different approach for mocking stuff like Xprintf?
I think I should remove the snprintf() mocking code and the two unit tests that use it tomorrow, and postpone that to another day. I've come up with some solutions but I need to play with it a little more. Everything feels kind of hacky.
Fabio pointed out another approach that was used in libqb that might work: https://github.com/ClusterLabs/libqb/pull/231/commits/f610b1b1. Note _failure_injection.c. I haven't tried it yet.
Removed the -U_FORTIFY_SOURCE and __wrap_snprintf() commits from this PR. Mocking snprintf() will require a different approach or a hack, so it's better addressed separately. I also removed the snprintf_error() test cases from the pcmk__env_option() and pcmk__set_env_option() unit test files. We can add those back when snprintf() is taken care of.
As of right now, __wrap_setenv() and __wrap_unsetenv() still don't call __real_X, based on my comment in https://github.com/ClusterLabs/pacemaker/pull/2801#discussion_r951846421 about symmetry and the lack of need to use the real environment (we have a mocked getenv() we can use). If either you or @clumens feels strongly about calling __real_X, I can easily update both __wrap_X functions to make that call.