pacemaker icon indicating copy to clipboard operation
pacemaker copied to clipboard

Unit tests for some env option functions

Open nrwahl2 opened this issue 3 years ago • 3 comments

Closes T493.

nrwahl2 avatar Aug 15 '22 10:08 nrwahl2

Build failure is just fedora rawhide RPM

nrwahl2 avatar Aug 15 '22 21:08 nrwahl2

The related PRs have been merged, this can be rebased and updated

kgaillot avatar Aug 17 '22 20:08 kgaillot

@clumens , could you review this one when you get a chance? Thanks

kgaillot avatar Aug 18 '22 15:08 kgaillot

Updated to address review.

  • Check snprintf() for negative return value and respond as follows (https://github.com/ClusterLabs/pacemaker/pull/2801#discussion_r949286427):
    • Log an error message.
    • Return NULL from pcmk__env_option().
    • Return from pcmk__set_env_option(). This function should probably be updated to return a standard Pacemaker return code, and its callers (at least some) should check the return code. That's outside the scope of this PR.
  • Mock snprintf() and include tests for the above cases.
  • Simplify tests disabled_null() test for pcmk__env_option_enabled(). Only use one input condition that causes pcmk__env_option() to return NULL. (https://github.com/ClusterLabs/pacemaker/pull/2801#discussion_r949292623)
  • Use PCMK__NELEM() in loop control. (https://github.com/ClusterLabs/pacemaker/pull/2801#discussion_r949339751)
  • Use separate mock booleans for getenv(), setenv(), unsetenv(). (https://github.com/ClusterLabs/pacemaker/pull/2801#discussion_r949354714)
  • Allow mocked setenv() and unsetenv() to set errno based on will_return(). (https://github.com/ClusterLabs/pacemaker/pull/2801#discussion_r949356975)

nrwahl2 avatar Aug 21 '22 02:08 nrwahl2

Really strange test errors... in both Ubuntus and in F36 Power9:

  • snprintf_error() in pcmk__env_option_test.c results in a call to getenv(), even though pcmk__env_option() returns before calling getenv() when snprintf() returns a negative value.
19:59:04 not ok 2 - snprintf_error
19:59:04 FAIL: pcmk__env_option_test 2 - snprintf_error
19:59:04 # No entries for symbol __wrap_getenv.
19:59:04 # mock.c:90: error: Could not get value to check parameter name of function __wrap_getenv
19:59:04 # There were no previously declared parameter values for this test.
  • snprintf_error() in pcmk__set_env_option_test.c does not result in any call to snprintf(), even though pcmk__set_env_option() explicitly calls snprintf().
19:59:04 FAIL: pcmk__set_env_option_test 2 - snprintf_error
19:59:04 # %s() has remaining non-returned values.
19:59:04 # pcmk__set_env_option_test.c:49: note: remaining item was declared here
19:59:04 # __wrap_snprintf: '%s' parameter still has values that haven't been checked.
19:59:04 # pcmk__set_env_option_test.c:46: note: remaining item was declared here
19:59:04 # '%s' parameter still has values that haven't been checked.
19:59:04 # pcmk__set_env_option_test.c:47: note: remaining item was declared here
19:59:04 # '%s' parameter still has values that haven't been checked.
19:59:04 # pcmk__set_env_option_test.c:48: note: remaining item was declared here

All the other distros work fine. It's not an endianness issue, since the affected systems are x86_64. Things I've tried:

  • Removing restrict from the parameters in the __wrap_snprintf() signature, to make sure it doesn't mess with the calls to vsnprintf().
  • Set up an Ubuntu 22.04 LTS VM and confirmed that __wrap_snprintf() never gets called. Apparently it's ignoring the wrapper completely.
  • I found this, but the solution doesn't help; I'm already passing arguments after the format string: https://stackoverflow.com/a/39189436/7660197

Edit: It is a compiler optimization issue... setting -O0 in CFLAGS makes it go away. It looks like on Ubuntu, gcc is replacing snprint() with __builtin___snprintf_chk(). That's fine and dandy except that -fno-builtin is not making it go away like it should. I haven't found any answers yet. Ubuntu uses GCC v11 and Fedora 36 uses GCC v12. I don't know what versions our other test machines use but I'm sure the EL7 machines are on older versions. I reached out to the gcc-help mailing list. We could always skip testing that code path for now, but we need to test it eventually, which will require mocking snprintf().

Edit: Adding -U_FORTIFY_SOURCE to CFLAGS fixed the Ubuntu issue. Added a commit to this PR to do that. F36 Power9 LE is still failing with the same errors listed above, and I don't have a reproducer system for that currently.

nrwahl2 avatar Aug 21 '22 19:08 nrwahl2

I got my hands on F36 ppc64le. This change to add 128-bit floating point is what breaks the "snprintf_error" unit tests there: https://fedoraproject.org/wiki/Changes/PPC64LE_Float128_Transition

Running gdb on a minimal reproducer:

Breakpoint 1, func (s=s@entry=0x10020020 "hello world") at test.c:5
5	    snprintf(buf, 16, "%s", s);
(gdb) s
0x00007ffff7d54a98 in ___ieee128_snprintf (s=0x7fffffffeac0 "\360\352\377\377\377\177", maxlen=16, format=0x10020010 "%s") at ../sysdeps/ieee754/ldbl-128ibm-compat/ieee128-snprintf.c:24

Special cases and hidden overrides like this may keep popping up for snprintf() in the future. Maybe use a different approach for mocking stuff like Xprintf?

I think I should remove the snprintf() mocking code and the two unit tests that use it tomorrow, and postpone that to another day. I've come up with some solutions but I need to play with it a little more. Everything feels kind of hacky.

Fabio pointed out another approach that was used in libqb that might work: https://github.com/ClusterLabs/libqb/pull/231/commits/f610b1b1. Note _failure_injection.c. I haven't tried it yet.

nrwahl2 avatar Aug 22 '22 05:08 nrwahl2

Removed the -U_FORTIFY_SOURCE and __wrap_snprintf() commits from this PR. Mocking snprintf() will require a different approach or a hack, so it's better addressed separately. I also removed the snprintf_error() test cases from the pcmk__env_option() and pcmk__set_env_option() unit test files. We can add those back when snprintf() is taken care of.

As of right now, __wrap_setenv() and __wrap_unsetenv() still don't call __real_X, based on my comment in https://github.com/ClusterLabs/pacemaker/pull/2801#discussion_r951846421 about symmetry and the lack of need to use the real environment (we have a mocked getenv() we can use). If either you or @clumens feels strongly about calling __real_X, I can easily update both __wrap_X functions to make that call.

nrwahl2 avatar Aug 22 '22 20:08 nrwahl2