daos icon indicating copy to clipboard operation
daos copied to clipboard

DAOS-623 build: Use space in meson options

Open jolivier23 opened this issue 1 year ago • 16 comments

While = may work, in the documentation, spaces are used so let's try this and see if it fixes build issues we've seen.

Before requesting gatekeeper:

  • [ ] Two review approvals and any prior change requests have been resolved.
  • [ ] Testing is complete and all tests passed or there is a reason documented in the PR why it should be force landed and forced-landing tag is set.
  • [ ] Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
  • [ ] Commit messages follows the guidelines outlined here.
  • [ ] Any tests skipped by the ticket being addressed have been run and passed in the PR.

Gatekeeper:

  • [ ] You are the appropriate gatekeeper to be landing the patch.
  • [ ] The PR has 2 reviews by people familiar with the code, including appropriate owners.
  • [ ] Githooks were used. If not, request that user install them and check copyright dates.
  • [ ] Checkpatch issues are resolved. Pay particular attention to ones that will show up on future PRs.
  • [ ] All builds have passed. Check non-required builds for any new compiler warnings.
  • [ ] Sufficient testing is done. Check feature pragmas and test tags and that tests skipped for the ticket are run and now pass with the changes.
  • [ ] If applicable, the PR has addressed any potential version compatibility issues.
  • [ ] Check the target branch. If it is master branch, should the PR go to a feature branch? If it is a release branch, does it have merge approval in the JIRA ticket.
  • [ ] Extra checks if forced landing is requested
    • [ ] Review comments are sufficiently resolved, particularly by prior reviewers that requested changes.
    • [ ] No new NLT or valgrind warnings. Check the classic view.
    • [ ] Quick-build or Quick-functional is not used.
  • [ ] Fix the commit message upon landing. Check the standard here. Edit it to create a single commit. If necessary, ask submitter for a new summary.

jolivier23 avatar Feb 26 '24 21:02 jolivier23

Ticket title is 'Generic ticket for minor code cleanup and improvement' Status is 'Resolved' Labels: 'request_for_2.4.3' https://daosio.atlassian.net/browse/DAOS-623

github-actions[bot] avatar Feb 26 '24 21:02 github-actions[bot]

This is fine for diagnostics but we shouldn't be landing it unless we can prove it solves the problem.

sure, I suppose that remains to be seen. @jamesanunez are you able to reproduce with this patch?

jolivier23 avatar Mar 04 '24 03:03 jolivier23

This is fine for diagnostics but we shouldn't be landing it unless we can prove it solves the problem.

sure, I suppose that remains to be seen. @jamesanunez are you able to reproduce with this patch?

This does seem to resolve a similar issue on Frontera. Without this PR, meson tries to configure fuse in /usr. But with this, it correctly uses my local prefix

daltonbohning avatar Mar 07 '24 17:03 daltonbohning

This is fine for diagnostics but we shouldn't be landing it unless we can prove it solves the problem.

sure, I suppose that remains to be seen. @jamesanunez are you able to reproduce with this patch?

This does seem to resolve a similar issue on Frontera. Without this PR, meson tries to configure fuse in /usr. But with this, it correctly uses my local prefix

If you can verify this change fixes the problem then yes, let's land it, absolutely.

ashleypittman avatar Mar 07 '24 17:03 ashleypittman

This does seem to resolve a similar issue on Frontera. Without this PR, meson tries to configure fuse in /usr. But with this, it correctly uses my local prefix

If you can verify this change fixes the problem then yes, let's land it, absolutely.

Actually. I suspect some kind of race condition here. It sometimes works and sometimes does not. I just had another build fail with

RUN: ninja install
[0/1] Installing files.
Installing lib/libfuse3.so.3.16.2 to /usr/local/lib64
Installation failed due to insufficient permissions.
Attempting to use polkit to gain elevated privileges...
Error executing command as another user: Not authorized

And at least the fuse configure looks right

meson configure --prefix /work2/08126/dbohninx/frontera/BUILDS/dev/20240307/daos/install/prereq/release/fuse -D disable-mtab=True -D udevrulesdir=/work2/08126/dbohninx/frontera/BUILDS/dev/20240307/daos/install/prereq/release/fuse/udev -D utils=False --default-library both

daltonbohning avatar Mar 07 '24 17:03 daltonbohning

Maybe ninja install needs to specify a prefix too?

daltonbohning avatar Mar 07 '24 17:03 daltonbohning

Maybe ninja install needs to specify a prefix too?

Nevermind. According to https://stackoverflow.com/questions/62475190/meson-and-ninja-build-system-specify-where-binaries-are-stored the prefix should be set with meson, not ninja.

daltonbohning avatar Mar 07 '24 17:03 daltonbohning

This does seem to resolve a similar issue on Frontera. Without this PR, meson tries to configure fuse in /usr. But with this, it correctly uses my local prefix

If you can verify this change fixes the problem then yes, let's land it, absolutely.

Actually. I suspect some kind of race condition here. It sometimes works and sometimes does not. I just had another build fail with

RUN: ninja install
[0/1] Installing files.
Installing lib/libfuse3.so.3.16.2 to /usr/local/lib64
Installation failed due to insufficient permissions.
Attempting to use polkit to gain elevated privileges...
Error executing command as another user: Not authorized

And at least the fuse configure looks right

meson configure --prefix /work2/08126/dbohninx/frontera/BUILDS/dev/20240307/daos/install/prereq/release/fuse -D disable-mtab=True -D udevrulesdir=/work2/08126/dbohninx/frontera/BUILDS/dev/20240307/daos/install/prereq/release/fuse/udev -D utils=False --default-library both

The evidence here all points to a meson bug to me. Presumably an old one as it's only on el7 and not when using virtual environments which will be using newer software. If that's the case (and it's still happening on supported distros) then we should add a check for the meson version in our scons code.

ashleypittman avatar Mar 07 '24 17:03 ashleypittman

The evidence here all points to a meson bug to me. Presumably an old one as it's only on el7 and not when using virtual environments which will be using newer software. If that's the case (and it's still happening on supported distros) then we should add a check for the meson version in our scons code.

I'm trying to use newer meson and ninja now to see if it fixes the issue. Will report back when I know

daltonbohning avatar Mar 07 '24 17:03 daltonbohning

The evidence here all points to a meson bug to me. Presumably an old one as it's only on el7 and not when using virtual environments which will be using newer software. If that's the case (and it's still happening on supported distros) then we should add a check for the meson version in our scons code.

I'm trying to use newer meson and ninja now to see if it fixes the issue. Will report back when I know

I've gotten two successful builds now with updated meson and ninja. And @jamesanunez mentioned in a DM to me that might have fixed for him too. I'm going to try again without this patch

daltonbohning avatar Mar 07 '24 18:03 daltonbohning

Another datapoint, without this patch and updated meson + ninja I still got a failure. This time, meson explicitly says it's not a bug. Probably because it's such a common issue (and probably actually a bug)

RUN: ninja install
[0/1] Installing files.
Installing lib/libfuse3.so.3.16.2 to /usr/local/lib64
Installation failed due to insufficient permissions.
...
ERROR: Unhandled python OSError. This is probably not a Meson bug, but an issue with your build environment.

Trying some more with this patch to see if I can reproduce...

daltonbohning avatar Mar 07 '24 18:03 daltonbohning

This is 99% most definitely a bug with meson. Trying manually

$ meson configure --prefix /work2/08126/dbohninx/frontera/BUILDS/dev/20240307/daos/install/prereq/release/fuse

$ meson install
ninja: Entering directory `/work2/08126/dbohninx/frontera/BUILDS/dev/20240307/daos/build/external/release/fuse.build'
ninja: no work to do.
Installing lib/libfuse3.so.3.16.2 to /usr/local/lib64
Installation failed due to insufficient permissions.
Attempt to use /usr/bin/sudo to gain elevated privileges? [y/n]

And the configuration file shows the prefix as I have set it.

There seems to be a workaround to use destdir, which overrides the prefix. Older versions of meson:

DESTDIR=<path> ninja install

Newer versions of meson since 0.47.0. Also, use meson instead of ninja directly. (ninja still used in the backend)

meson install --destdir <path>

daltonbohning avatar Mar 07 '24 22:03 daltonbohning

The fix that works for me is #13961

daltonbohning avatar Mar 08 '24 16:03 daltonbohning

FWIW I've been able to reproduce the prefix issue on boro with EL 8.8. So I suspect master is currently doing the wrong thing for other people's builds

daltonbohning avatar Mar 08 '24 16:03 daltonbohning

Others have reported this on boro/wolf, so https://daosio.atlassian.net/browse/DAOS-15417

daltonbohning avatar Mar 12 '24 15:03 daltonbohning

Recommend closing this in favor of #13961

daltonbohning avatar Mar 15 '24 17:03 daltonbohning