rules_rust icon indicating copy to clipboard operation
rules_rust copied to clipboard

Unable to build pq-sys 7.0.0

Open erenon opened this issue 11 months ago • 15 comments

I think the cause is:

  • build.rs embeds a cargo envvar: PathBuf::from(env!("CARGO_MANIFEST_DIR")).
  • when the buildscript actually runs, the embedded envvar points to the previous sandbox (the one of the compilation), that is not present any more.

(see build.rs in this commit)

I verified this by adding some eprintln!s into build.rs:

CARGO_MANIFEST_DIR: .../sandbox/linux-sandbox/14/execroot/_main/external/rules_rust~~crate~crates__pq-sys-0.7.0
source_path: ".../sandbox/linux-sandbox/14/execroot/_main/external/rules_rust~~crate~crates__pq-sys-0.7.0/src/bindings_linux.rs"
out_path: ".../sandbox/linux-sandbox/15/execroot/_main/bazel-out/k8-fastbuild/bin/external/rules_rust~~crate~crates__pq-sys-0.7.0/_bs.out_dir/bindings.rs"

There's one thing I do not understand. aquery of the builds script compilation shows:

$ bazel aquery @@rules_rust~~crate~crates__pq-sys-0.7.0//:_bs_
action 'Compiling Rust bin _bs_ (8 files) [for tool]'
  Mnemonic: Rustc
  Target: @@rules_rust~~crate~crates__pq-sys-0.7.0//:_bs_
[...]
CARGO_MANIFEST_DIR=${pwd}/external/rules_rust~~crate~crates__pq-sys-0.7.0

What replaces ${pwd} with the actual absolute path? There's such a transformation in cargo/cargo_build_script_runner/bin.rs , but that is for the envvars passed to the buildscript runtime.

This issue is reproducible locally with these files:

# MODULE.bazel
bazel_dep(name = "rules_rust", version = "0.58.0")

crate = use_extension("@rules_rust//crate_universe:extension.bzl", "crate")
crate.spec(package = "pq-sys", version = "=0.7.0", features = [], default_features = False)

crate.from_specs(
  isolated = False,
  supported_platform_triples = [
    "x86_64-unknown-linux-gnu",
  ],
)
use_repo(crate, "crates")

BUILD.bazel is empty. .bazelversion is "8.1.1".

Then:

bazel build @crates//:pq-sys

Produces:

thread 'main' panicked at external/rules_rust++crate+crates__pq-sys-0.7.0/build.rs:118:46:
Couldn't write bindings: Os { code: 2, kind: NotFound, message: "No such file or directory" }

Same with --remote_download_all. No remote exec or any other special config is needed.

erenon avatar Mar 21 '25 17:03 erenon

I would recommend a patch to upstream that updates from using env! to std::env::var_os - while these are the same in cargo, they aren't necessarily in Bazel.

illicitonion avatar Mar 22 '25 22:03 illicitonion

Can you say a bit more? Is that because it's not possible with the the way Bazel or rules_rust works to fix it in a Cargo-compatible way? It's certainly worth a shot and better than nothing, but on the other hand it's a lot easier to reason about rules_rust if it is compatible with Cargo.

https://github.com/search?q=env%21%28%22CARGO_MANIFEST_DIR%22%29+path%3A**%2Fbuild.rs&type=code

aran avatar Mar 23 '25 04:03 aran

Is that because it's not possible with the the way Bazel or rules_rust works to fix it in a Cargo-compatible way?

The root cause of the problem here is sandboxing. You could work around this by disabling sandboxing in Bazel builds if you wanted. If you build with --spawn_strategy=local I imagine things will work, at the expense of sandboxing. --sandbox_debug would probably also work.

Each compile we do, and each time a build script is run, is run in a separate sandbox directory. This is useful for avoiding untracked dependencies on files (which means our caching is really good), but means that anything that assumes files exist at exactly the same absolute path between actions will be broken.

We do go to some lengths to make things work, e.g. when a build script generates linker flags we normalise out the exec root in any absolute paths in flags (and then re-replace them back to the linker action exec root path). But what using env!("CARGO_MANIFEST_DIR") does is just embed an absolute path raw in the produced binary, and we really don't want to be in the business of replacing strings in the binary.

We could in theory try to build and run the build script binary in one action so they'd share a sandbox, but this would impact our ability to do cross-compilation (some people build and run build scripts on different platforms).

Technically std::env::var{,_os} is the slightly more correct thing to use, because where these files are is a runtime concern not a build-time concern (and also makes the generated binaries more cacheable and portable), but it's also not a thing just-cargo users need to worry about so it doesn't super matter to most people.

I've sent PRs for a few similar issues in the past and never gotten push-back, but I agree it's annoying to have to do, and a bit of a gap in our compatibility story.

illicitonion avatar Mar 23 '25 10:03 illicitonion

FMI, why does rules_rust set CARGO_MANIFEST_DIR to an absolute path? (Instead of a relative path, that'd coincide with the actual path in the next sandbox)

erenon avatar Mar 23 '25 12:03 erenon

https://github.com/sgrif/pq-sys/pull/81

aran avatar Mar 24 '25 18:03 aran

The fix is merged for pq-sys, so the specific instance of this issue for that crate should fixed when they publish their next release. Per the query above in https://github.com/bazelbuild/rules_rust/issues/3369#issuecomment-2746018106 there are hundreds of crates that will have the same root cause issue. Not sure if there should be a new long-term issue to track that, a re-titling of this one, some kind of documentation note somewhere in rules_rust about known limitations to cargo compatibility, or something else.

aran avatar Mar 26 '25 18:03 aran

In case it helps folks, there's also Postgres libraries available in the Bazel Central Registry. If the issue is running the build script to compile it, the targets here could be utilized to avoid that. Only Postgres 16 is available where pq-sys uses Postgres 17 and some new build files would need to be published.

UebelAndre avatar Mar 26 '25 21:03 UebelAndre

In case it helps folks, there's also Postgres libraries available in the Bazel Central Registry. If the issue is running the build script to compile it, the targets here could be utilized to avoid that. Only Postgres 16 is available where pq-sys uses Postgres 17 and some new build files would need to be published.

Looks like it won't be maintained: https://github.com/bazelbuild/bazel-central-registry/issues/4152#issuecomment-2755913012

aran avatar Mar 26 '25 23:03 aran

I think the response was just saying they won’t be driving any development. I would still use that package if it provides a hermetic means of compiling Postgres. I’ve certainly contributed to other modules in a similar state in BCR and I think the experience was good. If you want arm Linux support why not submit a PR for a bcr.1 patch?

UebelAndre avatar Mar 26 '25 23:03 UebelAndre

@aran https://github.com/bazelbuild/bazel-central-registry/pull/4110 I think is a good example of an experience I enjoy and would expect from BCR. I wanted a new version of nasm so modified the existing module for the new version until it worked. If there are any tests then it's great to have some kind of bar to hit when making these tests. The person labeled as a maintainer will probably pop in to click approve if they see all tests passing, but I wouldn't have expected anyone to fulfill the request for a new version (which is why I volunteered my time).

UebelAndre avatar Mar 26 '25 23:03 UebelAndre

If the issue is running the build script to compile it, the targets here could be utilized to avoid that.

The problem with this particular package was not with compiling the underlying library. But the build script was copying around bindings which are target specific and that relied on the env! macro.

@illicitonion or @UebelAndre do you have any comment on the below? This potentially could lift a lot of these compile time macros. As we would be able to manipulate the directory structure at least.

FMI, why does rules_rust set CARGO_MANIFEST_DIR to an absolute path? (Instead of a relative path, that'd coincide with the actual path in the next sandbox)

havasd avatar Mar 27 '25 10:03 havasd

Is CARGO_MANIFEST_DIR not resolved to an absolute path when the build script is running? https://github.com/bazelbuild/rules_rust/blob/cc0b6172885debb6a12697b74dbf86c0137d87bc/cargo/cargo_build_script_runner/bin.rs#L28-L36

https://github.com/bazelbuild/rules_rust/blob/cc0b6172885debb6a12697b74dbf86c0137d87bc/cargo/cargo_build_script_runner/bin.rs#L97

UebelAndre avatar Mar 27 '25 16:03 UebelAndre

It is. But why? It makes the env! embedding broken. Can't we leave it relative?

erenon avatar Mar 27 '25 17:03 erenon

It is. But why? It makes the env! embedding broken. Can't we leave it relative?

I can’t think of a reason why not. It was probably made absolute to match cargo or solve for some issue. If making it relative solves a bigger issue without breaking anything then that sounds good to me 😃

UebelAndre avatar Mar 27 '25 17:03 UebelAndre

FMI, why does rules_rust set CARGO_MANIFEST_DIR to an absolute path? (Instead of a relative path, that'd coincide with the actual path in the next sandbox)

A couple of reasons:

One is that this is what cargo does, and there are probably things that assume that too - particularly when setting this in real libraries or binaries rather than cargo build scripts, where setting it to a relative path would make the binary non-portable.

But the bigger one is... Relative to what?

  • In a Bazel world, almost everything runs in the exec root. So I guess relative to the exec root? If you run a binary that's been built, that's the directory it will get run from (assuming you're not moving the binary).
  • But in a Cargo world, almost everything runs in the $CARGO_MANIFEST_DIR itself. If a crate is trying to find files, it probably means for them to be relative to this directory.
  • In rules_rust, we default cargo_build_scripts to run in $CARGO_MANIFEST_DIR, which means I guess we'd hard-code it to .? But the directory we run in is overridable using rundir because e.g. some folks' cc_toolchains only work when run from the exec root, as is the standard Bazel assumption. Should we set $CARGO_MANIFEST_DIR differently depending on what rundir you're setting? Should we set it to one thing when building and a different thing when running? That's likely to break different things...
  • In rules_rust we run rust_test targets from the exec_root. There are probably crates that assume this value is set the same when building a cargo_build_script, when running the cargo_build_script, when compiling the test, and when running the test...

I'm not saying this wouldn't improve some crates' situation - but I do think it's a bit more complicated than "relative paths are always better". It's not obvious to me that it would solve more problems than it would introduce, but it would probably make it harder to understand what's going on when something goes wrong...

Per the query above in #3369 (comment) there are hundreds of crates that will have the same root cause issue.

While I agree this is an issue, it's worth noting that only a subset of these actually cause problems. The problematic usage pattern is grabbing a path at compile time, and then trying to do something with it at runtime. rules_rust does correctly handle cases where files are getting embedded at compile time (e.g. using include_bytes! such as https://github.com/agbrs/agb/blob/b5df46cf1b50b50e75c8b58bc4a0883bf191c8ee/agb/build.rs#L8) or where the path is being used to to output instructions to downstream actions via cargo metadata output (such as https://github.com/zerotier/DesktopUI/blob/c12ddd93a13489f574f9a7fa040ade8b57493a22/build.rs#L4). I can't speak to the relative frequencies of these in general, but in my experience problematic cases are reasonably rare, and ~trivially fixed upstream when they've come up. Whack-a-mole isn't fun, but if you only need to whack one mole a year...

(There's also a question of how many of these crates anyone will ever actually want to build with Bazel)

Not sure if there should be a new long-term issue to track that, a re-titling of this one, some kind of documentation note somewhere in rules_rust about known limitations to cargo compatibility, or something else.

I'd be very happy to review a documentation PR about this limitation, though I'm not sure where the best place for that to go would be.

illicitonion avatar Mar 29 '25 00:03 illicitonion