rules_foreign_cc icon indicating copy to clipboard operation
rules_foreign_cc copied to clipboard

draft: fix: Overhaul quoting

Open gr1mpatr0n opened this issue 10 months ago • 13 comments

There are some serious issues around quoting all throughout this repo, which renders it completely ineffective for environments which may contain spaces in paths. For example, if a user has installed MSYS2 using Scoop, it is likely that the install directory will contain their user folder, which more than likely contains a space.

This is far from a comprehensive fix. There needs to be a serious audit of all quoting. I have found instances of:

  • Double (double) quoting which actually unquotes strings. See the changes to the "Environment:______" echo statement.
  • Lack of quoting in command invocations for e.g. the CMake tool.
  • Lack of quoting for directory arguments.
  • Very messy quoting which is seemingly a symptom of shell generation functions not being explicit about how they quote their arguments.

This is just the fix that we needed to get CMake builds working again on Windows.

Addresses https://github.com/bazel-contrib/rules_foreign_cc/issues/1388.

gr1mpatr0n avatar Mar 31 '25 21:03 gr1mpatr0n

Converting to draft as I believe there is a lot more work to be done here.

gr1mpatr0n avatar Mar 31 '25 21:03 gr1mpatr0n

Note: there should also really be a test case which covers this.

As a sanity check for future contributions, look at the generated build_script.shs, because currently they are in a poor state. Perhaps we can use https://www.shellcheck.net/ to ensure the scripts are sensible. The linter in my IDE is pretty unhappy with most of it.

gr1mpatr0n avatar Mar 31 '25 21:03 gr1mpatr0n

Fixing the quoting in general is very hard; particularly as there are subtle differences in what is handled in different bash versions (e.g. bash 3 on macOS vs 4 on linux) and some configure scripts want quotes in different places when it comes to quoting things like CFLAGS and the like; I actually don't believe that there is a solution to the quoting that actually works robustly for all 'foreign' build systems.

I think the way to make this more robust if we stay with the shell script approach is really to generate less shell and move the shell functions into shell library files which get sourced. The more portable method is to rewrite all the shell logic into a binary tool using rust or similar so that we're relying less on the environment of the enduser. I think for CMake we should be generating CMakePresets file and using a workflow to avoid needing to form complex CLI commandlines but my time for working on this is v. limited and so its not something that's likely to happen soon unless someone else picks this up!

jsharpe avatar Mar 31 '25 21:03 jsharpe

Fixing the quoting in general is very hard; particularly as there are subtle differences in what is handled in different bash versions (e.g. bash 3 on macOS vs 4 on linux)

Considering we mandate Bash (and on Windows, MSYS2 Bash), I personally have not come across any quoting ambiguities between platform Bash implementations, at least for trivial use-cases and as long as we are not relying on particularly new features from Bash 4.x. The standard libraries of many languages already have pretty robust {un}quote methods. I'm happy to be proven wrong, but I don't think this barrier is high enough that we can't easily veneer over it. We don't do anything particularly obtuse in scripts.

some configure scripts want quotes in different places when it comes to quoting things like CFLAGS and the like

Can you give an example of this? Wouldn't this just be handled at a project level if necessary?

shell logic into a binary tool using rust or similar so that we're relying less on the environment of the enduser

Eh, I personally think that's over-the-top. Bazel overall leans pretty heavily on having a particular environment (as I have learned much to my chagrin). And I'd rather not have to worry about the quirks of a new tool, esp. a Rust one.

This PR is really to get things "functional" on Windows for my use-case, where things are downright broken. Although it looks like I've broken some tests in CI. :)

gr1mpatr0n avatar Mar 31 '25 21:03 gr1mpatr0n

Fixing the quoting in general is very hard; particularly as there are subtle differences in what is handled in different bash versions (e.g. bash 3 on macOS vs 4 on linux)

Considering we mandate Bash (and on Windows, MSYS2 Bash), I personally have not come across any quoting ambiguities between platform Bash implementations, at least for trivial use-cases and as long as we are not relying on particularly new features from Bash 4.x. The standard libraries of many languages already have pretty robust {un}quote methods. I'm happy to be proven wrong, but I don't think this barrier is high enough that we can't easily veneer over it. We don't do anything particularly obtuse in scripts.

Yep agreed, there shouldn't be anything that stops us from being compatible; just know that in the past there have been differences between platforms that have crept in due to subtle differences, although I don't recall any specifics.

some configure scripts want quotes in different places when it comes to quoting things like CFLAGS and the like

Can you give an example of this? Wouldn't this just be handled at a project level if necessary?

This tends to come from non-autotools configure make projects; I think openssl might be the prime example but its been a little while since I've tried to tackle this so I might be misremembering. However this is more to do with the escaping of variables that get passed to downstream tools; this is slightly different to what you are addressing in this PR though.

shell logic into a binary tool using rust or similar so that we're relying less on the environment of the enduser

Eh, I personally think that's over-the-top. Bazel overall leans pretty heavily on having a particular environment (as I have learned much to my chagrin). And I'd rather not have to worry about the quirks of a new tool, esp. a Rust one.

This PR is really to get things "functional" on Windows for my use-case, where things are downright broken. Although it looks like I've broken some tests in CI. :)

Yes, I think you've added escapes around variables that are then blindly concatenated with other paths and so end up with escapes in the middle of some paths. There's also likely some expected test strings that will need updating as a result of these changes in the starlark based tests. As you said before we should probably actually write out those expected test strings to files and lint them to ensure that we're generating valid shell scripts in these tests.

I'm guessing the reason these issues haven't been picked up are partially because Bazel CI doesn't have any spaces in any of the paths it uses; I will raise this with the rules sig tomorrow as I think it'd be a useful addition to CI to be testing a common windows configuration of having a output_root that includes spaces. (Although I will note that I think a lot of windows users will use a path rooted at the root of the disk to reduce path lengths as its common to hit long path length issues on windows)

jsharpe avatar Mar 31 '25 22:03 jsharpe

I've started on #1390 which actually runs shellcheck on the generated build files, at least for the builtin tool builds which is a start; I'll try to get around to fixing the rest of the errors soon!

jsharpe avatar Apr 01 '25 11:04 jsharpe

@benjamin-branchware Can you test what went in on #1390 against the changes that you had here to see if there's anything that wasn't caught by shellcheck; I may have missed fixing up the windows versions of the commands.

jsharpe avatar Apr 03 '25 21:04 jsharpe

@jsharpe Hmm, the shellcheck integrated in my IDE still flags a lot of problems and indeed my Windows build still fails (for CMake). Here are some offending lines:

  • EXT_BUILD_ROOT=$(type -t cygpath > /dev/null && cygpath $(pwd) -m || pwd -W); export EXT_BUILD_ROOT Problem: $(pwd) needs quoting
  • symlink_to_dir $EXT_BUILD_ROOT/external/rules_foreign_cc++tools+cmake-3.23.2-windows-x86_64/bin/cmake.exe $EXT_BUILD_DEPS/bin/ False Problem: $EXT_BUILD_ROOT needs quoting.
  • cd $BUILD_TMPDIR Problem: $BUILD_TMPDIR needs quoting.
  • $EXT_BUILD_ROOT/external/rules_foreign_cc++tools+cmake-3.23.2-windows-x86_64/bin/cmake.exe --build . --config Release Problem: You guessed it! :smile:

There are quite a few others - e.g. replace_in_files calls. It looks like only rm -rf and mkdir -p have their paths quoted now.

I would have guessed these would have been flagged by your shellcheck too, no? Is it missing these because it doesn't try to generate a CMake script? (seems your changes are in core util funcs)

As an aside, is there not some standard Skylark path quoting function? It feels a little fragile dotting quotes here and there, especially when those strings are then passed to other functions.

gr1mpatr0n avatar Apr 04 '25 23:04 gr1mpatr0n

There's also other things:

Snippet Lint Error seemingly missed
local children=($($REAL_FIND -H "$1" -maxdepth 1 -mindepth 1)) Prefer mapfile or read -a to split command output (or quote to avoid splitting).
local files=($($REAL_FIND -P "$1" -type f \( -type f -and \( -name "*.pc" -or -name "*.la" -or -name "*-config" -or -name "*.mk" -or -name "*.cmake" \) \))) Prefer mapfile or read -a to split command output (or quote to avoid splitting).
local dirname=$(basename "$1") Declare and assign separately to avoid masking return values.
local argv2=$(echo "$2" | sed 's/\\/\\\\/g') (and the line below it) Declare and assign separately to avoid masking return values. AND See if you can use ${variable//search/replace} instead.

gr1mpatr0n avatar Apr 05 '25 00:04 gr1mpatr0n

In order to try to abate these issues, I moved my project sources and MSYS2 into directories without spaces. However, there is still a problem, as the bazel-bin etc. directories are symlinked to my user directory, which you guessed it has spaces in it :(

gr1mpatr0n avatar Apr 05 '25 02:04 gr1mpatr0n

There is a second issue which is blocking my testing: https://github.com/bazel-contrib/rules_foreign_cc/issues/1391.

gr1mpatr0n avatar Apr 05 '25 02:04 gr1mpatr0n

@jsharpe Any progress on this? Looks like your MR missed quite a bit.

gr1mpatr0n avatar Apr 21 '25 21:04 gr1mpatr0n

@jsharpe Any progress on this? Looks like your MR missed quite a bit.

Yes, I expected that as the shellcheck is only running on linux / macos as shellcheck isn't built for windows in the upstream repo so the windows configuration is still not being tested. Happy to review any PRs that fix up the windows commands though! I don't have a windows dev environment setup so am unable to directly fix this except via CI on buildkite which doesn't contain spaces in the paths atm so won't pick up on issues directly.

jsharpe avatar May 12 '25 22:05 jsharpe