Rules Apple Fails on Intel x86 Mac when using CpuInfo (Tensorflow)
Problem
If you are attempting to use any rule from @build_bazel_rules_apple on an Intel x86_64 Mac with anything that relies on the cpuinfo project (like Tensorflow) you will face the build error described by this mediapipe issue and this tensorflow issue.
Root Cause
The root cause of the problem falls with the complex interplay between cpuinfo, tensorflow, bazel, and the @build_bazel_rules_apple library. Bazel has reverted the default cpu value on x86 macOS to "darwin" in this commit.
This means that bazel expects Intel x86_64 Macs to report their cpu as darwin. @build_bazel_rules_apple will report the cpu as darwin_x86_64. This cpu does not exist in the cpuinfo project as it expects darwin. Although there is an unmerged pull request to fix this behavior, it is unlikely to be resolved soon.
Solution
In order to fix the problem, I have created a patch so that the library can read the host_cpu value which will report as darwin on an Intel x86_64 Mac. If this value is present, it will return darwin instead of darwin_x86_64.
The Fix
It is necessary to add the --incompatible_enable_apple_toolchain_resolution flag to your build command and modify@build_bazel_rules_apple with the following patch:
diff --git a/apple/internal/transition_support.bzl b/apple/internal/transition_support.bzl
index 65f51b89..7da7ade3 100644
--- a/apple/internal/transition_support.bzl
+++ b/apple/internal/transition_support.bzl
@@ -102,6 +102,9 @@ def _cpu_string(*, cpu, platform_type, settings = {}):
return "ios_sim_arm64"
return "ios_x86_64"
if platform_type == "macos":
+ host_cpu = settings["//command_line_option:host_cpu"]
+ if host_cpu == "darwin":
+ return "darwin"
if cpu:
return "darwin_{}".format(cpu)
macos_cpus = settings["//command_line_option:macos_cpus"]
@@ -342,6 +345,7 @@ _apple_rule_common_transition_inputs = [
"//command_line_option:apple_crosstool_top",
]
_apple_rule_base_transition_inputs = _apple_rule_common_transition_inputs + [
+ "//command_line_option:host_cpu",
"//command_line_option:cpu",
"//command_line_option:ios_multi_cpus",
"//command_line_option:macos_cpus",
does https://github.com/pytorch/cpuinfo/pull/139 fix this issue as well? I see your mention of it but does it work? maybe we could poke some folks to get it reviewed if so
Yes that PR would fix the problem for anyone who will update to the version of tensorflow which will eventually contain the cpuinfo fix when it is merged. The problem would still exist retroactively for anyone using older versions of tensorflow (which would be on the older versions of cpuinfo that do not have that fix).
I have written a post which shows both ways to patch the problem. The harder way is using the same technique that the cpuinfo MR uses and replicating it across all 3 places cpuinfo is referenced. This creates a double nested patch for tensorflow that consists of 3 patches itself and is quite cumbersome to maintain.
FYI darwin as a CPU has been removed again on 7.x here https://github.com/bazelbuild/bazel/pull/17547
what change broke this?
As noted by your comment, there are still 3 projects with open PRs that rely on the darwin means x86_64 convention. Additionally ruy is not listed but is another dependency of tensorflow that does.
It will take a long time for the change from darwin to darwin_x86_64 to percolate downstream and for people to adopt it. The original issue which manifested as a symptom of this problem has been open for 2+ years without traction or resolution from the bazel, mediapipe, or tensorflow teams.
It may be okay to force the change from darwin to darwin_x86_64 for Bazel 7.x and forward but this library should not be broken for people who choose to use tensorflow. The point of my PR is that if a system (for me Bazel 6.1) self reports as darwin, and not darwin_x86_64, then don't attempt to override it because it breaks downstream projects (tensorflow, ruy, xnnpack, etc.) in the pipeline.
thanks for linking that other library, i submitted a fix there too https://github.com/google/ruy/pull/331
im fine with merging your change, but i do think it would be worth trying to push the other real solutions at the same time, if nothing else it will block any future bazel upgrades until it's solved.
MediaPipe is also affected by this: https://github.com/google/mediapipe/issues/2324