rules_python icon indicating copy to clipboard operation
rules_python copied to clipboard

feat!: always generating py_library

Open linzhp opened this issue 7 months ago • 8 comments

This PR makes sure that all non-test Python modules are covered by a py_library target. When a module is imported by another target, Gazelle no longer resolves it to py_binary and only resolves to py_library, which is consistent with other languages.

Note that in file mode, this means there would be duplicate py_binary and py_library, with the same srcs and deps (see changed test cases). Because these attributes are maintained by Gazelle, there is no additional maintenance cost from the users. To reduce the duplication, we will need to change rules_python to add an embeds attribute to py_binary, so it can inherit all srcs and deps from py_library, similar to the embeds attribute of Go rules.

In addition, this PR renames the py_binary target of foo.py to foo_bin and use foo for the py_library, to be consistent with the current naming convention in package mode. However, this requires users to remove or rename their existing py_binary rules during the migration.

This is an alternative implementation of #2822. cc @yushan26

linzhp avatar Jun 12 '25 16:06 linzhp

@aignas Can you take a look? The failing test is due to an infra issue.

linzhp avatar Jun 12 '25 17:06 linzhp

I'm not sure how I feel about this. These are my concerns, in no particular order:

  • This will be a pretty massive change for file generation mode users, as all of the py_binary targets will (a) be split into two targets and (b) get renamed from foo to foo_bin.
    • You already mention (b) this in the PR, but I want to reiterate the point. What about scripts that call the already-existing foo binary target? Those will have to be updated, creating a nontrivial maintenance burden.
  • Because these attributes are maintained by Gazelle, there is no additional maintenance cost from the users.

    • But now users may have to duplicate things a like data, visibility, tags, and many other non-Gazelle-controlled attributes across both targets.
  • What if users want the py_library to be foo_lib and the binary to be just foo? Do you plan on adding support for such?
  • Having multiple targets (py_library + py_binary) with the same srcs/deps feels off to me.
    • If anything, maybe the py_binary could just be
      py_binary(
          name = "foo_bin",
          main = ":foo",
      )
      
      but this doesn't address my other concerns.

This feels like a 2.0 change to me, if it gets added.

dougthor42 avatar Jun 13 '25 04:06 dougthor42

Ah, looks like @aignas we share at least some thoughts, if not the same conclusion :upside_down_face: .

While buildozer commands might be useful for an initial migration, I don't think such a thing is really suitable for long-term use. Requiring users to set up buildozer to sync target attributes from library to binary at every change seems like undue burden. We'd basically be asking everyone to set up CI for their project that runs "buildozer copy attributes from A to B"

dougthor42 avatar Jun 13 '25 04:06 dougthor42

What if users want the py_library to be foo_lib and the binary to be just foo? Do you plan on adding support for such?

Maybe can take this approach instead. This would make the migration easier.

linzhp avatar Jun 13 '25 05:06 linzhp

Re: same src in multiple files

The technical constraint is it'll interact poorly with precompiling.

In general, though, yes, the same source being in multiple targets is an anti pattern and prone to causing issues.

I think what makes more sense is to allow a binary to get it's main source from another target directly. Eg allow a library to be the target of main, so that the binary doesn't try to compile twice and the 1:1 is preserved

rickeylev avatar Jun 13 '25 16:06 rickeylev

Perhaps we should move this to a Discussion or write up a design doc to hash things out. I'm still not sure I fully understand the issue that it's solving, but to be fair I almost exclusively use file generation mode so maybe I'm simply not running into the issue(s).

dougthor42 avatar Jun 14 '25 21:06 dougthor42

I'm still not sure I fully understand the issue that it's solving, but to be fair I almost exclusively use file generation mode so maybe I'm simply not running into the issue(s).

It's solving the same issue as #2822: the same Python file can exist both in a py_library and py_binary at the same time in package mode and project mode, causing Gazelle not being able to resolve the import.

linzhp avatar Jun 15 '25 03:06 linzhp

Did some archeology and found the discussion in https://github.com/bazel-contrib/rules_python/pull/1664. I agree with @rickeylev in that discussion that:

A binary shouldn't be used as a library, insofar as Bazel dependencies are concerned

However, it's too late to fix it due to migration cost discussed in this thread. So I put out a lightweight fix to it in #2989 instead. Please take a look.

linzhp avatar Jun 15 '25 04:06 linzhp