
Port `scie` plugin to mainline

Open sureshjoshi opened this issue 1 year ago • 2 comments

I've got a plugin that I've been using for about a year to package pexes into scie files using science.

I didn't mainline it because John mentioned that eventually pex will gain this functionality natively, so it seemed silly to port this in.

Today I realized, it's sillier to NOT port this in and then just eventually deprecate it in favour of the native tooling later. These are incredibly useful to me for non-docker deployments, and just making tooling easier on my machine (e.g. https://github.com/sureshjoshi/pantsible).

https://github.com/sureshjoshi/pants-plugins/tree/main/pants-plugins/experimental/scie https://github.com/a-scie/jump https://github.com/a-scie/science

sureshjoshi avatar Jun 23 '24 13:06 sureshjoshi

Sounds like this might be coming sooner: https://github.com/pex-tool/pex/issues/2096#issuecomment-2229513390

sureshjoshi avatar Jul 15 '24 22:07 sureshjoshi

I am able to generate a scie pex using the following configuration. Note: the output in dist will not have the .pex extension.

# BUILD
pex_binary(
    name="__main__",
    entry_point="__main__.py",
    extra_build_args=["--scie=eager"], # or lazy
    output_path="${spec_path_normalized}/${target_name_normalized}", # this template needs Pants 2.23; on Pants 2.22 you have to write the file name out yourself, without the .pex extension
    sh_boot=True,
    execution_mode="venv",
)

#pants.toml
[pex-cli]
version = "v2.20.1" # just to use the latest pex version, not sure the minimum needed
known_versions = [
  "v2.20.1|linux_x86_64|0dfc0295677698ad1c74c77e5f5b2ff19a0cba1ddd629b228aaf5d2bd290bc20|4313097",
]

shanipribadi avatar Sep 24 '24 08:09 shanipribadi

@shanipribadi this actually doesn't work. The pex you get doesn't have python baked in. It's a zipapp with a python shebang on the entrypoint.

rhuanbarreto avatar Mar 05 '25 17:03 rhuanbarreto

@shanipribadi what @rhuanbarreto says is true, you need Pex 2.28.1 or newer. My test rig just uses 2 files: pants.toml and BUILD:

  • pants.toml:

    [GLOBAL]
    pants_version = "2.24.2"
    
    backend_packages = [
      "pants.backend.python"
    ]
    
    [python]
    interpreter_constraints = ["==3.11.*"]
    
    [pex-cli]
    known_versions.add = [
      # Claimed working but not.
      "v2.20.1|linux_x86_64|0dfc0295677698ad1c74c77e5f5b2ff19a0cba1ddd629b228aaf5d2bd290bc20|4313097",
      # Last bad version:
      "v2.28.0|linux_x86_64|3ed2efa4dc9e63129d5ad018d34101613dac5d6fb03ff29cd970a075fac0a212|4366923",
      # 1st good version:
      "v2.28.1|linux_x86_64|3ba098a452db069f2524121999c784c73f0e01c8068d68133c4db83ac9c3ca5b|4366937",
      # Latest version:
      "v2.33.2|linux_x86_64|b6c035db294d9c84d72ec723c8bb31c9640e3a8603dbc48233352c1db1ccf5e8|4588594",
    ]
    
  • BUILD:

    python_requirement(name="cowsay.whl", requirements=["cowsay"])
    
    pex_binary(
        name="cowsay",
        script="cowsay",
        dependencies=[":cowsay.whl"],
        extra_build_args=["--scie=eager"],
        output_path="${spec_path_normalized}/${target_name_normalized}",
        sh_boot=True,
        execution_mode="venv",
    )
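
For anyone juggling several `known_versions` entries like the ones in this rig, here is a minimal Python sketch that splits an entry into its fields and checks it against the "first good version" found in this experiment. The names `KnownVersion`, `parse_known_version`, `builds_scie` and `FIRST_SCIE_CAPABLE` are hypothetical helpers, not part of Pants or Pex, and the field layout (`version|platform|sha256|size in bytes`) is my reading of the entries above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownVersion:
    version: tuple[int, ...]  # e.g. (2, 28, 1)
    platform: str             # e.g. "linux_x86_64"
    sha256: str               # hex digest of the Pex release artifact
    size: int                 # artifact size in bytes


def parse_known_version(entry: str) -> KnownVersion:
    """Split a `version|platform|sha256|size` entry into typed fields."""
    raw_version, platform, sha256, size = entry.split("|")
    version = tuple(int(part) for part in raw_version.removeprefix("v").split("."))
    return KnownVersion(version, platform, sha256, int(size))


# Assumption: this threshold comes from the experiment in this thread,
# not from any official Pex compatibility table.
FIRST_SCIE_CAPABLE = (2, 28, 1)


def builds_scie(entry: str) -> bool:
    """True if the entry's Pex version is at or past the first good version."""
    return parse_known_version(entry).version >= FIRST_SCIE_CAPABLE
```

Comparing tuples of integers rather than raw strings avoids the usual trap where `"2.9.0"` sorts after `"2.28.1"`.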
    

And the experiment shell session:

# Using Pex 2.20.1:
:; pants --pex-cli-version=v2.20.1 package :cowsay
08:15:51.83 [INFO] Wrote dist/cowsay

# Oops! The PEX is still just a PEX with a `#!/bin/sh` header (`--sh-boot`):
:; file dist/cowsay
dist/cowsay: POSIX shell script executable (binary data)

:; head -1 dist/cowsay
#!/bin/sh


# Using Pex 2.28.0:
:; pants --pex-cli-version=v2.28.0 package :cowsay
08:16:10.80 [INFO] Canceled: Building cowsay with 1 requirement: cowsay
08:16:10.80 [INFO] Wrote dist/cowsay

:; file dist/cowsay
dist/cowsay: POSIX shell script executable (binary data)


# Using Pex 2.28.1:
:; pants --pex-cli-version=v2.28.1 package :cowsay
08:16:16.62 [INFO] Canceled: Building cowsay with 1 requirement: cowsay
08:16:16.63 [INFO] Wrote dist/cowsay

:; file dist/cowsay
dist/cowsay: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), static-pie linked, stripped

:; head -1 dist/cowsay
ELF... (remaining binary output elided)


# Using Pex 2.33.2 (latest):
:; pants --pex-cli-version=v2.33.2 package :cowsay
08:16:30.30 [INFO] Canceled: Building cowsay with 1 requirement: cowsay
08:16:30.31 [INFO] Wrote dist/cowsay

:; file dist/cowsay
dist/cowsay: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), static-pie linked, stripped

[!NOTE] @shanipribadi you use sh_boot=True, which is helpful for both a traditional PEX zip and a --venv PEX but is not helpful for a scie. The #!/bin/sh header is a speed hack that uses /bin/sh to locate an appropriate Python interpreter and possibly skip execution of the PEX file altogether by directly executing its pre-extracted layout under the PEX_ROOT on runs 2+. This takes the ~50ms of startup overhead imposed by Python down to ~1ms. For a scie, there is no shebang: the tip of the scie is a native executable, and it runs even faster than /bin/sh while doing the same job the #!/bin/sh header did, except that the Python interpreter it finds is the one already embedded in the scie.
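
The `file` / `head -1` checks in the session above can be approximated in Python if you want to assert on the artifact type in a test. This is only a rough sketch: `classify_artifact` and its return labels are hypothetical, and it only recognizes ELF, so a scie built for macOS or Windows would come back as "unknown".

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of any ELF file


def classify_artifact(path: str) -> str:
    """Guess what kind of file was written to dist/, from its first line."""
    with open(path, "rb") as fp:
        first_line = fp.readline(128)
    if first_line.startswith(ELF_MAGIC):
        # Native executable at the tip of the file, as with the scie above.
        return "elf"
    if first_line.startswith(b"#!"):
        interpreter = first_line[2:].strip()
        # `#!/bin/sh` is what --sh-boot produces; anything else is a plain shebang.
        return "sh-boot" if interpreter == b"/bin/sh" else "shebang"
    return "unknown"
```

With the rig above, one would expect `classify_artifact("dist/cowsay")` to give "sh-boot" for the Pex 2.20.1 and 2.28.0 builds and "elf" from 2.28.1 onward, matching what `file` reported.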

jsirois avatar Mar 06 '25 16:03 jsirois

@rhuanbarreto this is not completely true:

It's a zipapp with a python shebang on the entrypoint.

It's a zipapp with a #!/bin/sh shebang. I have a note above about why --sh-boot is not useful here, unlike with traditional PEXes, where it is useful (you almost always want to use it for traditional PEXes, whether --venv or not).

jsirois avatar Mar 06 '25 16:03 jsirois

I think in the Slack, we talked about this - and the thing we're not doing is materializing the correct files out (today). So, we should just emit all the files in the output directory (or specify them, whatever - point is we went from a single pex, to maybe a pex + a scie, and Pants doesn't handle that).

sureshjoshi avatar Mar 06 '25 19:03 sureshjoshi

and the thing we're not doing is materializing the correct files out (today)

Well, you are iff the user is curious enough or knowledgeable enough to use pex_binary(output_path="anything_with_no'.pex'_extension_at_the_end"). This is demonstrated above and also in @rhuanbarreto's issue: https://github.com/pantsbuild/pants/issues/22044

jsirois avatar Mar 06 '25 19:03 jsirois

Lol, yes, I agree - but I personally like emitting both, depending on how they get used - and I do also think it should "just work" as much as possible.

sureshjoshi avatar Mar 06 '25 19:03 sureshjoshi

Well, I am not so LOL about all this. Pants seems to envelop users and developers in a cloud of not understanding anything or trying to very hard. I agree works out of the box is where things should be, but also, damn, poke around a bit!

jsirois avatar Mar 06 '25 19:03 jsirois

Oh, right, and continuing with the above test rig:

# Clean slate:
:; rm -rf dist/

:; pants --pex-cli-version=v2.28.1 package :cowsay
17:08:43.15 [INFO] Wrote dist/cowsay

:; ls -1sh dist/
total 32M
32M cowsay


# A reminder that scies have built-in utilities:
:; SCIE=help dist/cowsay 
For SCIE=<boot_command> you can select from the following:

boot-pack
    (-sj|--jump|--scie-jump [PATH])
    (-1|--single-lift-line|--no-single-lift-line)
    [lift manifest]*

    Pack the given lift manifests into scie executables. If no manifests
    are given, looks for `lift.json` in the current directory. By
    default the current scie-jump is used as the scie tip, but an
    alternate scie-jump binary can be specified using --scie-jump. By
    default the lift manifest is appended to the tail of the scie as a
    single line JSON document, but can be made a multi-line
    pretty-printed JSON document by passing --no-single-lift-line.

help: Display this help message.

inspect: Pretty-print this scie's lift manifest to stdout.

install (-s|--symlink) [dest dir]*

    Install all the commands in this scie to each dest dir given. If no
    dest dirs are given, installs them in the current directory.

list: List the names of the commands contained in this scie.

split [directory]?

    Split this scie into its component files in the given directory or
    else the current directory if no argument is given.


# Let's split out the PEX file?:
:; SCIE=split dist/cowsay dist

:; ls -1sh dist/
total 63M
4.0K configure-binding.py
 32M cowsay
 29M cpython-3.11.11+20250212-x86_64-unknown-linux-gnu-install_only.tar.gz
4.0K lift.json
756K pex
1.8M scie-jump

# Is that pex our cowsay scie PEX? Looks like it save for the header on the scie:
:; diff -u <(unzip -l dist/cowsay 2>&1) <(unzip -l dist/pex 2>&1)
--- /dev/fd/63	2025-03-06 17:23:26.810326438 -0800
+++ /dev/fd/62	2025-03-06 17:23:26.810326438 -0800
@@ -1,6 +1,4 @@
-Archive:  dist/cowsay
-warning [dist/cowsay]:  31953601 extra bytes at beginning or within zipfile
-  (attempting to process anyway)
+Archive:  dist/pex
   Length      Date    Time    Name
 ---------  ---------- -----   ----
         0  1980-01-01 00:00   .bootstrap/

# OK. Does that actually work?:
:; file dist/pex 
dist/pex: POSIX shell script executable (binary data)

:; dist/pex -t Moo!
bash: dist/pex: Permission denied

# Right, the executable bit is not set. Well, that shouldn't stop us!:
:; sh dist/pex -t Moo!
  ____
| Moo! |
  ====
    \
     \
       ^__^
       (oo)\_______
       (__)\       )\/\
           ||----w |
           ||     ||

:; python dist/pex -t Moo!
  ____
| Moo! |
  ====
    \
     \
       ^__^
       (oo)\_______
       (__)\       )\/\
           ||----w |
           ||     ||


# So both /bin/sh and python can run a `--sh-boot` PEX as expected.
# But we can restore a traditional PEX too:
:; chmod +x dist/pex

:; dist/pex -t Moo!
  ____
| Moo! |
  ====
    \
     \
       ^__^
       (oo)\_______
       (__)\       )\/\
           ||----w |
           ||     ||

So @sureshjoshi everything you want without hacking up Pants even more, or, if you prefer, without relying on Pants to do everything.
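
The `unzip -l` warning in the diff above ("extra bytes at beginning or within zipfile") is the scie tip and its other payloads sitting in front of the PEX zip. A minimal Python sketch for measuring that prefix follows. `bytes_before_zip` is a hypothetical helper, and it leans on an assumption about CPython's `zipfile`: when data is prepended to an archive, the module shifts each entry's `header_offset` by the prefix length.

```python
import zipfile


def bytes_before_zip(path_or_file) -> int:
    """Return how many bytes precede the first zip entry in the file."""
    with zipfile.ZipFile(path_or_file) as zf:
        # header_offset is the absolute file position of each entry's local
        # header, so the smallest one marks where the zip content starts.
        return min((info.header_offset for info in zf.infolist()), default=0)
```

For a plain PEX zip with no shebang this should be 0; for `dist/cowsay` it should presumably be in line with the byte count `unzip` warns about, though that is not something verified here.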

jsirois avatar Mar 07 '25 01:03 jsirois

OK, https://github.com/a-scie/jump/pull/283 will allow the more useful / less messy:

SCIE=split dist/cowsay dist -- pex

That will just split out the pex. Back on the Pex side I'll also follow up with a change to mark the PEX embedded in the scie as executable and provide an alternate key that matches the output file name, so you could split either the fixed pex or, say, cowsay.pex.
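
If you would rather drive the split from a script than from the shell, here is a minimal Python sketch. `split_scie` and its `dry_run` flag are hypothetical; the only scie behavior assumed is what this thread shows, namely `SCIE=split <scie> <dest dir>` plus the `-- <name>` form from the PR above, which needs a scie built with a new enough scie-jump.

```python
import os
import subprocess


def split_scie(scie: str, dest: str, *names: str, dry_run: bool = False) -> list[str]:
    """Run `SCIE=split <scie> <dest> [-- names...]` and return the argv used."""
    argv = [scie, dest]
    if names:
        # Everything after `--` selects individual files to split out.
        argv += ["--", *names]
    if not dry_run:
        # The boot command is selected through the SCIE environment variable.
        env = {**os.environ, "SCIE": "split"}
        subprocess.run(argv, env=env, check=True)
    return argv
```

For example, `split_scie("dist/cowsay", "dist", "pex")` mirrors the one-liner above, and omitting the names splits out everything as in the earlier session.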

jsirois avatar Mar 09 '25 19:03 jsirois

Alright, support for splitting a PEX scie's PEX out individually and that PEX having the expected names and perms is now all available in https://github.com/pex-tool/pex/releases/tag/v2.33.3.

jsirois avatar Mar 12 '25 05:03 jsirois