Pex docs unclear for python scrubs

Hey there. Spoiler alert, I'm the title scrub

Part of an epic I'm on at work was to make python deployable, micro-service style. PEX fits the bill to the penny and it's currently working great. But it relies so heavily on the project's local setup.py that I spent a few weeks debugging and triaging things that really should've been documented. For example, if you bundle your project in a pex and want to run it with -c blah.py, then blah.py must be declared in the scripts array of setup.py. Or, to run a web service with gunicorn, you have to bundle gunicorn in your install_requires, since a gunicorn outside the pex can't enter it and find the flask/django/whatever module to run.

As part of Hacktober, I'm happy to update the docs with a "This is what pex expects from a setup" approach, or perhaps a section of information on "Integrating PEX into your CI/CD". Of course I'd reference and recommend Pants, but it was dramatic overkill for our use case, and invoking pex directly was precisely what we needed

Would a docs PR be welcome? I already have it documented in house and I promise to make it quality, just figured I'd start the discussion before going in on it

Oct 01 '18 16:10 asyncjake

Bottom line - docs are very welcome.

Below I have questions intended to suss out if you found the docs we already provide.

> ... project in a pex, to run -c blah.py, that blah.py must be declared in the scripts array

Hopefully you saw the docs here:

$ pex --help | grep -A4 console-script
    -c SCRIPT_NAME, --script=SCRIPT_NAME, --console-script=SCRIPT_NAME
                        Set the entry point as to the script or console_script
                        as defined by a any of the distributions in the pex.
                        For example: "pex -c fab fabric" or "pex -c mturk
                        boto".

As a python scrub, though, "script" is probably too familiar a word, whereas "console_script" might catch your eye as different, and yet both refer to setup.py entries of the same name. I can see the confusion, and higher-level docs would be welcome. Take a gander here though, in case you missed it, and apply more-clear-making-words there:

https://github.com/pantsbuild/pex/blob/master/docs/buildingpex.rst#L209-L234

> Or to run a web service like gunicorn, you have to bundle gunicorn in your install_requires, as gunicorn can't enter the pex and find the flask/django/whatever module to run

This I'm not sure I understand. Is this an instance of not zip safe?

$ pex --help | grep -A8 zip-safe
    --zip-safe, --not-zip-safe
                        Whether or not the sources in the pex file are zip
                        safe.  If they are not zip safe, they will be written
                        to disk prior to execution; Default: zip safe.
    --always-write-cache
                        Always write the internally cached distributions to
                        disk prior to invoking the pex source code.  This can
                        use less memory in RAM constrained environments.
                        [Default: False]

The long-form docs for said-same are here. IOW, does applying --not-zip-safe or --always-write-cache to the pex build solve the issue?

Oct 01 '18 17:10 jsirois

Thanks for the additional info. On the first bit, that clarifies things, but getting errors like pex.pex_builder.InvalidExecutableSpecification: Could not find script 'run.py' in any distribution app 0.0.0 within PEX! is super frustrating when you're building your own modules into distributions or web services and you know that script is there. So I'd like to add just a line of clarity in the pex -c section, so others know the one-step fix for it.
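For anyone hitting the same error, here is a minimal sketch of what -c appears to look for, under stated assumptions: the project is the app 0.0.0 distribution from the error above with a top-level run.py; the run-app console script and its app.main:main target are hypothetical names; and declared_script_names is an illustrative helper, not part of pex or setuptools. It lists the names a setup() call declares through scripts (by file basename) and through entry_points console_scripts, the two places the --help text quoted earlier refers to.

```python
import os

# Hypothetical keyword arguments for setuptools.setup() in the project's setup.py.
SETUP_KWARGS = {
    "name": "app",
    "version": "0.0.0",
    "packages": ["app"],
    # Plain script files; installed under their basename, e.g. "run.py".
    "scripts": ["run.py"],
    # Generated console scripts, written as "name = module:function".
    "entry_points": {"console_scripts": ["run-app = app.main:main"]},
    "install_requires": ["flask"],
}


def declared_script_names(setup_kwargs):
    """Return the script names that these setup() arguments declare."""
    names = {os.path.basename(path) for path in setup_kwargs.get("scripts", [])}
    console_scripts = setup_kwargs.get("entry_points", {}).get("console_scripts", [])
    for spec in console_scripts:
        # "run-app = app.main:main" -> "run-app"
        names.add(spec.split("=", 1)[0].strip())
    return names


# The real setup.py would end with:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
print(sorted(declared_script_names(SETUP_KWARGS)))
```

If the name passed to -c is not in that list, the "Could not find script" error is the likely result; declaring it under scripts or console_scripts and rebuilding the pex should be the one-step fix.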

On the second bit, describing the problem is more involved. Let's say we have a machine that will host PEX Flask microservices, and it has to support more than one app instance per machine. There isn't a way to run Gunicorn/uWSGI/etc. external to the PEX environments, like at the system level, and have it find and use the flask:app instances from the PEXs without some serious hacking/unzipping/nonsense. Solving that disconnect is as simple as bundling gunicorn with the PEX at build time, and having a service manager deal with starting and stopping the PEX instance. So I'd like to add a note about WSGI applications in general, as I want to save someone else the time I burned while fruitlessly researching how to get an external gunicorn or uwsgi service to serve up wsgi-compatible flask apps packed in PEX.
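A rough sketch of that build-and-run flow, written as argv lists so it could be dropped into a CI script. Assumptions to flag: the project is installable from the current directory (so "." is passed as a requirement next to gunicorn), the Flask object is reachable as app:app, the output name app.pex and the bind address are placeholders, and both helper functions are hypothetical. The only pex options used are -c and -o; --bind is gunicorn's own flag.

```python
import shlex


def pex_build_argv(output, requirements=(".", "gunicorn")):
    """argv for building a PEX whose entry point is the gunicorn console script."""
    return ["pex", *requirements, "-c", "gunicorn", "-o", output]


def pex_run_argv(pex_path, wsgi_target, bind):
    """argv for starting the bundled gunicorn; arguments after the PEX path go to gunicorn."""
    return [pex_path, wsgi_target, "--bind", bind]


# Print the commands rather than running them, since pex may not be installed here.
print(shlex.join(pex_build_argv("app.pex")))
print(shlex.join(pex_run_argv("./app.pex", "app:app", "127.0.0.1:8000")))
```

With more than one app instance per machine, each PEX would get its own bind address, and the service manager would start and stop each file independently.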

Important note here, I'm coming at this from the "PEX is a really handy tool for dealing with python on the operations side" perspective. Having spent a bit of time figuring out setup.py assumptions, it all makes more sense, but for those coming at this from an exploratory angle, having a couple notes here to clarify those expectations could save tons of time and frustration

Oct 02 '18 23:10 asyncjake

So, full disclosure, I haven't tried adding the flags you mention to the build, but the issue seems a bit beyond how the PEX is managed on disk. I'm fully open to better solutions, but Gunicorn+Flask bundles are <10MB and, so far, a great production-capable solution.

Oct 02 '18 23:10 asyncjake

OK, great - I'm happy to see a docs PR. On the second bit, I had already assumed you were including gunicorn in the PEX but then having problems with it running from the PEX (django wsgi apps have this problem). So, even better! I had not thought about the idea of trying to wire an existing gunicorn server to a pex, and it would be great to help other folks try not to do that.

Oct 03 '18 15:10 jsirois

Awesome! I've got a fork set up with the -c error note added, but I'm not sure where the best place might be for gunicorn bits. Would a new page "running pex" at the same level as buildingpex.rst be appropriate, or do you have a better location in mind?

Having been on this project for a bit, I've also got a semi-decent upstart script that could go with it, but given the controversial rise of systemd I don't know if that would be relevant to add

Oct 03 '18 16:10 asyncjake

You may want to have a look here and here before diving in to the wsgi docs; you might be able to steal useful bits. As to location, I'm really not sure. This is not really about running pex, it's about what to include in a build in a narrow but important slice of pex use cases. Perhaps a recipes.rst? That leaves room to add an eclectic mix and later break out sub-pages if the recipes become numerous, or if a particular subset does and begs for its own page.

The init script does seem a bit off topic. If a command line is useful showing how to run the pexed app, that's more on target. Someone setting up daemonization should be able to take it from there.

Oct 03 '18 17:10 jsirois

Awesome. I've got the PR open at #574 - it doesn't dive into using uWSGI or other aspects of that, but I could certainly add a link to kwlzn/pyuwsgi_pex somewhere.

Oct 03 '18 20:10 asyncjake