Fails if dependencies include .pth files
Context
As a Mu Editor contributor, and having worked a bit on improving its packaging, I joined the sprint today in order to understand how easy/difficult it will be for Mu to be packaged with the newer briefcase 0.3 release.
Facts about Mu packaging:
- macOS Application Bundle produced with briefcase 0.2.x.
- Windows installer produced with pynsist.
- PyPI wheel produced with wheel + setuptools.
- Also included in Debian, Ubuntu, and Raspbian package repositories. Maybe Fedora, too. Not 100% sure because Linux distribution packaging has been, AFAICT, handled outside of the development tree (or maybe it hasn't had much love in the last 12 months or so).
Challenges about Mu packaging:
- Having a single source of truth WRT dependency specification and metadata.
  - The current solution is setuptools based, with all of the information sourced from setup.py + setup.cfg (from there, we use a non-trivial script to produce the necessary pynsist input such that it can do its thing on Windows).
The Issue
Packaging Mu on Windows leads to an only partially working Mu installation, for various reasons. In this issue, I'm focusing on a failure that results when trying to bring up its Python REPL -- it leads to a crash (Mu's fault) because a particular module fails to import, resulting in an unhandled exception.
Specifics:
- Mu's REPL is based on qtconsole that, on Windows, ends up requiring pywin32.
- pywin32 uses .pth files to guide site.py in populating sys.path in a specific way.
Bottom line:
- Briefcase packaged Windows applications that depend on pywin32 fail to import win32api, one of its modules.
Investigation
After a few hints from @freakboy3742 and @dgelessus at Gitter, here's what's going on:
- The Python support package being used is the default one: one of the "embeddable zip file" packages from https://www.python.org/downloads/windows/. (FTR, all of this was explored with Python 3.6.8 64 bit).
  - In particular it includes a pythonXX._pth that:
    - Makes Python run in isolated mode...
    - ...and prevents site.py from being loaded...
    - ...per these docs here.
- Such pythonXX._pth file is actually overwritten by briefcase in order to:
  - Add the src\app and src\app_packages to sys.path such that both the application and its dependencies can be successfully imported.
- However, the presence of the ._pth file:
  - Prevents the site module from being loaded at startup...
  - ...which would be responsible for populating sys.path from any .pth files that are found on the site-packages directories.
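To see which of these conditions apply to a given interpreter, something like the following can be run with the bundled python.exe. It is only a diagnostic sketch: the helper name import_state is made up for illustration, and it reports nothing beyond what sys itself exposes.

```python
import sys


def import_state():
    """Collect the interpreter settings that decide whether .pth files get processed."""
    return {
        # Whether the interpreter is running in isolated mode.
        "isolated": bool(sys.flags.isolated),
        # True when the automatic "import site" at startup was suppressed.
        "no_site": bool(sys.flags.no_site),
        # Whether the site module has been imported so far, by whatever means.
        "site_imported": "site" in sys.modules,
        "sys_path": list(sys.path),
    }


if __name__ == "__main__":
    for key, value in import_state().items():
        print(key, "=", value)
```

Comparing the output with and without the ._pth file in place should make it clear whether site ever got a chance to look at any .pth files.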
A Successful Hack
With this information, I invested some time fiddling with these things to see if, at least, I could make it work. Here's what I did that resulted in a working Mu REPL (thus, its underlying import win32api working!):
- Hacked the cookiecutter template's briefcase.toml file (took me quite a while figuring out where this file was coming from!!!):
  - Set app_packages_path to src\python\lib\site-packages, instead.
- Then, hacked briefcase's overwriting of pythonXX._pth to produce:

  pythonXX.zip
  .
  lib\site-packages
  import site
  ..\app

  - This lets Python find the Standard Library with the first 2 lines...
  - ...find application dependencies with the 3rd line...
  - ...has site imported such that .pth files in the application dependencies are handled...
  - ...lastly adds the application package path such that it can be imported and run.
- Lastly, I observed that having site imported led to an over-populated, non-safe sys.path. For some reason, my local Python installation's site-packages was being added, and then maybe some more.
- With that, the last step of the hack was creating a sitecustomize.py, which is automatically picked up when site is imported per the revamped pythonXX._pth. Here's my take:

  import re
  import sys

  _IS_LOCAL = re.compile(r'(src\\python|src\\app)', re.IGNORECASE)
  sys.path = [path for path in sys.path if _IS_LOCAL.search(path)]
With these three changes, the import win32api in Mu succeeds and, ultimately, Mu works as intended WRT providing a Python REPL.
Thoughts
- Handling .pth files in dependencies is a must, I'd venture saying.
- My approach is the result of fiddling and exploring and I don't like it very much (but it works!). It feels unnecessarily complicated and, thus, brittle.
- Somewhat orthogonal, but related, maybe having a venv to which dependencies are pip installed instead of pip install --targeted will be more robust, at least for the platforms where that is feasible. No worries about playing with import PATHs in three distinct places (well, maybe some PATH cleaning up could be in order, to guarantee isolation -- see my sitecustomize.py, above).
All in all, I leave this issue here in the hope that some useful discussion comes out of it, while serving as a hack-guide to someone facing similar failures. I'm not sure yet about what could be a solid, sustainable, simple, future-proof solution.
Shall we discuss? :)
PS: I suppose similar failures will happen on any platform, as long as .pth files are brought in by dependencies.
NOTE: The sitecustomize.py I pasted in the previous comment is too aggressive -- it works with briefcase run but apparently fails after briefcase package MSI installation. Realized that after the fact. Will come back and paste something that works and excludes any paths that are not application/dependency/bundled standard library paths. Needs investigation. :)
Thanks for the thorough investigation and writeup!
For background: app_packages was introduced because it allowed us to isolate the support package from the dependencies. That's not a huge concern for "normal" Python installs because the Python interpreter is installed once and packages are installed into that interpreter (or a virtual environment); but in an app packaging world, updating the support package is something you're more likely to want to do independent of dependencies. I'm not fundamentally opposed to breaking this separation and using site_packages; but it's worth being aware what the consequence of that decision would be.
That said - the remaining fixes all seem like (a) a good set of changes, and (b) not fundamentally incompatible with using a separate app_packages folder - we just need to add app_packages to the python3.X._pth file.
The fact that your local Python's path is being added to sys.path is definitely odd - and definitely something we want to avoid; the site path filtering seems like an interesting approach, although I guess the real fix is to work out why the extra path elements are leaking into sys.path in the first place.
Ideally, these changes wouldn't be something baked into the Briefcase sources either - they'd be something included in the briefcase template so that an end-user can easily customize the contents.
Thanks for the feedback.
That said - the remaining fixes all seem like (a) a good set of changes, and (b) not fundamentally incompatible with using a separate app_packages folder - we just need to add app_packages to the python3.X._pth file.
It's already there, in the current master! But that isn't enough, apparently. My understanding:
- The ._pth file prevents site from being imported (unless it explicitly imports it).
- But site runs the code that processes the regular .pth files in site-packages.
The fact that your local Python's path is being added to sys.path is definitely odd - and definitely something we want to avoid; the site path filtering seems like an interesting approach, although I guess the real fix is to work out why the extra path elements are leaking into sys.path in the first place.
Yet to be clarified:
- Is it because site is being explicitly brought in, in my hacked ._pth file?
- How is sys.path different if no ._pth file is ever there?
Ideally, these changes wouldn't be something baked into the Briefcase sources either - they'd be something included in the briefcase template so that an end-user can easily customize the contents.
Understood and generally agreed (even though, as a side comment, I'd like to have briefcase embed/bundle the "default" templates itself, such that it can be used completely offline -- a whole different topic and subsequent discussion). :)
I will investigate further and share my findings.
Investigation
Environment:
- Windows 10 + www.python.org's 64 bit Python 3.6 installed for current user at C:\Users\test\AppData\Local\Programs\Python\Python36.
- Working with current Mu Editor and briefcase master.
- Created a minimal pyproject.toml file.
- Repository root at C:\Users\test\work\github.com\mu.
- Working with hacked briefcase-windows-msi-template that produces per-user MSI installers (see #382).
Objective:
- Must be able to import PyQt5 and win32api.
- sys.path must not include PATHs outside of the application directory.
Starting Point: current master
Package Source Observations
(after running briefcase package)
C:\Users\test\work\github.com\mu>"windows\Mu Editor\src\python\python.exe" -q
>>> import sys
>>> print(*map(repr, sys.path), sep='\n')
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python\\python36.zip'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\\\app'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\\\app_packages'
>>> import PyQt5
>>> PyQt5
<module 'PyQt5' from 'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\\\app_packages\\PyQt5\\__init__.py'>
>>> import win32api
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'win32api'
Summary:
- sys.path looks good.
- Imports PyQt5 from the right source.
- Fails at importing win32api.
User-Installed Package Observations
(after installing MSI package)
C:\Users\test>"AppData\Local\Programs\Mu Editor\python\python.exe" -q
>>> import sys
>>> print(*map(repr, sys.path), sep='\n')
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\python\\python36.zip'
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\python'
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\\\app'
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\\\app_packages'
>>>
>>> import PyQt5
>>> PyQt5
<module 'PyQt5' from 'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\\\app_packages\\PyQt5\\__init__.py'>
>>>
>>> import win32api
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'win32api'
Summary:
- sys.path looks good.
- Imports PyQt5 from the right source.
- Fails at importing win32api.
Where to, next?
Facts:
- The site module processes .pth files (used by pywin32 to provide win32api, here).
- The presence of python36._pth prevents site from being auto-imported.
Options:
1. Add import site to python36._pth.
2. Drop the python36._pth file.
If we go with 2., a way of adding src\app and src\app_packages to sys.path must be put in place. Options:
- Create a sitecustomize.py that adds them to the PATH.
- Add a .pth file in src\python\lib\site-packages pointing to them.
Option 1 - Add import site to python36._pth
Contents of hacked python36._pth after the change:
python36.zip
.
..\\app
..\\app_packages
import site
Package Source Observations
(after running briefcase package)
C:\Users\test\work\github.com\mu>"windows\Mu Editor\src\python\python.exe" -q
>>> import sys
>>> print(*map(repr, sys.path), sep='\n')
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python\\python36.zip'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\app'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\app_packages'
'C:\\Users\\test\\AppData\\Roaming\\Python\\Python36\\site-packages'
>>>
>>> import PyQt5
>>> PyQt5
<module 'PyQt5' from 'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\app_packages\\PyQt5\\__init__.py'>
>>>
>>> import win32api
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'win32api'
Summary:
- sys.path now includes a foreign PATH.
- Imports PyQt5 from the right source.
- Fails at importing win32api.
User-Installed Package Observations
(after installing MSI package)
C:\Users\test>"AppData\Local\Programs\Mu Editor\python\python.exe" -q
>>> import sys
>>> print(*map(repr, sys.path), sep='\n')
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\python\\python36.zip'
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\python'
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\app'
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\app_packages'
'C:\\Users\\test\\AppData\\Roaming\\Python\\Python36\\site-packages'
>>>
>>> import PyQt5
>>> PyQt5
<module 'PyQt5' from 'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\app_packages\\PyQt5\\__init__.py'>
>>>
>>> import win32api
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'win32api'
Summary:
- sys.path now includes a foreign PATH.
- Imports PyQt5 from the right source.
- Fails at importing win32api.
Add import site to python36._pth summary
Positive:
- Nothing.
Negative:
- sys.path now polluted.
Thoughts:
- For some reason, importing site did not process the .pth files in src\app_packages.
- Tried bringing the import site line in python36._pth up but observed nothing different.
- Adding a sitecustomize.py from this point on might help -- TODO?
Option 2 - Drop the python36._pth file.
Package Source Observations
(after running briefcase package)
C:\Users\test\work\github.com\mu>"windows\Mu Editor\src\python\python.exe" -q
>>> import sys
>>> print(*map(repr, sys.path), sep='\n')
''
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python\\python36.zip'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python\\DLLs'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python\\lib'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python'
'C:\\Users\\test\\AppData\\Roaming\\Python\\Python36\\site-packages'
>>>
>>> import PyQt5
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'PyQt5'
>>>
>>> import win32api
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'win32api'
Summary:
- sys.path is missing app and app_packages, and is polluted with '' and the local Python installation's site-packages.
- None of the PyQt5 / win32api imports work.
User-Installed Package Observations
(didn't even try)
Drop the python36._pth file summary
Positive:
- Nothing.
Negative:
- sys.path missing PATHs and polluted.
- None of the imports work.
Thoughts:
- No solution was expected from this: just focused on observing behaviour.
- Will try using the default www.python.org supplied python36._pth next.
- Adding a sitecustomize.py from this point on might help -- TODO?
Option 2a - Default python36._pth file.
Contents, as supplied in the Python embeddable package:
python36.zip
.
# Uncomment to run site.main() automatically
#import site
Package Source Observations
(after running briefcase package)
C:\Users\test\work\github.com\mu>"windows\Mu Editor\src\python\python.exe" -q
>>> import sys
>>> print(*map(repr, sys.path), sep='\n')
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python\\python36.zip'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python'
>>>
>>> import PyQt5
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'PyQt5'
>>>
>>> import win32api
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'win32api'
Summary:
- sys.path looking good, but obviously missing PATHs.
- None of the PyQt5 / win32api imports work.
User-Installed Package Observations
(didn't even try)
Default python36._pth file summary
Positive:
- Nothing.
Negative:
- Imports don't work.
Thoughts:
- No solution was expected from this: just focused on observing behaviour.
- Adding import site to python36._pth and sitecustomize.py from this point might help -- TODO?
Stop and Think
Issue
From the experiments above, as soon as site is auto-imported -- either by explicitly adding it to python36._pth, or by dropping that file completely -- sys.path becomes polluted with non-local PATHs.
Explanation:
- site does it when it finds a user base and site-packages by looking at the result of sysconfig.get_config_var('userbase') and sysconfig.get_path('purelib', 'nt_user') - see code in getuserbase and getusersitepackages.
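A quick way to see what site considers the per-user location, and whether it ended up on sys.path, is sketched below. The helper name user_site_report is hypothetical; the calls it makes (site.getusersitepackages, site.ENABLE_USER_SITE, sysconfig.get_config_var) are the standard library ones mentioned above.

```python
import site
import sys
import sysconfig


def user_site_report():
    """Report the user base / user site-packages that site would add to sys.path."""
    user_site = site.getusersitepackages()
    return {
        "userbase": sysconfig.get_config_var("userbase"),
        "user_site": user_site,
        # True, False, or None, depending on flags and security checks.
        "enable_user_site": site.ENABLE_USER_SITE,
        "on_sys_path": user_site in sys.path,
    }


if __name__ == "__main__":
    for key, value in user_site_report().items():
        print(key, "=", value)
```

In the sessions above, the user_site value would presumably be the C:\Users\test\AppData\Roaming\Python\Python36\site-packages entry that shows up as the foreign PATH.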
Issue
Understand how/when site processes .pth files and why our option 1 above -- adding import site to python36._pth -- apparently did not process the .pth files in src\app_packages.
Explanation:
- .pth file processing is handled in addsitedir that delegates work to addpackage.
- addsitedir is called by addsitepackages and addusersitepackages.
- Both are called from main that is called on import, conditionally here: if not sys.flags.no_site: main()
- One could wonder if main actually ran: it feels safe saying so, given that our option 1 resulted in a polluted sys.path that only site could have achieved.
- Thus, for some reason, addsitedir was never called with the custom PATHs in python36._pth: ..\app and ..\app_packages.
...time passes ...code is read ...hacked with ...and print-debugged (is there a better way?) :-)
Culprit:
- site is indeed imported.
- main is indeed run.
- addsitepackages is called, however:
  - The custom PATHs from python36._pth are present in the known_paths argument.
  - The code only adds new PATHs -- sourced from getsitepackages -- and only those new PATHs are passed to addsitedir.
  - Thus, whichever PATHs are in sys.path when site is imported are never processed for .pth files.
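For contrast, calling site.addsitedir explicitly on a directory does process its .pth files, even when that directory is already on sys.path. The sketch below shows this with a throwaway directory; the win32 subdirectory and demo.pth file are stand-ins for illustration, not the real pywin32 layout.

```python
import os
import site
import sys
import tempfile


def demo_addsitedir():
    """Show that addsitedir() honours .pth files in a directory already on sys.path."""
    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, "win32"))
    with open(os.path.join(root, "demo.pth"), "w") as handle:
        # A relative entry, resolved against the directory holding the .pth file.
        handle.write("win32\n")

    # Mimic a ._pth entry: the directory is on sys.path before site ever sees it.
    sys.path.append(root)
    before = list(sys.path)

    site.addsitedir(root)
    added = [entry for entry in sys.path if entry not in before]
    return root, added


if __name__ == "__main__":
    root, added = demo_addsitedir()
    print("new sys.path entries:", added)
```

The only new entry should be the win32 subdirectory named in demo.pth; the directory itself is not added a second time.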
Status
Apparent Scenario
- Must auto-import site such that .pth files are handled.
- The custom PATHs in python36._pth are not processed for .pth files.
- Auto-importing site pollutes sys.path, which will then need cleaning.
Possible ways forward
A. "Kind of ugly" option
- Add import site to current python36._pth, with the custom PATHs.
- Create a sitecustomize.py that both (a) calls site.addpackage to process .pth files in ..\app_packages to further populate sys.path and (b) cleans up the polluted sys.path.
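A rough sketch of what such a sitecustomize.py could build on follows. Several things here are assumptions rather than anything Briefcase provides: the helper names is_under and build_sys_path are made up, site.addsitedir is used instead of site.addpackage (it walks every .pth file in a directory), and "local" is taken to mean "below the application root" rather than matching a regular expression.

```python
import os
import site
import sys
from pathlib import PurePath


def is_under(path, root):
    """Return True if path is root itself or lives somewhere below it."""
    candidate = PurePath(os.path.abspath(path))
    base = PurePath(os.path.abspath(root))
    return candidate == base or base in candidate.parents


def build_sys_path(app_root, pth_dirs):
    """Process .pth files in pth_dirs, then return the sys.path entries under app_root."""
    for directory in pth_dirs:
        if os.path.isdir(directory):
            # addsitedir() handles .pth files even if the directory is already on sys.path.
            site.addsitedir(directory)
    return [entry for entry in sys.path if entry and is_under(entry, app_root)]


# Possible use inside sitecustomize.py. The layout is an assumption taken from the
# sessions above, where python.exe sits one level below the application root:
#
#   app_root = os.path.dirname(os.path.dirname(sys.executable))
#   sys.path[:] = build_sys_path(app_root, [os.path.join(app_root, "app_packages")])
```

Filtering by a common root instead of by src\python / src\app substrings is meant to behave the same before and after MSI installation, but that has not been verified against a real Briefcase build.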
B. "Might be nice but won't work" option
- Remove custom PATHs from python36._pth and add import site to it.
- Add a src\python\lib\site-packages\briefcase.pth with relative paths to src\app and src\app_packages.
- Would be elegant but .pth files are not processed recursively. Thus, the .pth files in src\app_packages -- the ones we really care about -- would not be processed.
(this was a close one! oh, frustration!)
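The non-recursive behaviour can be reproduced in isolation. The sketch below builds a throwaway layout that only loosely mirrors option B (a briefcase.pth in one directory pointing at a sibling app_packages directory that has its own .pth file); the names are illustrative, not the real Briefcase tree.

```python
import os
import site
import sys
import tempfile


def demo_not_recursive():
    """Return (app_packages on sys.path?, directory named by its own .pth on sys.path?)."""
    base = tempfile.mkdtemp()
    site_packages = os.path.join(base, "site-packages")
    app_packages = os.path.join(base, "app_packages")
    nested = os.path.join(app_packages, "win32")
    os.makedirs(site_packages)
    os.makedirs(nested)

    # site-packages/briefcase.pth points at the sibling app_packages directory.
    with open(os.path.join(site_packages, "briefcase.pth"), "w") as handle:
        handle.write(os.path.join("..", "app_packages") + "\n")
    # app_packages/pywin32.pth would add app_packages/win32, if it were ever read.
    with open(os.path.join(app_packages, "pywin32.pth"), "w") as handle:
        handle.write("win32\n")

    site.addsitedir(site_packages)
    return app_packages in sys.path, nested in sys.path


if __name__ == "__main__":
    print(demo_not_recursive())
```

The first value should be True and the second False: app_packages is added to sys.path, but the .pth file inside it is never read.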
C. "Not sure if it's really that bad, but don't like it very much" option
- Set briefcase.toml's app_packages_path to src\python\lib\site-packages.
- Remove custom PATHs from python36._pth and add import site to it.
- Create a sitecustomize.py that cleans up sys.path from site-polluted entries.
(may limit updating the support package, like @freakboy3742 noted above -- then again, maybe not: support package isn't supposed to touch ...\lib\site-packages which is "local" by definition).
D. "What about a venv, which feels solid, but will probably be a mess" option.
- Create a virtual environment to host the application and dependencies.
- Use that venv's python to install dependencies.
- Move/copy application package into that venv's site-packages -- why not?
- No PATH fiddling, I suppose -- wondering if venv's sys.path would include foreign PATHs?
- Why this isn't as clean/easy as it might be:
  - The support package's python does not include the venv module on Windows. :(
  - No way this could be done when targeting a foreign architecture -- can't run python to create venv and then pip install things.
E. "Is there any other option" option
- Take a rest and see if letting the mind off this for a while helps ideas settle down and/or pop up.
:-)
I had the same issue with packaging pywin32 in Briefcase and, as @tmontes correctly figured out, *.pth files are correctly packed by Briefcase, but the link to those *.pth files is wrong.
The Python package site and its variable site.USER_SITE point to an incorrect path (for me it was C:\\Users\\RuneMonzel\\AppData\\Roaming\\Python\\Python38\\site-packages) and thus the .pth files in the app_packages folder will not be loaded. However, Python's documentation says you can modify the import behaviour with the site package: https://docs.python.org/3/library/site.html
This fix should be applied if any module does extra imports via a .pth file, which is normally located in site-packages (or, when using Briefcase, in app_packages).
A fast fix:
import sys

try:
    import win32con, win32event, win32process
    from win32com.shell.shell import ShellExecuteEx
    from win32com.shell import shellcon
except ModuleNotFoundError:
    print("Try to find 'app_packages' folder and to add this to python's 'site' package.")
    # Look for the Briefcase 'app_packages' folder among the current sys.path entries.
    app_packages = ""
    for path in sys.path:
        if path.endswith("app_packages"):
            app_packages = path
    if app_packages == "":
        raise ModuleNotFoundError
    else:
        import site
        site.USER_SITE = app_packages  # correct the 'site-packages' path to 'app_packages' path
        site.main()  # re-run site so that all .pth files in 'app_packages' are added to sys.path
        import win32con, win32event, win32process
        from win32com.shell.shell import ShellExecuteEx
        from win32com.shell import shellcon

for path in sys.path:
    print(path)
The output of the code above shows that sys.path is now extended with the paths found in pywin32.pth:
Try to find 'app_packages' folder and to add this to python's 'site' package.
C:\Users\RuneMonzel\AppData\Local\Programs\AutoLucid\python\python38.zip
C:\Users\RuneMonzel\AppData\Local\Programs\AutoLucid\python
C:\Users\RuneMonzel\AppData\Local\Programs\AutoLucid\app
C:\Users\RuneMonzel\AppData\Local\Programs\AutoLucid\app_packages
C:\Users\RuneMonzel\AppData\Local\Programs\AutoLucid\app_packages\win32
C:\Users\RuneMonzel\AppData\Local\Programs\AutoLucid\app_packages\win32\lib
C:\Users\RuneMonzel\AppData\Local\Programs\AutoLucid\app_packages\Pythonwin
@freakboy3742 :
I am wondering if it is possible to change the site.USER_SITE variable to the folder app_packages in the standard cookie-cutter template? Thus all python modules with .pth dependencies would be imported normally.
UPDATE: The following does not work because pythonXX._pth only accepts import site:
One way would be importing an extra module in pythonXX._pth, like import sitecustomize, which calls this kind of code:
import sys
import site

app_packages = ""
for path in sys.path:
    if path.endswith("app_packages"):
        app_packages = path
if app_packages != "":
    site.USER_SITE = app_packages  # correct the 'site-packages' path to 'app_packages' path
    site.main()  # re-run site so that all .pth files in 'app_packages' are processed
@monzelr Thanks for the extra detail. I think this may be tracking the same problem as #669; and yes - this is absolutely something that should be fixed. The general approach you've described makes sense; we'll need to find a good place to drop a sitecustomize script so it is picked up on all platforms.