Using a different python version for a pack is no longer working. actionrunner.python_binary

Open wingiti opened this issue 4 years ago • 13 comments

SUMMARY

Using a different Python version (in my case 3.7) for packs instead of the default system one (3.6) is not working. Up to StackStorm 3.4 this worked as described in the docs: https://docs.stackstorm.com/latest/packs.html?highlight=python_binary#python-versions-in-pack-python-virtual-environment

STACKSTORM VERSION

Paste the output of st2 --version: st2 3.5.0, on Python 3.6.8

OS, environment, install method

I think it doesn't matter. I tried on our System RHEL 7 and the OVA image of Stackstorm 3.5.0

Steps to reproduce the problem

Add a different python binary to st2.conf for packs:

[actionrunner]
python_binary = /usr/bin/python3.7

Expected Results

The pack should run using Python 3.7 and finish successfully.

Actual Results

The execution stops and throws an error:

Traceback (most recent call last):
  File "/opt/stackstorm/st2/lib/python3.6/site-packages/python_runner/python_action_wrapper.py", line 35, in <module>
    import orjson
ModuleNotFoundError: No module named 'orjson'

Using some debug output in my action I was able to verify that the binary mentioned in the config is used.
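A minimal sketch of the kind of debug output that can confirm which interpreter an action runs under. The helper name `describe_interpreter` is hypothetical and not part of StackStorm; it only uses the standard library.

```python
import sys
import sysconfig


def describe_interpreter():
    """Collect the facts that identify the running Python interpreter."""
    return {
        # Path of the binary actually executing this code.
        "executable": sys.executable,
        # Version as major.minor.micro, e.g. "3.7.x".
        "version": "%d.%d.%d" % sys.version_info[:3],
        # Filename suffix this interpreter expects for compiled extensions.
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
        # Directories searched for imports, in order.
        "sys_path": list(sys.path),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Printing `sys_path` alongside the executable also shows in which order the ST2 and pack directories are searched.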

As a further debugging step I tried the following config and installed the StackStorm requirements for python3.7:

[actionrunner]
python_binary = /usr/bin/python3.7
virtualenv_opts =  --system-site-packages
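For double-checking what the two options resolve to, here is a small standalone sketch that parses an INI fragment like the one above with the standard library `configparser`. It is only an illustration: StackStorm does not load its config this way, and `read_actionrunner_options` is a hypothetical helper.

```python
import configparser

# Copy of the fragment from this report, used as sample input.
SAMPLE_CONF = """
[actionrunner]
python_binary = /usr/bin/python3.7
virtualenv_opts =  --system-site-packages
"""


def read_actionrunner_options(conf_text):
    """Return the [actionrunner] options of an INI-style config as a dict."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("actionrunner"):
        return {}
    # Surrounding whitespace of values is stripped by configparser.
    return dict(parser["actionrunner"])


if __name__ == "__main__":
    print(read_actionrunner_options(SAMPLE_CONF))
```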

It resulted in a different error:

Traceback (most recent call last):
  File "/opt/stackstorm/st2/lib/python3.6/site-packages/python_runner/python_action_wrapper.py", line 59, in <module>
    from st2common import log as logging
  File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2common/log.py", line 32, in <module>
    from st2common.logging.handlers import FormatNamedFileHandler
  File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2common/logging/handlers.py", line 24, in <module>
    from st2common.util import date as date_utils
  File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2common/util/date.py", line 24, in <module>
    import udatetime
  File "/opt/stackstorm/st2/lib/python3.6/site-packages/udatetime/__init__.py", line 20, in <module>
    from udatetime.rfc3339 import (
ModuleNotFoundError: No module named 'udatetime.rfc3339'

udatetime was also part of the installed requirements. I therefore stopped debugging at this point and hope you can help solve it, as I don't know which change caused this issue.

In addition I opened a post in the forum, but I think this GitHub issue is the better place, as the option is still mentioned in the docs but no longer works. https://forum.stackstorm.com/t/how-to-use-different-python-version-v3-7-for-actionrunner-python-binary-no-longer-working/1786

wingiti avatar Jul 27 '21 15:07 wingiti

@Kami What were the alternatives for udatetime? I know we tested this under Python3.6/Python3.8, could the issue here be the pip version used to create the python3.7 virtualenv?

nzlosh avatar Jul 27 '21 17:07 nzlosh

The pip version we have used is 20.3.3. There were some errors reported that were similar, that I think were resolved by moving to 20.3.3. It wasn't trying to use python 3.7, but checking pip version might help.

amanda11 avatar Jul 27 '21 18:07 amanda11

I have now retried the package installation after upgrading pip to version 20.3.3, and reinstalled the requirements as well. Same result, no progress. In addition I tried Python 3.9 instead of 3.7 for the actionrunner, which showed the same errors. Could someone verify whether the same errors occur on their installation?

wingiti avatar Jul 28 '21 15:07 wingiti

To confirm whether it's the specific Python version, it's probably worth setting the alternate Python to 3.8 if that's possible (on a system where the default ST2 Python is 3.6). I will try to find time tomorrow to see if I can reproduce it in that environment.

amanda11 avatar Jul 29 '21 16:07 amanda11

I just tried on a CentOS 8 system running 3.6dev: adding python3.8, installing the openldap-devel rpm, doing a pip3.8 install of the ST2 requirements, and then pointing python_binary to 3.8.

Then installed new pack and ran a python action...

Getting a similar problem, though I'm always getting the complaint about missing orjson... [1]

{
  "stdout": "",
  "stderr": "Traceback (most recent call last):
  File \"/opt/stackstorm/st2/lib/python3.6/site-packages/python_runner/python_action_wrapper.py\", line 35, in <module>
    import orjson
ModuleNotFoundError: No module named 'orjson'
",
  "exit_code": 1,
  "result": "None"
}
[1] I got the same problem when I used --system-site-packages as well...

amanda11 avatar Jul 30 '21 11:07 amanda11

Looking in the st2actionrunner log, we can find the Python path and Python executable that it uses, e.g. sudo grep "Running command" /var/log/st2/st2action*2225320.log

and if I set PYTHONPATH to that value, and then run the python exe, and import orjson then I get the same problem:

# export PYTHONPATH=/opt/stackstorm/virtualenvs/excel/lib/python3.8:/opt/stackstorm/virtualenvs/excel/lib/python3.8/site-packages:/opt/stackstorm/packs/excel/actions/lib:/opt/stackstorm/st2/lib/python3.6/site-packages
# /opt/stackstorm/virtualenvs/excel/bin/python
Python 3.8.11 (default, Jul 29 2021, 16:31:09) 
[GCC 8.4.1 20200928 (Red Hat 8.4.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import orjson
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'orjson'
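The same check can be scripted. The sketch below starts a given interpreter with a chosen PYTHONPATH and reports whether a module imports. `can_import` is a hypothetical helper, and the paths in the usage example are simply the ones from the session above.

```python
import os
import subprocess
import sys


def can_import(python_binary, module_name, pythonpath):
    """Run `import module_name` in a child interpreter with the given PYTHONPATH.

    Returns (succeeded, stderr_text).
    """
    env = dict(os.environ, PYTHONPATH=os.pathsep.join(pythonpath))
    result = subprocess.run(
        [python_binary, "-c", f"import {module_name}"],
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr.strip()


if __name__ == "__main__":
    # Example values taken from the log output above; adjust to your system.
    ok, err = can_import(
        sys.executable,  # e.g. /opt/stackstorm/virtualenvs/excel/bin/python
        "orjson",
        ["/opt/stackstorm/st2/lib/python3.6/site-packages"],
    )
    print("import ok" if ok else err)
```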

I think the problem is that the orjson that is under ST2 is for python3.6 only, as it's a C library...

# ls /opt/stackstorm/st2/lib/python3.6/site-packages/orj*
/opt/stackstorm/st2/lib/python3.6/site-packages/orjson.cpython-36m-x86_64-linux-gnu.so

/opt/stackstorm/st2/lib/python3.6/site-packages/orjson-3.5.2.dist-info:
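To see why a `cpython-36m` file is invisible to another interpreter: the import system only considers a file as an extension module if its name is exactly the module name plus one of the suffixes in `importlib.machinery.EXTENSION_SUFFIXES`. A minimal sketch follows; `is_loadable_extension` is a hypothetical helper name.

```python
from importlib.machinery import EXTENSION_SUFFIXES


def is_loadable_extension(module_name, filename):
    """True if the running interpreter would treat filename as the compiled
    extension for module_name (name + one of its accepted suffixes)."""
    return filename in {module_name + suffix for suffix in EXTENSION_SUFFIXES}


if __name__ == "__main__":
    print("accepted suffixes:", EXTENSION_SUFFIXES)
    # File name taken from the directory listing above.
    print(is_loadable_extension("orjson", "orjson.cpython-36m-x86_64-linux-gnu.so"))
```

Run under the pack's Python 3.8, the last line would be expected to print `False`, because that suffix belongs to the Python 3.6 ABI.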

When you used --system-site-packages I think you managed to get that copied to the pack (I didn't, but I've been messing around with options at the moment). I just hacked to see what happened if I copied that orjson C library into the excel pack virtualenv, and then I got the udatetime error... I then copied over udatetime, including its rfc3339 C library, and then the pack worked...

amanda11 avatar Jul 30 '21 12:07 amanda11

So I think @wingiti your copy of the system site-packages was the right thing to do. The question then is why that didn't copy over udatetime, so make sure that you have udatetime and in particular that you are getting the rfc3339.cpython-37m-x86_64-linux-gnu.so file in the path.

The problem is that with the c libraries it can't fall back to getting them from the ST2 area, as we need to make sure that the py3.7 (or whatever alternative python is being used) C libraries are in place for it.

Possibly just getting the py3.7 .so for those two modules in the st2 path might also work...

If you can experiment with this and report back, then we can probably work out how best to handle this scenario. But it's due to 3.5 using some modules that use C libraries.

amanda11 avatar Jul 30 '21 12:07 amanda11

@amanda11 Thank you for your effort. I can confirm that if I copy the udatetime python3.7 file rfc3339.cpython-37m-x86_64-linux-gnu.so to /opt/stackstorm/st2/lib/python3.6/site-packages/udatetime/, the action runs.

What I don't understand for now is why I need to do this. I installed the orjson package at system level for python3.7, and it has been used by StackStorm since I added the --system-site-packages option. I installed udatetime the same way for python3.7, and this one is not used. I can even start python3.7 and import both packages without an issue.

wingiti avatar Jul 30 '21 14:07 wingiti

Yes - I don't understand why orjson c library got picked up but udatetime didn't. The problem is effectively the same, so I couldn't figure that out.

You could try uninstalling the pack, making sure the udatetime Python 3.7 C library is in place in the system site-packages first, and trying again, to ensure it's not caused by the virtual environment having been partially created beforehand.

There's no special logic on our side so I see no reason why the method that fixed orjson didn't fix udatetime.

amanda11 avatar Jul 30 '21 15:07 amanda11

If I directly compare the situation of these two packages, the issue might be that the search path for packages points to /opt/stackstorm/st2/lib/python3.6/site-packages/ first. orjson consists of a single compiled C library. As Python 3.7 cannot import the version compiled for 3.6, it continues searching in python3.7/site-packages/.

udatetime.rfc3339 is also a compiled C library, but only the rfc3339 module of the package is compiled. Therefore Python 3.7 finds the udatetime directory in python3.6/site-packages/ and starts importing __init__.py, which fails because it cannot import the module rfc3339. My guess is that udatetime is not searched for in python3.7/site-packages/ because it was already found in python3.6. This hypothesis is supported by the fact that it also works if I rename the udatetime directory in python3.6/site-packages/. After that, udatetime is used from python3.7/site-packages/.
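The difference between the two cases can be reproduced without StackStorm. The sketch below builds two throwaway directories, one playing the role of python3.6/site-packages and one of python3.7/site-packages, and uses a made-up suffix (`.cpython-00-foreign.so`) as a stand-in for "compiled for another Python". All module and directory names are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Made-up ABI tag; no real interpreter accepts it as an extension suffix.
FOREIGN_SUFFIX = ".cpython-00-foreign.so"
INIT_SOURCE = "from demo_pkg.native import VALUE\n"


def build_layout(root):
    """Create two fake site-packages directories under root."""
    st2_dir = root / "st2_site_packages"  # role of python3.6/site-packages
    alt_dir = root / "alt_site_packages"  # role of python3.7/site-packages

    st2_dir.mkdir()
    # Single-file extension (like orjson): only a foreign .so exists here.
    (st2_dir / ("demo_single" + FOREIGN_SUFFIX)).write_bytes(b"")
    # Package (like udatetime): importable __init__.py, foreign .so submodule.
    (st2_dir / "demo_pkg").mkdir()
    (st2_dir / "demo_pkg" / "__init__.py").write_text(INIT_SOURCE)
    (st2_dir / "demo_pkg" / ("native" + FOREIGN_SUFFIX)).write_bytes(b"")

    alt_dir.mkdir()
    # Versions that the running interpreter can actually load.
    (alt_dir / "demo_single.py").write_text("VALUE = 'alt'\n")
    (alt_dir / "demo_pkg").mkdir()
    (alt_dir / "demo_pkg" / "__init__.py").write_text(INIT_SOURCE)
    (alt_dir / "demo_pkg" / "native.py").write_text("VALUE = 'alt'\n")
    return st2_dir, alt_dir


def _forget(name):
    """Drop a module and its submodules from the import cache."""
    for key in [k for k in sys.modules if k == name or k.startswith(name + ".")]:
        del sys.modules[key]


def try_import(name, search_path):
    """Import name with search_path prepended to sys.path; return VALUE or the error."""
    saved = list(sys.path)
    sys.path[:0] = [str(p) for p in search_path]
    _forget(name)
    importlib.invalidate_caches()
    try:
        return importlib.import_module(name).VALUE
    except ImportError as exc:
        return f"{type(exc).__name__}: {exc}"
    finally:
        sys.path[:] = saved
        _forget(name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        st2_dir, alt_dir = build_layout(Path(tmp))
        # Foreign single-file module is skipped, the search continues.
        print("demo_single:", try_import("demo_single", [st2_dir, alt_dir]))
        # Package directory is found first, then its submodule import fails.
        print("demo_pkg:   ", try_import("demo_pkg", [st2_dir, alt_dir]))
        # With the order reversed, the package imports fine.
        print("demo_pkg:   ", try_import("demo_pkg", [alt_dir, st2_dir]))
```

The package case fails because the first directory containing `demo_pkg/__init__.py` wins, and submodules are then only looked up inside that package directory.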

Issue might be identified, but I don't have a solution for this. Maybe you have some ideas?

wingiti avatar Aug 02 '21 14:08 wingiti

Your explanation of the difference sounds plausible, so the problem is related to the order in which packages are searched for.

CC @blag have you got any thoughts from when you last looked at this area? I'd be interested in your thoughts.

With different Python versions, the problem with udatetime looks to be that it is found in the ST2 Python path, but that only has the python3.6 C library. As the action runner in this install is set to use python3.7, it fails the import. (If the Python 3.7 C library for udatetime is added into that ST2 Python path alongside the python3.6 C library, then all is good.)

@wingiti If you amend the requirements of a pack to include udatetime, does that force it to be installed into the pack's virtualenv, so that the correct version of udatetime is picked up?

@blag I think perhaps we need to put back in some logic that we had when we supported --python3. In that case what we did was add the python3 site-packages into the virtualenv's pythonpath, so that they got picked up before the python2.7 ones. I think we might need to do similar if an install is using any value for python_binary. e.g. if you specify python_binary, then we should add in that site-packages to the pythonpath. That way they will get added to the PYTHONPATH ahead of the ST2 packages, e.g. similar to https://github.com/StackStorm/st2/blob/1689445033c8243a7ba7bafc23497feed731ff4d/st2common/st2common/util/sandboxing.py#L158-L161

or https://github.com/StackStorm/st2/blob/1689445033c8243a7ba7bafc23497feed731ff4d/st2common/st2common/util/sandboxing.py#L178-L181
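A rough sketch of the ordering idea, not the actual sandboxing.py code: the site-packages of the alternative Python are placed after the pack's own paths but ahead of the ST2 site-packages. The function name and its parameters are hypothetical.

```python
import os


def build_pythonpath(pack_paths, alt_site_packages, st2_site_packages):
    """Join path entries so the alternative Python's site-packages
    are searched before the ST2 site-packages.

    pack_paths:         paths of the pack virtualenv and pack lib (searched first)
    alt_site_packages:  site-packages of the python_binary interpreter
    st2_site_packages:  site-packages of the StackStorm virtualenv (searched last)
    """
    ordered = [*pack_paths, *alt_site_packages, st2_site_packages]
    unique = []
    for entry in ordered:
        # Skip empty entries and keep only the first occurrence of each path.
        if entry and entry not in unique:
            unique.append(entry)
    return os.pathsep.join(unique)


if __name__ == "__main__":
    # Illustrative paths only; the python3.7 location is an assumption.
    print(build_pythonpath(
        ["/opt/stackstorm/virtualenvs/excel/lib/python3.7/site-packages",
         "/opt/stackstorm/packs/excel/actions/lib"],
        ["/usr/lib64/python3.7/site-packages"],
        "/opt/stackstorm/st2/lib/python3.6/site-packages",
    ))
```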

amanda11 avatar Aug 06 '21 16:08 amanda11

@wingiti We welcome PR contributions, if you would be interested in working on a resolution instead of the workaround. I believe that adding code to the module I highlighted above, similar to what we had when --python3 was used, would resolve the issue you are seeing. I.e. if an alternative Python executable is specified, then we amend the PYTHONPATH to add the library path for that Python version ahead of the default ST2 Python lib. Then it should find your Python 3.7 libraries first.

amanda11 avatar Aug 18 '21 10:08 amanda11

@amanda11 I tried to build a fix for this issue, see Pull Request: https://github.com/StackStorm/st2/pull/5388. Maybe you can review it and tell me what you think about it. To be honest, I am not sure about all the possible locations of site-packages, or whether there is a more elegant way to identify the correct path. I have only checked it on Ubuntu and Red Hat; so far it worked.
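One possible way to identify the path without hard-coding distribution layouts is to ask the configured interpreter itself via the standard `sysconfig` module. This is only a sketch and not what the pull request does; `site_packages_for` is a hypothetical helper, and distributions may use further directories that it does not report.

```python
import json
import subprocess
import sys

# Executed inside the target interpreter; prints its site-packages locations.
QUERY = (
    "import json, sysconfig; "
    "print(json.dumps([sysconfig.get_path('purelib'), sysconfig.get_path('platlib')]))"
)


def site_packages_for(python_binary):
    """Return the purelib/platlib directories reported by python_binary."""
    result = subprocess.run(
        [python_binary, "-c", QUERY],
        check=True,
        capture_output=True,
        text=True,
    )
    paths = json.loads(result.stdout)
    # purelib and platlib are often identical; keep order, drop duplicates.
    return list(dict.fromkeys(paths))


if __name__ == "__main__":
    # Replace sys.executable with the configured python_binary,
    # e.g. /usr/bin/python3.7.
    print(site_packages_for(sys.executable))
```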

wingiti avatar Oct 12 '21 20:10 wingiti