ros2cli

current ament_python template doesn't work for virtual environments

Open cortesvitor opened this issue 8 months ago • 2 comments

Operating System:

Ubuntu 24.04.1 LTS 'Noble'

ROS version or commit hash:

Jazzy

RMW implementation (if applicable):

rmw_fastrtps_cpp

RMW Configuration (if applicable):

No response

Client library (if applicable):

colcon

'ros2 doctor --report' output

ros2 doctor --report
<COPY OUTPUT HERE>

Steps to reproduce issue

  1. Create a workspace directory:
mkdir -p test_ws/src
  2. Navigate into the workspace, then create and activate a Python virtual environment with --system-site-packages (required for the build):
cd test_ws/
python3 -m venv .venv --system-site-packages
source .venv/bin/activate
  3. Source the ROS 2 Jazzy environment:
source /opt/ros/jazzy/setup.bash
  4. Navigate to the source directory and create a pure Python package test_pkg with an executable node test_node:
cd src/
ros2 pkg create --build-type ament_python test_pkg --node-name test_node
  5. Install a Python module (for example, pandas) using pip within the virtual environment:
pip3 install -U pandas
  6. Modify the test_node.py file to import the pandas module:
import pandas
  7. Build the workspace and then source the resulting environment:
colcon build
. install/setup.bash
  8. Attempt to run the test_node executable:
ros2 run test_pkg test_node

Expected behavior

The node should run without any import errors related to the pandas module and produce the following output:

Hi from test_pkg.

Actual behavior

The node fails to run and raises a ModuleNotFoundError for the pandas module, as shown in the traceback below:

Traceback (most recent call last):
  File "~/test_ws/install/test_pkg/lib/test_pkg/test_node", line 33, in <module>
    sys.exit(load_entry_point('test-pkg==0.0.0', 'console_scripts', 'test_node')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/test_ws/install/test_pkg/lib/test_pkg/test_node", line 25, in importlib_load_entry_point
    return next(matches).load()
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/importlib/metadata/__init__.py", line 205, in load
    module = import_module(match.group('module'))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/importlib/__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 995, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "~/test_ws/install/test_pkg/lib/python3.12/site-packages/test_pkg/test_node.py", line 1, in <module>
    import pandas as pd
ModuleNotFoundError: No module named 'pandas'
[ros2run]: Process exited with failure 1

Additional information

On recent Ubuntu systems (24.04+), there's a shift away from installing Python packages globally, pushing users towards virtual environments for better project isolation. While ament_cmake packages correctly use the Python interpreter specified in the virtual environment (via the #!/usr/bin/env python3 shebang), ament_python packages might be mishandling this.

The problem arises during the colcon build process. Instead of embedding a shebang that points to the Python interpreter within the activated virtual environment, the build tools for ament_python seem to be hardcoding the system's default Python interpreter in the generated executable scripts.

Because of this, running a Python node from an ament_python package after installing a library like pandas only in the virtual environment invokes the system's Python. The system Python doesn't know about the packages installed in the isolated virtual environment, which leads to the ModuleNotFoundError.
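One way to confirm which interpreter a generated script is pinned to is to read its first line. The following is a minimal sketch, not part of colcon or setuptools; the helper names read_shebang and describe_shebang are hypothetical and exist only for illustration.

```python
import os
import sys


def read_shebang(script_path):
    """Return the text after '#!' on the first line, or None if there is no shebang."""
    with open(script_path, "rb") as f:
        first_line = f.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None
    return first_line[2:].strip()


def describe_shebang(script_path, current_interpreter=sys.executable):
    """Summarize how the script's shebang relates to the given interpreter path."""
    shebang = read_shebang(script_path)
    parts = shebang.split() if shebang else []
    if not parts:
        return "no shebang"
    interpreter = parts[0]
    # '#!/usr/bin/env python3' is looked up on PATH, so it follows an activated venv.
    if os.path.basename(interpreter) == "env":
        return "resolved via PATH at run time"
    # Compare the literal paths; symlinks are deliberately not resolved here.
    if os.path.normpath(interpreter) == os.path.normpath(current_interpreter or ""):
        return "matches current interpreter"
    return "hardcoded to " + interpreter


if __name__ == "__main__":
    # Pass one or more script paths on the command line.
    for path in sys.argv[1:]:
        print(path, "->", describe_shebang(path))
```

Run from inside the activated virtual environment, it can be pointed at the script named in the traceback above (install/test_pkg/lib/test_pkg/test_node, relative to the workspace) to see whether the shebang names the venv's interpreter or a fixed system path.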

This likely went unnoticed earlier because developers (including myself) may have had these Python libraries installed globally on their systems. In that scenario, regardless of which Python interpreter the script tried to use, the pandas library would have been accessible, masking the underlying shebang issue. Now that Ubuntu is more strictly recommending virtual environments, this discrepancy between ament_cmake and ament_python package handling is becoming apparent.

cortesvitor, May 12 '25 08:05

I'm running into this issue now. Is this still considered a bug? I see that both #1025 and ros2/ros2_documentation#5596 were never merged.

If the desired functionality is to have the shebang of colcon's generated scripts reference the "current" python interpreter, I'd like to work on this.

Shane-Stevenson, Dec 02 '25 20:12

> Is this still considered a bug?

In the sense that this is still a gaping hole for new users to step in when they deviate from the official installation instructions, this is still a bug. All of the individual pieces of machinery are functioning as intended, but in the presence of a venv, the results can be difficult to unravel and understand.

@cortesvitor's original post is correct - the problems center around shebangs and packaging practices, and how they're fighting with the hidden mechanisms for how Python virtual environments affect the Python import search path.
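For readers unfamiliar with that mechanism: per PEP 405, the interpreter decides whether it is "in" a virtual environment by looking for a pyvenv.cfg file next to its own executable or one directory above it, not by checking whether a venv was activated in the shell. The sketch below mimics that lookup under those assumptions; find_pyvenv_cfg is a hypothetical helper name, and the real logic lives in the interpreter's startup code.

```python
import os
import sys


def find_pyvenv_cfg(executable):
    """Look for pyvenv.cfg beside the executable or one directory up.

    The path is made absolute but symlinks are not resolved, because a
    venv's python3 is typically a symlink to the system interpreter.
    """
    exe_dir = os.path.dirname(os.path.abspath(executable))
    for candidate_dir in (exe_dir, os.path.dirname(exe_dir)):
        candidate = os.path.join(candidate_dir, "pyvenv.cfg")
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("pyvenv.cfg: ", find_pyvenv_cfg(sys.executable))
    # VIRTUAL_ENV is set by the activate script; it does not change sys.path.
    print("VIRTUAL_ENV:", os.environ.get("VIRTUAL_ENV"))
    print("in venv:    ", sys.prefix != sys.base_prefix)
```

Running this once with the venv's python3 and once with /usr/bin/python3 in the same activated shell should show VIRTUAL_ENV set in both cases while only the former reports a pyvenv.cfg, which is why a script whose shebang names the system interpreter does not see packages installed in the venv.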

> If the desired functionality is to have the shebang of colcon's generated scripts reference the "current" python interpreter, I'd like to work on this.

It's worth calling out that we aren't doing anything atypical here. All of the shebangs on all of the scripts in the colcon debian packages follow standard Debian packaging practices, and all of the shebangs on all of the scripts created as colcon builds packages are handled by setuptools itself. As much as this might seem like a cut-and-dried bug in this context, patching the debs to use /usr/bin/env python3 will simply swap this "behavioral footgun" for others, particularly import errors when invoking colcon itself while a venv is in use.

I have some ideas on how we might improve this story when we make the move to the standards-based Python build pipeline, where there is an explicit boundary between the build backend and frontend that we can take advantage of. Until then, my advice remains the same: if you're going to use a venv when building ROS packages, use it for everything. Looking at this problem another way, the issues stem from an attempt to mix packaging systems. If all of the ROS Python dependencies (and colcon itself) are installed in the venv via pip and no executables from the system packages are invoked, you won't see these issues.
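A rough way to check whether a setup is mixing packaging systems is to ask where each module would be imported from. The sketch below is an assumption-laden illustration rather than an official tool: classify_module and is_under are hypothetical helpers, and the default module names in the usage block are only examples.

```python
import importlib.util
import os
import sys


def is_under(path, prefix):
    """Return True if path is located inside the prefix directory."""
    path = os.path.abspath(path)
    prefix = os.path.abspath(prefix)
    try:
        return os.path.commonpath([path, prefix]) == prefix
    except ValueError:
        # Raised for paths on different drives (Windows).
        return False


def classify_module(name, venv_prefix):
    """Report whether a top-level module resolves inside venv_prefix."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return "missing"
    if spec is None:
        return "missing"
    origin = spec.origin
    # Built-in modules and namespace packages have no file path to compare.
    if origin is None or not os.path.isabs(origin):
        return "no file origin"
    return "venv" if is_under(origin, venv_prefix) else "outside venv"


if __name__ == "__main__":
    # Illustrative defaults; pass your own module names as arguments.
    # Only meaningful when run with the venv's interpreter, where
    # sys.prefix is the venv directory.
    for name in sys.argv[1:] or ["colcon_core", "setuptools"]:
        print(name, "->", classify_module(name, sys.prefix))
```

Any module reported as "outside venv" while the venv's interpreter is running is being picked up from the system packages (for example via --system-site-packages or the sourced ROS environment), which is the kind of mixing described above.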

cottsay, Dec 15 '25 22:12