Refactor: Decouple vision components from ROS2 for broader reusability
To support https://github.com/RobotecAI/rai/issues/697
Observations
Currently, rai_open_set_vision couples core vision logic with ROS2 dependencies. For example, the Box class in boxer.py contains ROS2 message conversion (to_detection_msg()). As a result, domain objects (BoundingBox, detection logic) cannot be used outside ROS2 environments.
To benchmark models, build web APIs, or run in non-ROS2 contexts, developers must install ROS2-related packages. Unit tests may also require ROS2 infrastructure, even for pure algorithm tests.
Decoupling options
Option A: Hard Decoupling
We could split the code into two packages: rai-perception-core and rai-perception-ros2.
rai-perception-core (PyPI - Pure Python)
- Pure domain models: BoundingBox, Mask, etc. (no framework dependencies)
- Core vision algorithms: ObjectDetector, Segmenter
- Reusable in any Python context: CLI tools, batch processing

rai-perception-ros2 (ROS2 package)
- BaseVisionAgent: ROS2 agent infrastructure (already ROS2-coupled, stays here)
- ROS2DetectionAdapter: new code, converts domain objects ↔ ROS2 messages
- ROS2-integrated agents: GDBoxer, GroundingDinoAgent, GroundedSamAgent

This package depends on: rai-perception-core + ROS2
The goal is to offer the core vision tools to a broader user base and to separate domain logic more clearly from the integration layer. Dependency flow: rai-perception-ros2 → rai-perception-core
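
To make the split concrete, here is a minimal sketch of how a pure domain object and a separate adapter could look. It is only an illustration: the BoundingBox fields, the adapter's constructor argument, and the flat message fields (center_x, size_x, score, and so on) are assumptions, not the existing rai_open_set_vision API or a real ROS2 message layout. The message constructor is injected so the sketch runs without ROS2 installed.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class BoundingBox:
    """Pure domain object with no ROS2 imports (field names are hypothetical)."""

    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str
    confidence: float

    @property
    def center(self) -> Tuple[float, float]:
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)

    @property
    def size(self) -> Tuple[float, float]:
        return (self.x_max - self.x_min, self.y_max - self.y_min)


class ROS2DetectionAdapter:
    """Would live in rai-perception-ros2 and own all message conversion.

    The message constructor is injected so this sketch runs without ROS2;
    a real adapter would import its message type from the ROS2 side.
    """

    def __init__(self, msg_factory: Callable[[], Any]) -> None:
        self._msg_factory = msg_factory

    def to_detection_msg(self, box: BoundingBox) -> Any:
        msg = self._msg_factory()
        # Hypothetical flat fields; real ROS2 message layouts will differ.
        msg.center_x, msg.center_y = box.center
        msg.size_x, msg.size_y = box.size
        msg.label = box.label
        msg.score = box.confidence
        return msg


# SimpleNamespace stands in for a ROS2 message class, only to exercise the sketch.
box = BoundingBox(10.0, 20.0, 50.0, 100.0, label="cup", confidence=0.9)
msg = ROS2DetectionAdapter(SimpleNamespace).to_detection_msg(box)
print(msg.center_x, msg.center_y, msg.size_x, msg.size_y)
```

With this shape, BoundingBox can be unit-tested or reused in a CLI tool with no ROS2 installation, and only the adapter needs to change if the message definitions change.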
Option B: Soft Decoupling
Based on the suggestion from @maciejmajek (rai_perception + rai_perception_msgs with guarded imports), this approach refactors the directory structure to conceptually separate core and ROS2 components.
High level structure
rai-perception (rosdep + PyPI)
├── core/ (pure Python, internal module)
└── ros2/ (ROS2 integration, guarded imports)
rai-perception-msgs (rosdep only)
└── msg/ (interface definitions)
A single rai_perception package serves both Python users and ROS2 users. This approach optimizes for ROS2 users (the majority?) while keeping a path open for extracting a pure Python package later, if demand justifies the additional maintenance complexity. The tradeoff is that pure Python users can run "pip install rai-perception", but a full installation still requires ROS2 dependencies and workspace setup for build/test environments.
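
One possible reading of "guarded imports" is sketched below. It assumes the guard checks for rclpy (the ROS2 Python client library) and that ROS2-only entry points call a helper before importing anything ROS2-specific. The names ROS2_AVAILABLE, require_ros2, and make_ros2_agent are hypothetical and do not come from the existing code.

```python
import importlib.util

# rclpy is importable only inside a ROS2 environment, so its presence
# is used here as the signal that ROS2 features can be enabled.
ROS2_AVAILABLE = importlib.util.find_spec("rclpy") is not None


def require_ros2(feature: str) -> None:
    """Raise a clear error when a ROS2-only feature is used without ROS2."""
    if not ROS2_AVAILABLE:
        raise ImportError(
            f"{feature} requires ROS2 (rclpy not found); "
            "only the core module is usable in this environment."
        )


def make_ros2_agent():
    """Hypothetical ROS2 entry point: ROS2 imports happen lazily, after the guard."""
    require_ros2("make_ros2_agent")
    import rclpy  # imported only once the guard has passed

    return rclpy


print("ROS2 available:", ROS2_AVAILABLE)
```

Under this scheme, importing the core module never touches ROS2, and a pure Python user gets an explicit error message instead of a failed import deep inside the package.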
Other Options
Suggestions/comments are more than welcome.