
Refactor: Decouple vision components from ROS2 for broader reusability

Open Juliaj opened this issue 4 months ago • 0 comments

To support https://github.com/RobotecAI/rai/issues/697

Observations

Currently, the rai_open_set_vision codebase couples core vision logic with ROS2 dependencies. For example, the Box class in boxer.py contains ROS2 message conversion (to_detection_msg()). As a result, domain objects (BoundingBox, detection logic) cannot be used outside ROS2 environments.

To benchmark models, build web APIs, or run in other non-ROS2 contexts, developers must install ROS2-related packages. Unit testing may also require ROS2 infrastructure, even for pure algorithm tests.

Decoupling options

Option A: Hard Decoupling

We could split the code into two packages: rai-perception-core and rai-perception-ros2.

rai-perception-core (PyPI - Pure Python)

  • Pure domain models: BoundingBox, Mask, etc. (no framework dependencies)
  • Core vision algorithms: ObjectDetector, Segmenter
  • Reusable in any Python context: CLI tools, batch processing

rai-perception-ros2 (ROS2 package)

  • BaseVisionAgent - ROS2 agent infrastructure (already ROS2-coupled, stays here)
  • ROS2DetectionAdapter - new code, converts domain objects ↔ ROS2 messages
  • ROS2-integrated agents: GDBoxer, GroundingDinoAgent, GroundedSamAgent

This package depends on: rai-perception-core + ROS2

The hope is to make the core vision tools available to a broader user base and to separate domain logic from the integration layer more clearly. Dependency flow: rai-perception-ros2 → rai-perception-core
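To make the split concrete, here is a rough sketch of how a pure domain object and an adapter could divide the work. It is only an illustration: the field names on BoundingBox, the DetectionAdapter class (a stand-in for the proposed ROS2DetectionAdapter), and the injected msg_factory are all assumptions, not existing code. The factory is injected so the sketch runs without ROS2; a real adapter would build the actual ROS2 message instead.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class BoundingBox:
    """Pure domain object (would live in rai-perception-core). No ROS2 imports."""

    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str
    score: float

    @property
    def center(self) -> Tuple[float, float]:
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)

    @property
    def size(self) -> Tuple[float, float]:
        return (self.x_max - self.x_min, self.y_max - self.y_min)


class DetectionAdapter:
    """Hypothetical adapter (would live in rai-perception-ros2).

    msg_factory is an assumption made for this sketch: any callable that
    accepts the keyword arguments below and returns a message object.
    """

    def __init__(self, msg_factory: Callable[..., Any]) -> None:
        self._msg_factory = msg_factory

    def to_msg(self, box: BoundingBox) -> Any:
        cx, cy = box.center
        w, h = box.size
        return self._msg_factory(
            center_x=cx, center_y=cy, size_x=w, size_y=h,
            label=box.label, score=box.score,
        )


# Usage without ROS2: a plain dict stands in for the message type.
adapter = DetectionAdapter(msg_factory=dict)
msg = adapter.to_msg(BoundingBox(0.0, 0.0, 10.0, 20.0, "cup", 0.9))
```

With this shape, the conversion that currently sits in Box.to_detection_msg() would move into the adapter, and BoundingBox could be unit tested with no ROS2 installation.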

Option B: Soft Decoupling

Based on the suggestion from @maciejmajek (rai_perception + rai_perception_msgs with guarded imports), this approach refactors the directory structure to conceptually separate core and ROS2 components.

High level structure

rai-perception          (rosdep + PyPI)
  ├── core/            (pure Python, internal module)
  └── ros2/            (ROS2 integration, guarded imports)

rai-perception-msgs    (rosdep only)
  └── msg/             (interface definitions)

A single rai_perception package serves both pure Python users and ROS2 users. This approach optimizes for ROS2 users (presumably the majority) while keeping a path open for extracting a pure Python package later, if demand justifies the additional maintenance complexity. The tradeoff is that pure Python users can run "pip install rai-perception", but a full installation still requires ROS2 dependencies and workspace setup for build and test environments.
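For reference, "guarded imports" could look roughly like the sketch below. The ROS2_AVAILABLE flag and the require_ros2() helper are hypothetical names used for illustration; only the rclpy import refers to a real module, and the actual layout under ros2/ may differ.

```python
# Sketch of a guarded import, e.g. at the top of a module under ros2/.
try:
    import rclpy  # noqa: F401  (ROS2 Python client library)

    ROS2_AVAILABLE = True
except ImportError:
    ROS2_AVAILABLE = False


def require_ros2(feature: str) -> None:
    """Raise a clear error when a ROS2-only feature is used without ROS2.

    Hypothetical helper: ROS2-integrated agents would call this before
    touching any ROS2 API, so the core/ module stays importable everywhere.
    """
    if not ROS2_AVAILABLE:
        raise ImportError(
            f"{feature} requires ROS2 (rclpy could not be imported). "
            "The core vision components remain usable without it."
        )
```

This keeps import of the package itself cheap for pure Python users and defers the failure to the point where a ROS2-only feature is actually requested.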

Other Options

Suggestions/comments are more than welcome.

Juliaj · Oct 11 '25 20:10