How to support standard messages with zero-copy transports

Open alsora opened this issue 2 years ago • 11 comments

Most of the available RMW implementations now support some form of zero-copy transport between multiple processes on the same machine. However, this feature comes with an important limitation: the size of the messages must be known at compile time, so they can't contain variable-length sequences.
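To make the limitation concrete, here is a minimal C++ sketch. The two structs are hypothetical stand-ins, not real rosidl output, and the trait check (trivially copyable plus standard layout) is only an approximation of what a shared-memory transport needs, not the exact criterion used by any RMW implementation.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a message whose size is known at compile time.
struct FixedSizeHeader {
  std::int32_t sec;
  std::uint32_t nanosec;
  std::array<char, 64> frame_id;  // fixed capacity, stored inline
};

// Hypothetical stand-in for a message with an unbounded string.
struct UnboundedHeader {
  std::int32_t sec;
  std::uint32_t nanosec;
  std::string frame_id;  // may own heap memory outside the struct
};

// Approximation (an assumption of this sketch): a type can be placed in
// shared memory as-is only if its bytes are the whole value.
template <typename T>
constexpr bool fits_in_shared_memory_v =
  std::is_trivially_copyable<T>::value && std::is_standard_layout<T>::value;

static_assert(fits_in_shared_memory_v<FixedSizeHeader>, "inline storage only");
static_assert(!fits_in_shared_memory_v<UnboundedHeader>, "std::string is not trivially copyable");

// Byte-for-byte copy, which is only valid for types that pass the check above.
template <typename T>
T copy_via_bytes(const T & msg) {
  static_assert(fits_in_shared_memory_v<T>, "type has out-of-line or non-trivial state");
  T out;
  std::memcpy(&out, &msg, sizeof(T));
  return out;
}
```

Calling copy_via_bytes on UnboundedHeader fails to compile, which mirrors why a message containing a plain string cannot simply be handed to another process through a shared segment.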

This effectively makes it impossible to use the vast majority of ROS 2 standard messages: any message that contains a header violates the requirement, because the header's frame_id field is an unbounded string.

We are working with eProsima to find a solution to this problem. Our plan is to modify the rosidl code so that unbounded elements can be bounded automatically.

For example: assume that all strings are capped at X characters. This approach wouldn't affect the application code, which would still work with ordinary strings. The cap would only be applied under the hood, when a message containing a string needs to be published via shared memory.

We are discussing different implementations, with an upper bound that can be defined either at runtime or at build time, and with different possible fallback mechanisms (for example, what to do if a string exceeds the upper bound).
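As an illustration of the capping idea with a build-time bound, here is a minimal C++ sketch. The names CappedString, cap_string and OverflowPolicy are hypothetical and are not part of rosidl; the two fallback policies shown (truncate or throw) are just examples of what a fallback mechanism could look like.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical fallback policies for a string longer than the bound.
enum class OverflowPolicy { Truncate, Throw };

// Fixed-capacity storage for a string of at most N characters.
template <std::size_t N>
struct CappedString {
  std::array<char, N> data{};
  std::size_t size = 0;

  // Convert back to an ordinary string for application code.
  std::string str() const { return std::string(data.data(), size); }
};

// Copy `in` into fixed-size storage, applying the chosen policy on overflow.
template <std::size_t N>
CappedString<N> cap_string(const std::string & in, OverflowPolicy policy) {
  if (in.size() > N && policy == OverflowPolicy::Throw) {
    throw std::length_error("string exceeds upper bound");
  }
  CappedString<N> out;
  out.size = std::min<std::size_t>(in.size(), N);
  std::copy_n(in.begin(), out.size, out.data.begin());
  return out;
}
```

With a bound of 8, cap_string<8>("map", OverflowPolicy::Truncate) keeps the string intact, while a 9-character frame name is either cut to its first 8 characters or rejected with an exception, depending on the policy.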

We would like to gather feedback from the community about this, which in our opinion is a required feature in order to enable multi-process ROS 2 applications.

alsora avatar Jun 02 '23 16:06 alsora

Our plan is to modify the rosidl code so that unbounded elements can be bounded automatically.

I don't think we should do this, or, at least, we should not do this by default.

I definitely understand the desire to want these things to be bounded for the zero-copy case. But silently making them bounded in the background is going to lead to a lot of confused users later on. There are a number of ways I could see us going here:

  1. Automatically generate bounds for unbounded types. This would make the zero-copy case better, at the expense of unbounded uses not actually being unbounded.
  2. Have a special rosidl "mode" where we automatically generate bounds for unbounded types, and leave the default as-is. This keeps the existing behaviour intact, while allowing those who want true zero-copy to have it. The downside here is that true zero-copy users will have to compile from source always.
  3. Start changing most of the ROS 2 core messages to actually be bounded. For instance, we could change std_msgs/msg/Header to have a bounded string for the frame_id directly in the message. I think this is superior in most ways to case 1 above, as we are explicit about where the bounds are. The downside here is that we have to have a large transition period where we convert all of the messages over to be bounded, and this really only helps the messages in the ROS 2 core.

There may be other ways to go. But I think we should have a conversation about this, either in the ROS 2 weekly meeting or in the client libraries working group, as this may have wide-ranging implications.

clalancette avatar Jun 04 '23 18:06 clalancette

Have a special rosidl "mode" where we automatically generate bounds for unbounded types, and leave the default as-is. This keeps the existing behaviour intact, while allowing those who want true zero-copy to have it. The downside here is that true zero-copy users will have to compile from source always.

This is the type of approach that we are investigating. We are exploring both compile-time and runtime configurations, but always keeping the default behavior unchanged (for example, an environment variable that defines the upper bound and defaults to "no bound").
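A minimal C++17 sketch of the runtime variant could look like the following. The variable name ROS_EXAMPLE_STRING_UPPER_BOUND is made up for this example (no name has been decided in this thread), and treating an unset, empty, invalid or zero value as "no bound" is an assumption of the sketch.

```cpp
#include <cassert>
#include <cctype>
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <optional>

// Hypothetical variable name, used only for this example.
constexpr const char * kBoundEnvVar = "ROS_EXAMPLE_STRING_UPPER_BOUND";

// Parse a decimal upper bound. std::nullopt means "no bound", which is
// also the result for missing, empty, malformed, or zero input.
std::optional<std::size_t> parse_bound(const char * text) {
  if (text == nullptr || !std::isdigit(static_cast<unsigned char>(text[0]))) {
    return std::nullopt;
  }
  char * end = nullptr;
  errno = 0;
  const unsigned long long value = std::strtoull(text, &end, 10);
  if (errno == ERANGE || *end != '\0' || value == 0) {
    return std::nullopt;
  }
  return static_cast<std::size_t>(value);
}

// Read the bound from the environment; unset keeps the default behavior.
std::optional<std::size_t> string_bound_from_env() {
  return parse_bound(std::getenv(kBoundEnvVar));
}
```

Keeping the parsing separate from the std::getenv call makes the default ("no bound") easy to verify without touching the process environment.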

Start changing most of the ROS 2 core messages to actually be bounded.

This seems problematic to me. It would likely not be backward-compatible with the existing implementation.

I think we should have a conversation about this, either in the ROS 2 weekly meeting or in the client libraries working group

Definitely! We'll try to come up with a more concrete idea in the coming weeks and then present it to the community.

alsora avatar Jun 04 '23 20:06 alsora

We have been working on a PoC for this. See ros2/rosidl#758 and ros2/rosidl_typesupport_fastrtps#106 for the relevant changes.

I'm attaching a ZIP file prepared by my colleague @EduPonz with a README.md and a docker compose project that demonstrates the usage of the Zero-Copy compatible ROS 2 types with strings. Getting the demo up and running is just a matter of running a docker compose up.

The ZIP also contains two .repos files for VCS, one with the three repos that are needed for the feature, and another one for re-building the common interfaces and the demos so you can see the feature in action.

Please do let us know what you think!

ros2_fixed_strings.zip

MiguelCompany avatar Jul 18 '23 12:07 MiguelCompany

@allenh1 I've seen this blog post of yours, and I think it aligns with the work being done here.

Are you planning on open-sourcing the work described in that post? Would you be willing to contribute changes to the relevant ROS 2 repos (namely rosidl)?

MiguelCompany avatar Jul 28 '23 07:07 MiguelCompany

Are you planning on open-sourcing the work described in that post?

@MiguelCompany At the moment, there is no plan to open source the work described in that post.

Our implementation relies on a number of assumptions we can make about the memory layout of the typesupport representation for the C++ messages. Specifically, it relies on them being the same. So any middleware without a way to ensure the typesupport representation is the same memory layout as the C++ generated messages will encounter issues.

The other important detail is the StorageBase class mentioned in that blog. This mechanism allows us to wrap an array as a bounded vector, and use that for bounded sequences and bounded strings (which is slightly nicer than fixed strings and arrays). Since this makes the vectors contiguous, they can be allocated in the middleware, and used in the C++ messages directly.

I think a great first step would be to modify the rosidl_runtime_cpp bounded vector implementation to not use std::vector as a base, and do something like the StorageBase implementation described in my blog post, as well as creating an analog for strings.
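To give a rough idea of what an array-backed bounded vector looks like, here is a minimal C++ sketch. It is illustrative only: it is not the StorageBase class from the blog post and not the rosidl_runtime_cpp API, and the name InlineBoundedVector is hypothetical. It assumes the element type is default-constructible.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <type_traits>

// Bounded vector whose elements live inline in a fixed-size array,
// so the whole container is one contiguous block with no heap pointer.
template <typename T, std::size_t Capacity>
class InlineBoundedVector {
public:
  // Append an element, rejecting it once the upper bound is reached.
  void push_back(const T & value) {
    if (size_ == Capacity) {
      throw std::length_error("exceeds upper bound");
    }
    storage_[size_++] = value;
  }

  T & operator[](std::size_t i) {
    assert(i < size_);
    return storage_[i];
  }
  const T & operator[](std::size_t i) const {
    assert(i < size_);
    return storage_[i];
  }

  std::size_t size() const { return size_; }
  static constexpr std::size_t capacity() { return Capacity; }

  // Iteration covers only the elements in use, not the full capacity.
  const T * begin() const { return storage_.data(); }
  const T * end() const { return storage_.data() + size_; }

private:
  std::array<T, Capacity> storage_{};
  std::size_t size_ = 0;
};

// Unlike a std::vector-based container, this one can be copied byte-for-byte.
static_assert(std::is_trivially_copyable<InlineBoundedVector<int, 4>>::value,
  "inline storage keeps the container trivially copyable");
```

The trade-off is that every instance occupies the full capacity regardless of how many elements are in use.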

One thing to keep in mind here is that the C++ messages, as well as the generated typesupport representation of the message, both depend on the same vector implementation. This implementation is then a dependency of the middlewares, as well as the message packages, so rosidl_runtime_cpp might not be the best place for that implementation.

allenh1 avatar Jul 28 '23 08:07 allenh1

@clalancette Could you take a look at the PoC mentioned in https://github.com/ros2/rclcpp/issues/2201#issuecomment-1640125818?

It would be nice to check whether the approach seems correct before going further with the implementation.

MiguelCompany avatar Sep 07 '23 13:09 MiguelCompany

2. Have a special rosidl "mode" where we automatically generate bounds for unbounded types, and leave the default as-is. This keeps the existing behaviour intact, while allowing those who want true zero-copy to have it. The downside here is that true zero-copy users will have to compile from source always

Hi @clalancette! May I know how to change rosidl mode for zero-copy use cases? Thank you!

homalozoa avatar Nov 10 '23 12:11 homalozoa

May I know how to change rosidl mode for zero-copy use cases?

@homalozoa

I think what @clalancette said in https://github.com/ros2/rclcpp/issues/2201#issuecomment-1575672304 is only a proposal, which has not been implemented yet.

ZhenshengLee avatar Nov 14 '23 02:11 ZhenshengLee

Hi guys, I would like to try this with the rclc demo nodes if there were a C version of the typesupport. I've checked https://github.com/ros2/rosidl_typesupport_fastrtps/pull/106 and https://github.com/ros2/rosidl/pull/758, but I couldn't find out how to use it with rclc 😸

Zard-C avatar Nov 21 '23 08:11 Zard-C