UR5e dangerously moves to (0,0,0,0,0,0) when External Control URCap is started [ROS2 Galactic]
We've noticed a potentially hazardous edge case when our ros2_control nodes (hardware interface, controller manager, etc) are started after Polyscope is booted up, but before the arm is powered on and enabled if Polyscope is set to Local mode.
With that specific startup sequence, the hardware_interface for the arm initializing joint position states and commands as 0s, without any errors or warnings that a connection to the arm could not be established. The hardware completes its initialization without error, enters an active state. Then, when the External Control UR Cap is enabled, the arm tries to quickly move to the faulty initial position/command, in our case crashing into the ground.
Its strange to me that the initial read() call is successfully getting a response from the arm, but appears to be unsuccessfully getting real data. Potentially unrelated, but I also noticed that the "write" method doesn't care if the "initialized" flag has been set or not (but does look for the "runtime_state_" which appears to be read as part of the read method that needs to happen before initialized is set, so maybe that's a non-issue)...
For reference, we are running ros2 Galactic in an Ubuntu 20.04 docker container. We also do not experience this issue when the arm is started in Remote mode, or if it is fully started in Local mode before our ros2 nodes are started.
At this point we've updated our operating procedure to very explicitly avoid this situation and have written a script to loudly inform us when this strange "perfect 0.0 state" is detected, but it would be great to have a more direct safeguard in the hardware interface itself, e.g. making sure that the position feedback is correctly read.
Potentially tangentially related to https://github.com/UniversalRobots/Universal_Robots_ROS2_Driver/issues/515?
Thanks for reporting this
This should really not happen.Will try to reproduce this.
Could be a controller issue. fzi-forschungszentrum-informatik/cartesian_controllers had a similar case reported earlier: https://github.com/fzi-forschungszentrum-informatik/cartesian_controllers/pull/78.
Edit: and could also not be, seeing the analysis @shonigmann included.
A very interesting question related would be which controller you are using. I am assuming the scaled_joint_trajectory_controller?
I have not been able to reproduce this. The setup / sequence I used
- UR16e with Polyscope 5.11
- Ubuntu 20.04 with ROS Galactic. Driver built from source from 8d0d022
- Robot booted without any further interaction, Robot in local (manual) mode
- Started the driver using
ros2 launch ur_bringup ur16e.launch.py robot_ip:=192.168.0.161 launch_rviz:=true - RViz shows the robot in all zeros pose
- Initialize robot on the TP. As soon as the robot reaches "Booting complete" RViz shows the correct joint configuration
- Start the robot using the TP (Releasing the breaks)
- Pressing play on the TP.
- Driver says
[ur_ros2_control_node-1] [INFO] [1667479673.944114292] [UR_Client_Library]: Robot requested program [ur_ros2_control_node-1] [INFO] [1667479673.944155412] [UR_Client_Library]: Sent program to robot [ur_ros2_control_node-1] [INFO] [1667479674.050829457] [UR_Client_Library]: Robot connected to reverse interface. Ready to receive control commands. - Normal and expected behavior.
@shonigmann Does the configuration shown in RViz also change to the correct one as soon as the robot is booted (before the breaks are released)?
thank you for the responses - i've been out on holiday, but will try and respond with additional details next week
I observed the same. I think it can be reproduced like this.
Start ros2 launch ur_robot_driver ur_control.launch.py ur_type:=ur3e robot_ip:=192.168.1.102 launch_rviz:=false and start the External Control to be connected to the UR. Then Move the UR (RViz or Script). Then move the UR with the Freedrive and the External Control stops but the ur_robot_driver is still active. Then start the External Control again and the Robot moves back to the previous position when the External Control was still active real fast.
@LucaBross This is exactly what should not be happening. I'll try again to reproduce this later this week.
Apologies for the delay - adding some info for reference:
Controller: joint_trajectory_controller, not scaled_joint_trajectory_controller
OS: Ubuntu 20.04.5 LTS (Docker Container)
Universal Robots Software: URSoftware 5.11.0.108249
Universal Robots ROS2 Driver: d8acc7a and built from source
Moveit Version: 2.3.4.1
ROS Control Dependency Versions:
ros-galactic-control-msgs/focal,now 3.0.0-2focal.20220729.135512 amd64 [installed]
ros-galactic-controller-interface/focal,now 1.6.0-1focal.20220803.125919 amd64 [installed]
ros-galactic-controller-manager-msgs/focal,now 1.6.0-1focal.20220803.124803 amd64 [installed]
ros-galactic-controller-manager/focal,now 1.6.0-1focal.20220803.131306 amd64 [installed]
ros-galactic-joint-trajectory-controller/focal,now 1.4.0-1focal.20220803.131802 amd64 [installed]
ros-galactic-hardware-interface/focal,now 1.6.0-1focal.20220803.125502 amd64 [installed]
@shonigmann Does the configuration shown in RViz also change to the correct one as soon as the robot is booted (before the breaks are released)?
From memory, no, RViz did not show the correct configuration - the joint states were still being published as 0's. But I will try to find time to reproduce to give a more certain response.
We'll try to find some time to setup a minimum reproduceable example, though we've paused our work with the UR5 for now and may not be able to dedicate time to this in the near future.
Any reason you are using the joint_trajectory_controller, not scaled_joint_trajectory_controller? I highly recommend using the scaled version. See the explanation here.
Still, this initialization problem should not happen, but I haven't been able to reproduce this, so far.
We are currently adding a failsafe mechanism in https://github.com/UniversalRobots/Universal_Robots_Client_Library/pull/184. From a ROS perspective this approach seems not to be satisfying, but in the end it is a safety mechanism which should never trigger if everything works correctly.
Thanks for the update @fmauch. I was chatting with @urrsk and he told me there maybe a corresponding fix to ros2_control as well. Could you please share that? Appreciate your help.
The corresponding work in ros2_control was started in https://github.com/ros-controls/ros2_control/pull/884
There are still a couple of pieces missing in order to finally implement that we can specify whether command interfaces should be available in INACTIVE state. With this, we can set the hardware component to INACTIVE when the reverse interface is not connected. This way, the controller will be automatically stopped when the reverse interface disconnects. That means that the driver's command buffer will be reset with the latest value from the state interfaces when the reverse interface re-connects.
We've introduced a limit check in the client library that will avoid sudden motions on a much lower level. Hence closing this.