DDS Router stops routing data when the network is bad (or disconnected)
Hello, I have recently started using DDS Router and have run into a problem. I configured DDS Router on both my server and my client, but when the network condition is very bad (or the connection drops), DDS Router sometimes stops routing valid data. If I then restart DDS Router on the client side, data is transmitted again. What could be causing this?
Hi @luzw5
We will need to know more details to help you. What do you mean by "stop routing valid data"? What are the YAML configurations you are using? Is this connection through WAN or in a LAN?
Best regards
Thank you for your reply. Here is more information.
Robot1
version: v3.0
builtin-topics:
  - name: robot_state
    type: FreeFleetData::RobotState
  - name: task_result
    type: TaskResult::ResultData
  - name: mode_request
    type: FreeFleetData::ModeRequest
  - name: path_request
    type: FreeFleetData::PathRequest
  - name: destination_request
    type: FreeFleetData::DestinationRequest
  - name: photo_task_request
    type: PhotoTask::PhotoTaskRequest
  - name: task_request
    type: TaskRequest::RequestData
participants:
  - name: SimpleParticipant
    kind: local
    domain: 44
  - name: echo_participantD1
    kind: echo
    data: true
    verbose: true
    discovery: true
  - name: WanParticipantD1
    kind: wan
    connection-addresses:
      - domain: server
        port: 9010
        transport: tcp
Robot2
version: v3.0
builtin-topics:
  - name: robot_state
    type: FreeFleetData::RobotState
  - name: task_result
    type: TaskResult::ResultData
  - name: mode_request
    type: FreeFleetData::ModeRequest
  - name: path_request
    type: FreeFleetData::PathRequest
  - name: destination_request
    type: FreeFleetData::DestinationRequest
  - name: photo_task_request
    type: PhotoTask::PhotoTaskRequest
  - name: task_request
    type: TaskRequest::RequestData
participants:
  - name: SimpleParticipant
    kind: local
    domain: 42
  - name: echo_participantD1
    kind: echo
    data: true
    verbose: true
    discovery: true
  - name: WanParticipantD1
    kind: wan
    connection-addresses:
      - domain: server
        port: 9017
        transport: tcp
Server
version: v3.0
builtin-topics:
  - name: robot_state
    type: FreeFleetData::RobotState
  - name: task_result
    type: TaskResult::ResultData
  - name: mode_request
    type: FreeFleetData::ModeRequest
  - name: path_request
    type: FreeFleetData::PathRequest
  - name: destination_request
    type: FreeFleetData::DestinationRequest
  - name: photo_task_request
    type: PhotoTask::PhotoTaskRequest
  - name: task_request
    type: TaskRequest::RequestData
participants:
  - name: ParticipantServer
    kind: local
    domain: 4
  - name: EchoParticipantServer
    kind: echo
    discovery: true
    data: true
    verbose: true
  - name: WanParticipantServerD1
    kind: wan
    listening-addresses:
      - ip: 10.164.24.239
        port: 9010
        transport: tcp
  - name: WanParticipantServerD2
    kind: wan
    listening-addresses:
      - ip: 10.164.24.239
        port: 9017
        transport: tcp
The specific symptom is that when the network is poor, the DDS data sent by the Server no longer reaches Robot1 or Robot2. However, when I restart the Docker container of Robot1 or Robot2 (that is, re-run the ddsrouter program), it works normally again. I'm not sure what is causing it. Could you suggest some debugging methods?
Thank you! A couple more questions and suggestions for you to try:
- Looking at your diagram, I understand that you have 3 different machines (2 robots and a server), with a Docker container running in each of them, and that you are using the host's network stack (--net=host) to communicate within your LAN (or WAN? That IP address looks public). Am I right?
- This looks like a transport issue to me. UDP transport usually outperforms TCP, and we always recommend using it whenever that's an option. If I'm not mistaken, it is an option in your scenario. Could you please try with UDP?
- If you wish to communicate both clients through the server, it would be enough to have a WAN repeater participant in the server instead of two. Please have a look at Repeater DDS Router for more information.
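As a rough illustration of the UDP suggestion, here is a minimal sketch that takes the WAN participants from the configurations posted above and changes only the `transport` value. It assumes the rest of each file (version, builtin-topics, local and echo participants) stays as it is. Depending on the DDS Router version, a UDP WAN participant on the robot side may also need its own `listening-addresses` entry so that the server can reach it; that is an assumption to check against the DDS Router documentation, not something confirmed in this thread.

```yaml
# Robot1 (sketch): only the WAN participant is shown
participants:
  - name: WanParticipantD1
    kind: wan
    connection-addresses:
      - domain: server
        port: 9010
        transport: udp   # was: tcp
---
# Server (sketch): matching listener for Robot1
participants:
  - name: WanParticipantServerD1
    kind: wan
    listening-addresses:
      - ip: 10.164.24.239
        port: 9010
        transport: udp   # was: tcp
```

The same change would apply to Robot2 and to the server listener on port 9017.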
I think I can try changing the QoS settings and switching to UDP to solve my current problem. In addition, if I don't want the two robots to communicate directly, is my current configuration valid?
With your configuration the two robots should not communicate directly, but through the server. What I meant is that there is no need to have two WAN participants in the server's router, each connected to a different robot. You can connect the two robots to a single WAN repeater participant in the server, and both will still be able to communicate (indirectly, through the server).
If you wish not to have any communication between the robots, then you would also need to connect both to the same WAN participant in the server's router, but this time configured with repeater: false.
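A minimal sketch of the single-participant layout described above, assuming the server keeps its existing topics, local participant and echo participant. The participant name `WanParticipantServer` and the choice of port 9010 for both robots are illustrative assumptions, not values taken from the original configuration.

```yaml
# Server (sketch): one WAN participant replaces WanParticipantServerD1 and D2
participants:
  - name: WanParticipantServer      # hypothetical name
    kind: wan
    repeater: true                  # set to false so the robots do not reach each other
    listening-addresses:
      - ip: 10.164.24.239
        port: 9010
        transport: tcp
---
# Robot2 (sketch): connect to the same server port as Robot1
participants:
  - name: WanParticipantD1
    kind: wan
    connection-addresses:
      - domain: server
        port: 9010                  # was: 9017
        transport: tcp
```

With `repeater: true` the server forwards data received from one robot to the other; with `repeater: false` each robot exchanges data only with the server.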