Replace `ConnectedPoint` with `Dialer` in `libp2p_swarm::DialError::WrongPeerId`
Description
Change the endpoint type to Dialer, extracting it from ConnectedPoint.
Motivation
Currently in case of DialError with WrongPeerId the ConnectedPoint is returned, which can contain Listener, which can't happen, because we are the dialing side. Replacing ConnectedPoint with Dialer can help to resolve this ambiguity and eliminate client code assertions.
ConnectedPoint endpoint type in WrongPeerId was introduced in this PR, which was merged here.
The original discussion that lead to this issue is in the substrate PR that changes the error printed when we connect to a bootnode that provides a different than expected peer id.
Are you planning to do it yourself in a pull request?
~~Maybe.~~ Yes.
That sounds reasonable to me. Thanks @dmitry-markin for the detailed report.
I am not going to work on this any time soon. I am happy to help you implement it though. Given your "maybe", are you still interested?
Yes, I think I can look into it in a week or so. The one thing I worry about is that these changes will definitely break the API, and I don't know how to handle it. I.e, I can update substrate, but we need to somehow notify other clients.
Given that this is not a subtle change, but a change at the type system, I think users will (a) notice early and (b) can consult the changelog to get more advice. In other words, I am not worried about the breaking API change @dmitry-markin.
It seems the cause for DialError::WrongPeerId to contain ConnectedPoint, and not Dialer or just Multiaddr is that PendingOutboundConnectionError is based on PendingConnectionError also used for PendingInboundConnectionError. So they both share ConnectedPoint internally, even though ConnectedPoint::Listener is not relevant for PendingOutboundConnectionError/DialError.
To fix the API issue we can drop the Listener branch when converting PendingOutboundConnectionError into DialError here.
~~I see two ways of doing this. First is to extract ConnectedDialer, PendingDialer, and Listener (listeners are the same in both enums) from ConnectedPoint & PendingPoint. Then replace ConnectedPoint with ConnectedDialer in DialError. This requires a lot of modifications throughout the codebase and significantly changes the API.~~ EDIT: I don't think this makes sense :-)
Another option is to keep ConnectedPoint and PendingPoint enums the same, and just modify DialError to include something instead ConnectedPoint. This can be either independent new ConnectedDialer struct, or even a plain Multiaddr.
I would go with the second option, but I don't understand whether we need to report the role_override when reporting the DialError (i.e., use the ConnectedDialer struct), or reporting a Multiaddr will be enough.
@mxinden what do you think?
I've implemented the simplest solution in https://github.com/libp2p/rust-libp2p/pull/2793.
It seems the cause for
DialError::WrongPeerIdto containConnectedPoint, and notDialeror justMultiaddris thatPendingOutboundConnectionErroris based onPendingConnectionErroralso used forPendingInboundConnectionError. So they both shareConnectedPointinternally, even thoughConnectedPoint::Listeneris not relevant forPendingOutboundConnectionError/DialError.
I would be in favor of doing as much as possible at compile time. Would splitting PendingConnectionError be an option?
I would be in favor of doing as much as possible at compile time. Would splitting PendingConnectionError be an option?
Yes, this seems reasonable, I'll check it.