Add a `nickname` field to `peerinfo` protocol
Problem to be solved
We have a hard time giving useful names to peers in a cluster. It would be great if we could give people an easier way to identify their nodes. We could have people label their Prometheus writes, but those labels aren't shared with peers, so that might help us more than it helps users, which might not be what we are after. Instead, what if we included an optional field in the peerinfo protocol (or another protocol) that lets peers share a short string with one another alongside their peerId/nodeId information? That way we could (hopefully safely) use it in logs and monitoring on the connected peer's side, improving context and understanding of issues by 'making them human'. Instead of
Peer Disconnected: [ just-engine, automatic-lily ]
Maybe it could be:
Peer Disconnected: [ "xenowits", "Corver" ]
Proposed solution
Add an optional --nickname string flag to charon run (and optionally charon dkg)
Share that with peers via peerinfo protocol
Add app_peerinfo_nickname constant metric with nickname label (including own)
(An even lazier MVP is adding the flag and just creating the Prometheus metric for it. That would mean we could show it in Obol's shared Grafana but not in the charon peer logs.)
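As a rough illustration of how a nickname learned over peerinfo could be used in logs while keeping the deterministic peer name as the fallback, here is a minimal sketch. `NicknameRegistry`, `Set`, and `Display` are hypothetical names invented for this example and are not part of charon; the sketch also assumes nicknames are keyed by the existing deterministic peer name.

```go
package main

import (
	"fmt"
	"sync"
)

// NicknameRegistry is a hypothetical, concurrency-safe store of nicknames
// advertised by peers, keyed by the existing deterministic peer name.
type NicknameRegistry struct {
	mu    sync.RWMutex
	nicks map[string]string
}

// NewNicknameRegistry returns an empty registry.
func NewNicknameRegistry() *NicknameRegistry {
	return &NicknameRegistry{nicks: make(map[string]string)}
}

// Set records the nickname a peer advertised. An empty nickname clears
// the entry, since the field is optional.
func (r *NicknameRegistry) Set(peerName, nickname string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if nickname == "" {
		delete(r.nicks, peerName)
		return
	}
	r.nicks[peerName] = nickname
}

// Display returns the nickname if one is known, otherwise the
// deterministic peer name.
func (r *NicknameRegistry) Display(peerName string) string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if nick, ok := r.nicks[peerName]; ok {
		return nick
	}
	return peerName
}

func main() {
	reg := NewNicknameRegistry()
	reg.Set("just-engine", "xenowits") // only one peer shared a nickname

	fmt.Printf("Peer Disconnected: [ %q, %q ]\n",
		reg.Display("just-engine"), reg.Display("automatic-lily"))
}
```

Keeping the nicknames in a separate lookup like this leaves the existing peer names untouched, so anything that relies on them continues to work when a peer does not set a nickname.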
In Scope
Add a field to the grafana dashboard for these nicknames. Update the dashboard in charon and cdvn, no need for cdvc.
Asserting that the maximum string length is enforced both at the CLI and on the wire, and that nicknames are safe from attacks such as terminal takeover (for example via control or escape characters).
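The check described above could look roughly like the following sketch, intended to run on both the CLI flag value and any nickname received from a peer. `ValidateNickname` and `maxNicknameLen` are hypothetical names, and the 32-byte limit is an assumption for illustration only; the final limit is not settled here. The sketch rejects anything `unicode.IsPrint` does not consider printable, which covers ESC, newlines, and other control characters used in terminal escape sequences.

```go
package main

import (
	"errors"
	"fmt"
	"unicode"
	"unicode/utf8"
)

// maxNicknameLen is an assumed limit in bytes, chosen for illustration.
const maxNicknameLen = 32

// ValidateNickname is a hypothetical helper that checks a nickname before
// it is logged, exported as a metric label, or sent to peers.
// An empty nickname is valid because the field is optional.
func ValidateNickname(nick string) error {
	if len(nick) > maxNicknameLen {
		return fmt.Errorf("nickname too long: %d bytes, max %d", len(nick), maxNicknameLen)
	}
	if !utf8.ValidString(nick) {
		return errors.New("nickname is not valid UTF-8")
	}
	for _, r := range nick {
		// IsPrint is false for control characters such as ESC, CR and LF,
		// and for non-ASCII whitespace.
		if !unicode.IsPrint(r) {
			return fmt.Errorf("nickname contains non-printable character %U", r)
		}
	}
	return nil
}

func main() {
	for _, nick := range []string{"xenowits", "bad\x1b[2Jname", "two\nlines"} {
		if err := ValidateNickname(nick); err != nil {
			fmt.Printf("%q rejected: %v\n", nick, err)
			continue
		}
		fmt.Printf("%q accepted\n", nick)
	}
}
```

Running the same function on the wire side matters because a peer is not guaranteed to run a build that validates its own flag.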
Out of Scope
Consensus on these nicknames across parties
Why don't we add it to the lock file instead? Make it a proper part of the protocol. CLI flags are very loose; people will change them over time and joke around. If it is in the cluster lock, it could be a nice initial migration to add: updating your node moniker/nickname/name.
Also, wiring the result from peerinfo throughout the codebase is going to be hard to implement. It would require a big refactor, switching from immutable types to mutable types, which is a pain and something I want to avoid.
100 chars is also too long imo, it doesn't display nicely in grafana and people will write sentences. Suggest 32 chars.
Another term could be nickname; it feels a bit simpler and more approachable to non-English speakers.
I'd also suggest we not try to "replace" existing cluster_peer names, as that will just result in a mess in grafana as timeseries chop and change; deduplicating is impossible, I think. Suggest we only include the new nicks/monikers in the Peer panel, but keep on using deterministic peer names for everything else.
Note my suggested solution:
- Add an optional `--nickname string` flag
- Share that with peers via `peerinfo` protocol
- Add `app_peerinfo_nickname` constant metric with nickname label (including own)
Note that adding this to the lock file is still the best solution, as then it is also supported by DKG; see #1912. Requiring users to add consistent moniker fields to DKG introduces the risk that users need to modify the copied launchpad command, which will result in more DKG failures. Best to just copy and paste that command directly.