openraft icon indicating copy to clipboard operation
openraft copied to clipboard

Deprecate `RaftMetrics` watch channel for event logging through `RaftReporter` trait

Open semirix opened this issue 4 years ago • 5 comments

This is an update on an old suggestion outlined here.

The current problem is that RaftMetrics is being served through a tokio::sync::watch channel. This is not ideal because some essential raft events will be missed. Additionally, it is not possible to discern what event has actually occurred without comparing every single value in the RaftMetrics struct. It would be better to do event reporting through a trait object so users can decide how events are reported to their application and how they are stored. This trait object should be added to all the essential Raft structs that need to report events.


I have a proposal for how this could work.

Create a new RaftReporter trait, it would look something like the following:

pub trait RaftReporter {
    fn report(&self, event: RaftEvent);
}

RaftEvent would be a static enum of events that occur in the system. New variants can be added over time. It would look something like:

pub struct RaftEvent {
    NewLeader(u64),
    LatestLog(LogId),
    ...
}

A basic implementation of the trait might look like this:

pub struct Reporter {
    inbox: Arc<Mutex<tokio::sync::mpsc::Receiver<RaftEvent>>>,
    sender: tokio::sync::mpsc::Sender<RaftEvent>
}

impl RaftReporter for Reporter {
    fn report(&self, event: RaftEvent) {
        if let Err(error) = self.sender.send(event) {
            // Handle error
        }
    }
}

By default we could have a null reporter so if users don't care about event reporting it'll cost nothing. Alternatively, we could provide a basic default implementation so that users aren't deprived of a pre-existing feature, however I think most users will want to implement their own handler for events. The benefit of the null reporter is that reporting operations can be optimised away into a noop so if a user doesn't want reporting it won't affect performance.

The RaftMetrics struct can be implemented as a watch channel internally but it would make more sense to return it as a plain struct rather than through a Receiver<RaftMetrics> object in favour of this new system.

semirix avatar Apr 20 '22 00:04 semirix

👋 Thanks for opening this issue!

Get help or engage by:

  • /help : to print help messages.
  • /assignme : to assign this issue to you.

github-actions[bot] avatar Apr 20 '22 00:04 github-actions[bot]

A Reporter makes sense. Internally openraft is also driven by events. It's just a way to export them out.

In the new draft design, there are already some events(Command) defined: https://github.com/datafuselabs/openraft/blob/5fb62f435c8c4d7f73d29f980b80745121a4b00e/openraft/src/engine/engine.rs#L54-L74

pub struct Reporter {
    inbox: Arc<Mutex<tokio::sync::mpsc::Receiver<RaftEvent>>>,
    sender: tokio::sync::mpsc::Sender<RaftEvent>
}

impl RaftReporter for Reporter {
    fn report(&self, event: RaftEvent) {
        if let Err(error) = self.sender.send(event) {
            // Handle error
        }
    }
}

What I am not very sure about is, with this snippet, how do you expect a user to register it into openraft? Can you give a code snippet on this? :)

drmingdrmer avatar Apr 20 '22 05:04 drmingdrmer

Ah, I see I have missed an important part of the implementation. The idea is that this would be added to the top level Raft type like so:

pub(crate) type RaftNode = Raft<Request, Response, Router, Store, Reporter>;

I'm not too familiar with the internals of openraft but I imagine this object would be available to all pieces that need to report events.

semirix avatar Apr 20 '22 05:04 semirix

Ah, I see I have missed an important part of the implementation. The idea is that this would be added to the top level Raft type like so:

pub(crate) type RaftNode = Raft<Request, Response, Router, Store, Reporter>;

I'm not too familiar with the internals of openraft but I imagine this object would be available to all pieces that need to report events.

Looks feasible. :DD

Any suggestions? @lichuang @ariesdevil @schreter

drmingdrmer avatar Apr 20 '22 07:04 drmingdrmer

make sense:)

split metrics report into enum will make the granularity more precise.

lichuang avatar Apr 21 '22 02:04 lichuang