MLFlowClient.jl

Proposal to buffer service requests

Open ablaom opened this issue 1 year ago • 0 comments

The context of this proposal is this synchronisation issue.

The main problem with logging in parallelized operations is simply this: requests are posted directly to an MLflow service without full information about the state of the service at the time the request is ultimately acted on. I propose we resolve this as follows:

  • Instead of a client posting requests directly to an MLflow service, they are posted (put!) to a first-in-first-out queue (Julia Channel). Requesting calls will return immediately, unless the queue is full. In this way, the performance of the parallel workload is not impacted.

  • A single Julia Task dispatches requests (take!s) from the front of the queue. Whenever a request can alter the service state (e.g., creating an experiment), the dispatcher waits for confirmation that the state change is complete before dispatching the next request.
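
To make the two points above concrete, here is a minimal sketch of the idea. It is not the POC mentioned below, and it does not use any MLFlowClient.jl API: the `Request` type, its `mutates` flag, and the `dispatch` and `demo` functions are hypothetical names introduced only for illustration. Each request is assumed to carry a closure that performs the actual service call and returns once the service has confirmed it.

```julia
# Hypothetical request wrapper (not part of MLFlowClient.jl).
# `action` performs the service call; `mutates` marks requests that may
# change the service state (e.g., creating an experiment).
struct Request
    action::Function
    mutates::Bool
end

# Single dispatcher: takes requests from the channel in FIFO order.
# Iterating a Channel calls take! repeatedly until it is closed and empty.
function dispatch(queue::Channel{Request})
    pending = Task[]
    for req in queue
        if req.mutates
            # Block until the state change is confirmed before moving on.
            req.action()
        else
            # Requests that do not alter state need not hold up the queue.
            push!(pending, @async req.action())
        end
    end
    foreach(wait, pending)  # drain outstanding requests before exiting
    return nothing
end

# Toy usage with a vector standing in for the service (an assumption).
function demo()
    service_log = String[]
    queue = Channel{Request}(32)        # put! blocks only when 32 are queued
    dispatcher = @async dispatch(queue)
    put!(queue, Request(() -> (sleep(0.05); push!(service_log, "create_experiment")), true))
    put!(queue, Request(() -> push!(service_log, "log_metric"), false))
    close(queue)                        # already-buffered requests are still dispatched
    wait(dispatcher)
    return service_log
end
```

In this sketch the callers only ever touch `put!`, so the existing user-facing functions could, in principle, build a `Request` and enqueue it rather than contacting the service themselves. How return values and errors get back to the caller is left open here.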

I imagine that we can insert the queue (buffer) without breaking the user-facing interface of MLFlowClient.jl.

I have implemented a POC for this proposal and shared it with two maintainers, and can share with anyone else interested.

ablaom, Mar 06 '24 21:03