feat: Add background-task for kernel-pull-progress
refs backend.ai-issue#227
- Manager TODOs
- [x] Spawn a background task when creating a new session, returning the task ID to the client after max_wait seconds.
- [x] Let the background task relay the kernel_pull_progress events.
About Background-Task
The bgtask contains two asynchronous functions: one selects the kernel's status from the 'kernels' table, and the other updates the progress. ~~To ensure synchronization, I used an asyncio lock.~~ The update_progress function updates the progress status in reaction to KernelPullProgress events, and it updates the progress reporter accordingly. The manager then sends the progress reporter to the client, and the client displays the kernel-pull progress using the reporter.
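To illustrate the structure described above, here is a minimal, hypothetical sketch of a bgtask with two coroutines: one polling the kernel status (standing in for a SELECT on the 'kernels' table) and one updating a shared progress reporter from pull-progress events. All names (`ProgressReporter`, `poll_kernel_status`, `update_progress`, `pull_bgtask`) and the queue-based event delivery are illustrative assumptions, not the actual Backend.AI implementation.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ProgressReporter:
    """Illustrative stand-in for the progress reporter sent to the client."""
    current: int = 0
    total: int = 100


async def poll_kernel_status(status_queue: asyncio.Queue) -> str:
    """Stand-in for polling the kernel's status from the 'kernels' table."""
    while True:
        status = await status_queue.get()
        if status in ("RUNNING", "ERROR"):
            return status


async def update_progress(reporter: ProgressReporter,
                          event_queue: asyncio.Queue) -> None:
    """React to KernelPullProgress-style events and update the reporter."""
    while True:
        evt = await event_queue.get()
        if evt is None:  # sentinel: pulling finished
            return
        reporter.current = evt


async def pull_bgtask(status_queue: asyncio.Queue,
                      event_queue: asyncio.Queue) -> ProgressReporter:
    """Run both coroutines concurrently; no lock is needed because both
    run on the same event loop and the reporter is updated atomically."""
    reporter = ProgressReporter()
    updater = asyncio.create_task(update_progress(reporter, event_queue))
    await poll_kernel_status(status_queue)
    await event_queue.put(None)  # stop the updater once the kernel is up
    await updater
    return reporter
```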
Codecov Report
Merging #477 (60bfc69) into main (15acec7) will not change coverage. The diff coverage is n/a.
:exclamation: The current head 60bfc69 differs from the pull request's most recent head e2f01f4. Consider uploading reports for the commit e2f01f4 to get more accurate results.
```diff
@@           Coverage Diff           @@
##             main     #477   +/-   ##
=======================================
  Coverage   48.89%   48.89%
=======================================
  Files          54       54
  Lines        8908     8908
=======================================
  Hits         4356     4356
  Misses       4552     4552
```
I have finished fixing the code to reflect your feedback. The following images show the difference between when max_wait is 0 and when it is not.
- When the value of max_wait equals 0
- When max_wait is set to some seconds
Also, the code in client-py is fixed. Formally, I did not check 'background_task'; after the fix, I check 'background_task'. If it is true, the kernel-pull progress is displayed to the client while the image is being pulled (downloaded); if not, the client cannot see the progress and just waits.
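A minimal sketch of the client-py behavior described above, assuming the creation response carries a `background_task` field with the bgtask ID; the function name and return strings are purely illustrative, not the real client API:

```python
def handle_create_response(resp: dict) -> str:
    """Decide how the client reacts to a session-creation response.

    Illustrative sketch only: the real client would open an SSE stream
    for the bgtask ID and render the kernel-pull progress.
    """
    task_id = resp.get("background_task")
    if task_id:
        # A bgtask ID was returned: follow it and show pull progress.
        return f"show progress for bgtask {task_id}"
    # No bgtask ID: fall back to waiting silently until the session is ready.
    return "wait without progress"
```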
Just a note: Formally -> Formerly / Previously This made me confused for a moment. ;)
I have no idea how to send some data (e.g., kernel_id, root_ctx, ...) to this as a separate function. So, I changed it to a callable class and separated it from the '_create' function.
A callable class is generally considered an anti-pattern unless you are writing a kind of low-level library. Just pass any required information as additional arguments.
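As a sketch of the reviewer's suggestion: instead of a callable class that stores kernel_id and root_ctx as instance state, bind them as plain arguments, e.g. with `functools.partial`. The function names and parameters here are hypothetical, chosen only to mirror the data mentioned in the comment:

```python
import asyncio
from functools import partial


async def pull_progress_task(reporter, kernel_id: str, root_ctx: dict) -> str:
    """Hypothetical bgtask body that receives its context as arguments
    rather than reading it from a callable-class instance."""
    # ... would use kernel_id and root_ctx to track the pull progress ...
    return f"tracking {kernel_id}"


def make_task(kernel_id: str, root_ctx: dict):
    """Bind the extra data without a class; the bgtask framework only
    needs a callable that accepts the progress reporter."""
    return partial(pull_progress_task, kernel_id=kernel_id, root_ctx=root_ctx)
```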
To handle multi-node scenarios (cluster sessions) where multiple agents generate the same KernelPullProgressEvent with their own progresses, I think there should be logic to aggregate them at the session level and calculate the "total" progress of the multiple agents. Maybe this could be handled on the client side, by observing multiple bgtasks concurrently.
> To handle multi-node scenarios (cluster sessions) where multiple agents generate the same `KernelPullProgressEvent` with their own progresses, I think there should be a logic to aggregate them in the session-level and calculate the "total" progress of the multiple agents. Maybe this could be handled in the client-side, by observing multiple bgtasks concurrently.
To implement this in the client-side, we need to:
- manager: Return the multiple kernel IDs and bgtask IDs for individual kernels from `api.session._create()`
- client-py: Open multiple SSE connections to the bgtask IDs and show the per-kernel progress like the multi-layer progress view in Docker CLI's pull command implementation, or aggregate the per-kernel progresses into a single progress bar.
To implement this in the manager-side:
- manager: Return the single bgtask ID just like now.
- manager: In the bgtask, aggregate the multiple kernel pull progress events that belong to the current session (which may have multiple kernels) and return the aggregated result.
Personally, I think it's better to implement this on the client side: depending on the client-side implementation, we would then have runtime options for whether to expose more per-kernel details or show only the overall progress.
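The client-side aggregation proposed above can be sketched in a few lines. This is an illustrative assumption of how a client might combine per-kernel progress values observed from multiple bgtask SSE streams into one overall percentage; `aggregate_progress` and its input shape are hypothetical:

```python
def aggregate_progress(per_kernel: dict[str, tuple[int, int]]) -> float:
    """Combine per-kernel (current, total) pairs into an overall percentage.

    per_kernel maps a kernel ID to its (current, total) progress, e.g. as
    collected from one bgtask SSE stream per kernel in a cluster session.
    """
    done = sum(cur for cur, _ in per_kernel.values())
    total = sum(tot for _, tot in per_kernel.values())
    return 100.0 * done / total if total else 0.0
```

Weighting by each kernel's own total (rather than averaging percentages) keeps the overall bar accurate when image sizes differ between kernels.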
Sorry for the late reply. I have a question.
- command 1:
  ```console
  backend.ai run --cluster-mode multi-node --cluster-size 2 'print("hello")' python
  ```
- command 2:
  ```console
  backend.ai run -c 'import os; print("Hello world, {}".format(os.environ["CASENO"]))' -r cpu=1 -r mem=320m -e 'CASENO=$X' --env-range=X=case:1,2 python
  ```

Which one do "multi-node scenarios (cluster sessions)" mean, command 1 or command 2?
And, if we implement the multi-node scenario on the client side, is there nothing left to fix on the manager side?