Augusto Yao
When profiling with `torch.profiler.profile`, the generated JSON file has a section called `distributedInfo`, shown below:

```json
{
  "distributedInfo": {"backend": "nccl", "rank": 0, "world_size": 2}
}
```

But there's no...
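As a minimal sketch of working with that section, the snippet below parses a trace with the structure shown above and reads the `distributedInfo` fields. The inline JSON string is a stand-in for the trace file's contents; in practice you would `json.load` the file that `torch.profiler` exported.

```python
import json

# Stand-in for the exported trace file; the top-level "distributedInfo"
# section follows the structure shown in the description above.
trace_text = '{"distributedInfo": {"backend": "nccl", "rank": 0, "world_size": 2}}'

trace = json.loads(trace_text)
info = trace["distributedInfo"]

# Extract the distributed-training metadata recorded by the profiler.
print(info["backend"], info["rank"], info["world_size"])
```

With a real trace file, replace `json.loads(trace_text)` with `json.load(open(path))` on the profiler's output path.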
This fixes issue #953. It makes `libkineto::api().client()->stop()` and `stopTraceInternal` run in `profilerThread_`, so the training process is not blocked.
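The general pattern is to move the slow stop/post-processing work onto the profiler's own thread so the caller returns immediately. A hedged Python analogy (not kineto's actual C++ code; `slow_stop` is a hypothetical stand-in for `stopTraceInternal`):

```python
import threading
import time

def slow_stop():
    # Stand-in for trace post-processing that used to block the caller.
    time.sleep(0.2)
    print("trace stopped")

# Offload the blocking stop to a worker thread, analogous to running
# stopTraceInternal on profilerThread_ instead of the training thread.
worker = threading.Thread(target=slow_stop)
worker.start()

print("training continues")  # the caller is not blocked
worker.join()
```

The training thread only pays the cost of spawning (or signaling) the worker, not the cost of the stop itself.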
When using on-demand profiling via `dynolog` and `kineto`, we noticed that when a profiling request is configured with iterations, the last profiling iteration took noticeably more time than the other profiling iterations. The train...
Recently, we deployed dynolog in our GPU cluster to collect trace files via kineto on-demand profiling. It requires extra effort to collect trace files dumped to local storage via...