kineto icon indicating copy to clipboard operation
kineto copied to clipboard

Events incorrectly registered with category 'user_annotation' instead of 'Operator'

Open andreykramer opened this issue 3 years ago • 0 comments

I've been having a problem with the Step Time Breakdown plot, it was only showing a single step, 0, no matter how many steps I was actually profiling in the code. I could replicate running the resnet50_profiler_api example.

The profiler is expecting an event to be of type "operator" and have a name starting with "ProfilerStep#" (link) in order to parse it as a profiler step, but the events that are actually generated and exported for the steps in the JSON are of category "user_annotation":

    "ph": "X", "cat": "user_annotation", "name": "ProfilerStep#1", "pid": 726750, "tid": 726750,
    "ts": 1668698652781137, "dur": 120845,
    "args": {
      "Trace name": "PyTorch Profiler", "Trace iteration": 0,
      "External id": 4,
      "Profiler Event Index": 4977, "Call stack": "<built-in method _record_function_enter of PyCapsule object at 0x7f253af8d6c0>;torch/_ops.py(437): __call__;torch/autograd/profiler.py(487): __enter__;torch/profiler/profiler.py(500): step;resnet50_profiler_api.py(52): <module>;"
    }
  }

It seems to be also the cause of #610 , the events are expected to be of type "operator" (link) but again it registers as "user_annotation" in the JSON:

    "ph": "X", "cat": "user_annotation", "name": "enumerate(DataLoader)#_MultiProcessingDataLoaderIter.__next__", "pid": 385395, "tid": 385395,
    "ts": 1668685414940074, "dur": 73,
    "args": {
      "Trace name": "PyTorch Profiler", "Trace iteration": 0,
      "External id": 3400,
      "Profiler Event Index": 6837, "Call stack": "<built-in method _record_function_enter of PyCapsule object at 0x7f24209b96c0>;torch/_ops.py(437): __call__;torch/autograd/profiler.py(487): __enter__;torch/utils/data/dataloader.py(623): __next__;resnet50_profiler_api.py(52): <module>;"
    }
  }

When I replace all the occurences of "user_annotation" by "Operator" both issues seem to be solved in the tensorboard. Just FYI the other place I'm getting this type of category is for the Optimizer

    "ph": "X", "cat": "user_annotation", "name": "Optimizer.step#SGD.step", "pid": 726750, "tid": 726750,
    "ts": 1668698653009158, "dur": 6352,
    "args": {
      "Trace name": "PyTorch Profiler", "Trace iteration": 0,
      "External id": 4738,
      "Profiler Event Index": 8175, "Call stack": "<built-in method _record_function_enter of PyCapsule object at 0x7f253af8d6c0>;torch/_ops.py(437): __call__;torch/autograd/profiler.py(487): __enter__;torch/optim/optimizer.py(135): wrapper;resnet50_profiler_api.py(52): <module>;"
    }
  }

andreykramer avatar Nov 17 '22 15:11 andreykramer