Brian Coutinho issues

Results: 8 issues by Brian Coutinho

add simple test for nccl metadata

Add a few test cases to verify newly added NCCL metadata in profiler events. The test looks at the following blocks: record_param_comms ``` { "ph": "X", "cat": "cpu_op", "name": "record_param_comms",...

oncall: distributed

fb-exported

with-ssh

Support memory profiling feature from on-demand path

## Summary Currently the memory profiler feature in PyTorch is available via the [profiler API](https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html#using-profiler-to-analyze-memory-consumption) by passing `profile_memory=True` in the interface. It is desirable to also enable memory profiling using...

enhancement

Make ipcfabric have a single definition

Summary: There is an ODR (One Definition Rule) violation that was causing a crash on sigrid https://fb.workplace.com/groups/560979627394613/posts/2909061125919773/?comment_id=2909752389183980&reply_comment_id=2910102119149007 Sigrid includes both kineto and ipcfabric via dynolog, and on kineto the class...

CLA Signed

add compile time flags in prints

Related to an issue we saw on the FAIR research cluster, where some of the compile-time flags were not set as expected. This change prints them so we can easily debug...

CLA Signed

Prometheus Support for Metrics logging.

## TLDR Dynolog provides system telemetry at Meta as well as in open source environments. Metric logging using Prometheus - an industry standard framework for logging/exporting metrics. This can also...

[profiler] enable CUPTI range profiler in build

Fixes #125272 ## About (This is a re-spin of PR #106617) Kineto introduced a new profiler to read performance counters from NVIDIA GPUs (CUPTI Range Profiler API), added in PR [75616](https://github.com/pytorch/pytorch/pull/75616)....

Merged

Reverted

ciflow/binaries

ciflow/trunk

topic: not user facing

ciflow/binaries_conda

ciflow/binaries_wheel

ciflow/binaries_libtorch

[1/n] Add generalized event types and GPU Performance Monitoring counter event support

# Overview Originally, this was part 1 of splitting PR #1148. It supports a new kind of GPU counter event that will be published to the timeline as a time...

cla signed

[Feature] Enabling dynamic profiler plugins

# Enabling dynamic profiler plugins (Feature Proposal) Authors: @yisitu, @zli669, @briancoutinho (NVIDIA) ## TLDR * We would like to contribute a feature to Kineto to enable the plugging in of...