[refactor]: Refactor Frontend Trace OpenTelemetry Implementation

Open oandreeva-nv opened this issue 1 year ago • 0 comments

What does the PR do?

This PR refactors our frontend OpenTelemetry tracing implementation. The main goal is to make span tracking more reliable and more flexible by removing its dependence on specific span names.

Key Changes:

  • Introduced a Stack-Based Approach
    • The previous implementation relied on specific span names to preserve the trace hierarchy.
    • The current solution uses stacks and doesn't need to know what type of span was started.
    • For ensemble and BLS models, any number of sub-traces can be spawned. Since we preserve parent_id for every trace, I use it together with a new unordered map that keeps each trace's spans in a stack. Each trace and sub-trace has its own stack. When a stack is empty, I use parent_id to check the top of the parent's stack and get the parent span's information.
    • This refactor also spares the follow-up custom backend tracing implementation from handling complex logic for multiple nested custom spans: we simply look at the top of the stack.
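
The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code in trace.h: the names Span, SpanStacks, AddTrace, ParentSpan, StartSpan, and EndSpan are hypothetical, a plain struct stands in for the real OpenTelemetry span handle, and using 0 to mean "no parent" is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stack>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for an OpenTelemetry span handle.
struct Span {
  std::string name;
};

// Assumption for this sketch: id 0 means "this trace has no parent".
constexpr uint64_t kNoParent = 0;

class SpanStacks {
 public:
  // Register a trace (or sub-trace) and remember its parent_id.
  void AddTrace(uint64_t trace_id, uint64_t parent_id) {
    assert(trace_id != kNoParent);  // 0 is reserved in this sketch
    parents_[trace_id] = parent_id;
    stacks_.try_emplace(trace_id);  // every trace gets its own stack
  }

  // Span that a newly started span in `trace_id` should attach to:
  // the top of its own stack, or, when that stack is empty, the top
  // of the parent trace's stack.
  std::optional<Span> ParentSpan(uint64_t trace_id) const {
    auto own = stacks_.find(trace_id);
    if (own != stacks_.end() && !own->second.empty()) {
      return own->second.top();
    }
    auto parent = parents_.find(trace_id);
    if (parent == parents_.end() || parent->second == kNoParent) {
      return std::nullopt;
    }
    auto parent_stack = stacks_.find(parent->second);
    if (parent_stack == stacks_.end() || parent_stack->second.empty()) {
      return std::nullopt;
    }
    return parent_stack->second.top();
  }

  // Start a span without inspecting its name or type. Returns the
  // span it was parented to, if any.
  std::optional<Span> StartSpan(uint64_t trace_id, std::string name) {
    std::optional<Span> parent = ParentSpan(trace_id);
    stacks_[trace_id].push(Span{std::move(name)});
    return parent;
  }

  // End the most recently started span of this trace.
  // Returns false if there was nothing to end.
  bool EndSpan(uint64_t trace_id) {
    auto it = stacks_.find(trace_id);
    if (it == stacks_.end() || it->second.empty()) return false;
    it->second.pop();
    return true;
  }

 private:
  std::unordered_map<uint64_t, std::stack<Span>> stacks_;
  std::unordered_map<uint64_t, uint64_t> parents_;
};
```

In this sketch, starting a span never branches on the span's name: the parent is whatever sits on top of the relevant stack. A sub-trace registered with AddTrace(2, 1) attaches its first span to the span currently on top of trace 1's stack, which mirrors the ensemble and BLS case described above.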

Checklist

  • [X] PR title reflects the change and is of format <commit_type>: <Title>
  • [X] Changes are described in the pull request.
  • [ ] Related issues are referenced.
  • [X] Populated github labels field
  • [ ] Added test plan and verified test passes.
  • [ ] Verified that the PR passes existing CI.
  • [X] Verified copyright is correct on all changed files.
  • [ ] Added succinct git squash message before merging ref.
  • [ ] All template sections are filled out.
  • [ ] Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

Check the conventional commit type box here and add the label to the github PR.

  • [ ] build
  • [ ] ci
  • [ ] docs
  • [ ] feat
  • [ ] fix
  • [ ] perf
  • [X] refactor
  • [ ] revert
  • [ ] style
  • [ ] test

Related PRs:

N/A

Where should the reviewer start?

trace.h

Test plan:

  • CI Pipeline ID: 16139557

Caveats:

Background

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

N/A

oandreeva-nv, Jun 27 '24 22:06