cosmo icon indicating copy to clipboard operation
cosmo copied to clipboard

fix(router): use proper OTEL span kind for `Authenticate` middleware step

Open Blacksmoke16 opened this issue 4 months ago • 5 comments

Summary by CodeRabbit

  • Chores
    • Updated tracing metadata to classify authentication-related spans as client-side, improving attribution and visibility in observability tools.
    • No changes to user-facing behavior, workflows, or APIs.
    • Error handling and pre-handler flow remain unchanged.
    • No configuration updates or actions required from users.

Checklist

  • [x] I have discussed my proposed changes in an issue and have received approval to proceed.
  • [x] I have followed the coding standards of the project.
  • [x] I have read the Contributors Guide.

Follow up to https://github.com/wundergraph/cosmo/pull/456#discussion_r2389253235. It was unexpected for the Authenticate middleware step to have a span kind of server due to this step not representing an incoming remote call. In fact since authentication results in an outgoing HTTP request, it should in fact be client. It cannot be internal because it's crossing a process boundary (the HTTP request).

Blacksmoke16 avatar Sep 30 '25 04:09 Blacksmoke16

Walkthrough

Changed the tracing span kind for the Authenticate operation from SpanKindServer to SpanKindClient in the GraphQL pre-handler and updated tests to expect the client span kind; no API signatures or control flow altered.

Changes

Cohort / File(s) Summary of Changes
Tracing span kind update
router/core/graphql_prehandler.go
Changed Authenticate tracing span kind from trace.SpanKindServer to trace.SpanKindClient; no other logic or control-flow changes.
Test expectation update
router-tests/telemetry/telemetry_test.go
Updated telemetry tests to assert that an Authenticate span, when present, has span kind trace.SpanKindClient (retains existing assertions about claim attributes).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Verify span parent/child relationships and that the new SpanKindClient is appropriate for outgoing HTTP/remote-call semantics.
  • Confirm tests assert the span kind consistently and that no other test expectations implicitly depended on the previous SpanKindServer.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: correcting the OpenTelemetry span kind for the Authenticate middleware from server to client.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • [ ] 📝 Generate docstrings

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1771fbc99f11613a729eb574f95666af68e8c50b and 43d4e14518555a389edb9ad668e4599169774562.

📒 Files selected for processing (2)
  • router-tests/telemetry/telemetry_test.go (1 hunks)
  • router/core/graphql_prehandler.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • router/core/graphql_prehandler.go
  • router-tests/telemetry/telemetry_test.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: integration_test (./events)
  • GitHub Check: integration_test (./telemetry)
  • GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
  • GitHub Check: build_image_fork
  • GitHub Check: build_test_fork
  • GitHub Check: build_image_fork (nonroot)
  • GitHub Check: build_test
  • GitHub Check: Analyze (go)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot] avatar Sep 30 '25 04:09 coderabbitai[bot]

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions[bot] avatar Oct 18 '25 05:10 github-actions[bot]

Not stale, just pending a review :slightly_smiling_face:

Blacksmoke16 avatar Oct 18 '25 14:10 Blacksmoke16

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions[bot] avatar Nov 02 '25 05:11 github-actions[bot]

@SkArchon bump

Blacksmoke16 avatar Nov 02 '25 13:11 Blacksmoke16

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions[bot] avatar Nov 17 '25 05:11 github-actions[bot]

@Blacksmoke16 Hi, sorry for the delay.

I think at minimum we will need some tests to prove that we don't make assumptions on the kind in the control plane.

Could you add some relevant tests, please?

The WunderGraph Team

CC: @SkArchon @miklosbarabas @StarpTech

Aenimus avatar Nov 17 '25 17:11 Aenimus

@Aenimus I guess one follow-up question: how would you actually write a test for that? If there was a prior assumption I'd have expected some existing test(s) to fail. I can of course write a test that asserts this span's kind is what we expect and that it doesn't affect root span detection or anything tho.

The UI just exposes whatever the kind is cosmetically, and ClickHouse views are based on attributes so no change in behavior there either.

Blacksmoke16 avatar Nov 17 '25 18:11 Blacksmoke16

Hi @Blacksmoke16, please look into https://github.com/wundergraph/cosmo/blob/d6a443e448470dac7391a4e68210d0fcdd95741d/router-tests/telemetry/telemetry_test.go#L9231. There, we test authentication and verify telemetry; however, we never checked for the Span kind. You can choose add the assertion there.

StarpTech avatar Nov 26 '25 12:11 StarpTech

Codecov Report

:x: Patch coverage is 0% with 1 line in your changes missing coverage. Please review. :warning: Please upload report for BASE (main@441f229). Learn more about missing BASE report.

Files with missing lines Patch % Lines
router/core/graphql_prehandler.go 0.00% 1 Missing :warning:

:x: Your patch check has failed because the patch coverage (0.00%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2247   +/-   ##
=======================================
  Coverage        ?   24.06%           
=======================================
  Files           ?      213           
  Lines           ?    23242           
  Branches        ?        0           
=======================================
  Hits            ?     5594           
  Misses          ?    16902           
  Partials        ?      746           

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

codecov[bot] avatar Dec 05 '25 00:12 codecov[bot]