opsctrl_cli icon indicating copy to clipboard operation
opsctrl_cli copied to clipboard

Deep diagnose user story

Open orchide opened this issue 6 months ago • 0 comments

User Story

As a DevOps engineer troubleshooting a production issue,
I want to diagnose not just a failing pod but its entire application stack,
So that I can identify root causes that originate in dependent services rather than just see symptoms.

Acceptance Criteria:

  1. Given a pod that is part of a Helm-deployed application stack
    When I run the diagnose command with the stack analysis flag
    Then the tool should identify and analyze all related components in the deployment

  2. Given a webapp pod failing due to a database issue
    When I use this feature
    Then I should see the complete failure chain (e.g., Redis OOM → Connection refused → 500 errors)

  3. Given the additional analysis complexity
    When I run the command
    Then it should complete within 30 seconds

  4. Given a pod that isn't part of a Helm release
    When I use this flag
    Then it should gracefully fall back to single-pod diagnosis

Alternative Flag Names

--indepth is good, but here are some alternatives to consider:

  • --stack - Clear that it analyzes the entire stack
  • --full-stack - Even more explicit
  • --deep - Shorter version of indepth
  • --trace - Implies tracing through dependencies
  • --all - Simple, implies analyzing all related components
  • --deps / --dependencies - Explicit about following dependencies
  • --cascade - Captures the cascading failure analysis
  • --complete - Suggests comprehensive analysis

My preference would be --stack because:

  • It's short and memorable
  • Immediately conveys we're looking at the full application stack
  • Familiar term for developers
  • Clear differentiation from single-pod diagnosis

Usage would be: opsctrl diagnose webapp-pod-xxx --stack

orchide avatar Jul 13 '25 22:07 orchide