Deep diagnose user story

Open orchide opened this issue 6 months ago • 0 comments

User Story

As a DevOps engineer troubleshooting a production issue,
I want to diagnose not just a failing pod but its entire application stack,
So that I can identify root causes that originate in dependent services rather than just see symptoms.

Acceptance Criteria:

Given a pod that is part of a Helm-deployed application stack
When I run the diagnose command with the stack analysis flag
Then the tool should identify and analyze all related components in the deployment
Given a webapp pod failing due to a database issue
When I use this feature
Then I should see the complete failure chain (e.g., Redis OOM → Connection refused → 500 errors)
Given the additional analysis complexity
When I run the command
Then it should complete within 30 seconds
Given a pod that isn't part of a Helm release
When I use this flag
Then it should gracefully fall back to single-pod diagnosis

Alternative Flag Names

--indepth is good, but here are some alternatives to consider:

--stack - Clear that it analyzes the entire stack
--full-stack - Even more explicit
--deep - Shorter version of indepth
--trace - Implies tracing through dependencies
--all - Simple, implies analyzing all related components
--deps / --dependencies - Explicit about following dependencies
--cascade - Captures the cascading failure analysis
--complete - Suggests comprehensive analysis

My preference would be --stack because:

It's short and memorable
Immediately conveys we're looking at the full application stack
Familiar term for developers
Clear differentiation from single-pod diagnosis

Usage would be: opsctrl diagnose webapp-pod-xxx --stack

Jul 13 '25 22:07 orchide