Improve agent debug workflow

Open delano opened this issue 5 months ago • 1 comments

We are happy with the way the help text is written. It does double duty for human users and agent users, explaining some of the details of the tryouts test framework to agents. But we would like to enhance the agent output by showing it a debug workflow, without making the help text long and overly detailed. For example:

Run try --agent for a terse report on the entire test suite, which includes a summary of errors and failures with file paths and line numbers.
Run try --agent --agent-focus single specific/path/2/file_with_failing_testcase_try.rb:L100-150 to get maximum details for specific testcases.
cont'd

e.g. Add a new command line option to simplify agent debug workflow.
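
To make the second step of that workflow concrete, here is a minimal Ruby sketch of how an agent-side script might turn one failure from the summary into a drill-down command. The `Failure` struct and `drilldown_command` helper are hypothetical names for illustration, the format of the `try --agent` summary is not assumed, and `--agent-focus single` is the option proposed in this issue rather than an existing flag.

```ruby
require 'shellwords'

# Hypothetical record of one failure, as an agent might extract it from the
# step-1 summary. The real summary format is not specified in this issue.
Failure = Struct.new(:path, :first_line, :last_line)

# Build the step-2 command for a single failure.
# NOTE: `--agent-focus single` is the proposed option, not a shipped one.
def drilldown_command(failure)
  target = "#{failure.path}:L#{failure.first_line}"
  if failure.last_line && failure.last_line != failure.first_line
    target += "-#{failure.last_line}"
  end
  ['try', '--agent', '--agent-focus', 'single', target].shelljoin
end

failure = Failure.new('specific/path/2/file_with_failing_testcase_try.rb', 100, 150)
puts drilldown_command(failure)
```

The point of the sketch is that step 1 only needs to emit a path and line numbers for step 2 to be fully mechanical.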

Current behaviour

  # Global health check (agents: ≤ 200 tokens)
  try --agent

  # Drill into a single failing file
  try --verbose --fails --stack path/to/file_try.rb

  # Narrow to a single test case (L = line number)
  try --verbose --fails --stack path/to/file_try.rb:L42

  # Watch a range of test cases
  try --verbose --fails --stack path/to/file_try.rb:L100-150
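
For reference, the `:L42` and `:L100-150` suffixes used above can be split from the path with a small amount of Ruby. This is only a sketch of the syntax as it appears in these examples; `FileSpec` and `parse_file_spec` are hypothetical names and not necessarily how tryouts parses its arguments.

```ruby
# Hypothetical value object: a file path plus an optional range of line numbers.
FileSpec = Struct.new(:path, :lines)

# Split "path/to/file_try.rb:L100-150" into a path and a line range.
# A spec without an ":L..." suffix yields a nil range (meaning the whole file).
def parse_file_spec(spec)
  match = spec.match(/\A(?<path>.+?):L(?<first>\d+)(?:-(?<last>\d+))?\z/)
  return FileSpec.new(spec, nil) unless match

  first = match[:first].to_i
  last  = match[:last] ? match[:last].to_i : first
  FileSpec.new(match[:path], first..last)
end

spec = parse_file_spec('path/to/file_try.rb:L100-150')
puts "#{spec.path} lines #{spec.lines}"
```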

View --help output

HELP = <<~HELP

      Framework Defaults:
        Tryouts:    Shared context (state persists across tests)
        RSpec:      Fresh context (each test isolated)
        Minitest:   Fresh context (each test isolated)

      Examples:
        try test_try.rb                             # Tryouts test runner with shared context
        try --rspec test_try.rb                     # RSpec with fresh context
        try --direct --shared-context test_try.rb   # Explicit shared context
        try --generate-rspec test_try.rb            # Output RSpec code only
        try --inspect test_try.rb                   # Inspect file structure and validation
        try --agent test_try.rb                     # Agent-optimized structured output
        try --agent --agent-limit 10000 tests/      # Agent mode with 10K token limit

      Agent Output Modes:
        --agent                                     # Structured, token-efficient output
        --agent-focus summary                       # Show counts and problem files only (default)
        --agent-focus first-failure                 # Show first failure per file
        --agent-focus critical                      # Show errors/exceptions only
        --agent-limit 1000                          # Limit output to 1000 tokens

      File Naming & Organization:
        Files must end with '_try.rb' or '.try.rb' (e.g., auth_service_try.rb, user_model.try.rb)
        Auto-discovery searches: ./try/, ./tryouts/, ./*_try.rb, ./*.try.rb patterns
        Organize by feature/module: try/models/, try/services/, try/api/

      Testcase Structure (3 required parts)
        ## This is the description
        echo 'This is ruby code under test'
        true
        #=> true  # this is the expected result

      File Structure (3 sections):
        # Setup section (optional) - code before first testcase runs once before all tests
        @shared_var = "available to all test cases"

        ## TEST: Feature description
        # Test case body with plain Ruby code
        result = some_operation()
        #=> expected_value

        # Teardown section (optional) - code after last testcase runs once after all tests

      Execution Context:
        Shared Context (default): Instance variables persist across test cases
          - Use for: Integration testing, stateful scenarios, realistic workflows
          - Caution: Test order matters, state accumulates

        Fresh Context (--rspec/--minitest): Each test gets isolated environment
          - Use for: Unit testing, independent test cases
          - Setup variables copied to each test, but changes don't persist

      Writing Quality Tryouts:
        - Use realistic, plain Ruby code (avoid mocks, test harnesses)
        - Test descriptions start with ##, be specific about what's being tested
        - One result per test case (last expression is the result)
        - Use appropriate expectation types for clarity (#==> for boolean, #=:> for types)
        - Keep tests focused and readable - they serve as documentation

      Great Expectations System:
        #=>   Value equality        #==> Must be true         #=/=> Must be false
        #=|>  True OR false         #=!>  Must raise error    #=:>  Type matching
        #=~>  Regex matching        #=%>  Time constraints    #=*>  Non-nil result
        #=1>  STDOUT content        #=2>  STDERR content      #=<>  Intentional failure

      Exception Testing:
        # Method 1: Rescue and test exception
        begin
          risky_operation
        rescue StandardError => e
          e.class
        end
        #=> StandardError

        # Method 2: Let it raise and test with #=!>
        risky_operation
        #=!> error.is_a?(StandardError)
    HELP
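
As an illustration of what the help text describes, a small tryouts file combining the setup section, shared context, and the first exception-testing method might look like the sketch below. The file name and the values are invented for this example; only the `##` description and `#=>` expectation syntax come from the help text above.

```ruby
# try/items_try.rb (hypothetical file name)

# Setup section: runs once before all testcases
@items = []

## Appending an item returns the updated list
@items << :first
@items
#=> [:first]

## State from the previous testcase is still visible (shared context)
@items << :second
@items.size
#=> 2

## Rescue an exception and test its class (Method 1 in the help text)
begin
  Integer('not a number')
rescue StandardError => e
  e.class
end
#=> ArgumentError

# Teardown section: runs once after all testcases
@items.clear
```

Because the second testcase depends on the first, this file only passes in the default shared context and in file order, which is the "test order matters" caution from the help text.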

Sep 01 '25 20:09 delano