Enhance error handling for OutputGuardrailTripwireTriggered exception…

Open chrissRo opened this issue 3 months ago • 2 comments

When CAI generates reports at the end of an assessment with CAI_STREAM=true, the report is not displayed cleanly. Instead, the terminal output is corrupted:

  • The entire report reprints multiple times
  • Each reprint adds one new line to the previous output
  • Results in rapidly growing output with duplicate content
  • Terminal fills with repeated copies of the same report content
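
To give a rough sense of the growth described in the list above, here is a small sketch. It assumes (this is an assumption, not something confirmed in the report) that the k-th reprint contains exactly k lines, so a report of n lines produces 1 + 2 + ... + n lines of terminal output. `total_lines_printed` is a hypothetical helper used only for illustration.

```python
def total_lines_printed(report_lines: int) -> int:
    """Total terminal lines if the k-th reprint shows the first k lines.

    Assumption (not from the report): every streamed line triggers one
    full reprint of everything received so far.
    """
    return sum(range(1, report_lines + 1))  # equals n * (n + 1) // 2


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, total_lines_printed(n))
```

Under that assumption, a 100-line report would already produce 5050 lines of output, which is enough to flood a terminal.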

My assumption is that this is caused by the output guardrails detecting dangerous content (such as exploit code or SQLi payloads) and raising an OutputGuardrailTripwireTriggered exception. This exception is not handled properly by the streaming system, so the live display context remains active and is never cleaned up.
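
The suspected failure mode can be sketched in isolation. The example below is not CAI code: `GuardrailTripped`, `FakeLiveDisplay`, `fake_stream`, `consume`, and `run` are all hypothetical stand-ins, and the sketch only assumes that the real display stays active until something explicitly stops it. It shows that an exception raised mid-stream leaves the display active unless cleanup runs in a `finally` block.

```python
import asyncio


class GuardrailTripped(Exception):
    """Hypothetical stand-in for OutputGuardrailTripwireTriggered."""


class FakeLiveDisplay:
    """Hypothetical stand-in for a live terminal display context."""

    def __init__(self) -> None:
        self.active = False

    def start(self) -> None:
        self.active = True

    def stop(self) -> None:
        self.active = False


async def fake_stream():
    """Yield two chunks, then fail the way a tripped guardrail might."""
    yield "chunk 1"
    yield "chunk 2"
    raise GuardrailTripped("dangerous content detected")


async def consume(display: FakeLiveDisplay, cleanup: bool) -> None:
    """Consume the stream; optionally guarantee cleanup with finally."""
    display.start()
    stream = fake_stream()
    try:
        async for _chunk in stream:
            pass  # a real consumer would render the chunk here
    finally:
        if cleanup:
            await stream.aclose()  # close the async generator
            display.stop()  # tear down the live display


def run(cleanup: bool) -> bool:
    """Return whether the display is still active after the failure."""
    display = FakeLiveDisplay()
    try:
        asyncio.run(consume(display, cleanup))
    except GuardrailTripped:
        pass  # an outer handler would print a message here
    return display.active


if __name__ == "__main__":
    print("without cleanup, display active:", run(cleanup=False))
    print("with cleanup, display active:", run(cleanup=True))
```

Placing the cleanup in `finally` rather than in a single `except` branch means it runs for any exception type, which may be worth considering if other exceptions can interrupt the stream in the same way.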

I added this piece of code to cli.py at line 1511 as a workaround / solution:

except OutputGuardrailTripwireTriggered as e:
    # Handle guardrail exception specifically - MUST come before broad Exception handler
    # Clean up streaming display before showing error
    try:
        from cai.util import cleanup_all_streaming_resources
        cleanup_all_streaming_resources()
    except Exception:
        pass

    # Clean up the async generator
    if stream_iterator is not None:
        try:
            await stream_iterator.aclose()
        except Exception:
            pass

    # Clean up the result object if it has cleanup methods
    if result is not None and hasattr(result, '_cleanup_tasks'):
        try:
            result._cleanup_tasks()
        except Exception:
            pass

    # Re-raise to be caught by outer handler which shows user-friendly message
    raise
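
The comments about handler order and re-raising in the workaround reflect how Python matches `except` clauses: top to bottom, first match wins, and a bare `raise` passes the same exception on to an enclosing handler. A minimal sketch, using a hypothetical `GuardrailTripped` class in place of the real exception:

```python
class GuardrailTripped(Exception):
    """Hypothetical stand-in for OutputGuardrailTripwireTriggered."""


def specific_first() -> str:
    """The specific handler is listed first, so it is the one that runs."""
    try:
        raise GuardrailTripped("blocked")
    except GuardrailTripped:
        return "specific handler"
    except Exception:
        return "broad handler"


def broad_first() -> str:
    """The broad handler is listed first and shadows the specific one."""
    try:
        raise GuardrailTripped("blocked")
    except Exception:
        return "broad handler"
    except GuardrailTripped:  # unreachable: Exception already matched
        return "specific handler"


def inner_then_outer() -> str:
    """A bare raise in the inner handler hands the same exception outward."""
    try:
        try:
            raise GuardrailTripped("blocked")
        except GuardrailTripped:
            # inner cleanup would go here
            raise
    except GuardrailTripped as exc:
        return f"outer handler saw: {exc}"


if __name__ == "__main__":
    print(specific_first())
    print(broad_first())
    print(inner_then_outer())
```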

This workaround stops the stream from breaking, but it leads to the exception message from cli.py (at around line 1598) being printed:

except OutputGuardrailTripwireTriggered as e:
    # Display a user-friendly warning instead of crashing (streaming mode)
    guardrail_name = e.guardrail_result.guardrail.get_name()
    reason = e.guardrail_result.output.output_info.get("reason", "Security policy violation")

    # Use red color for the warning message
    print(f"\n\033[91m🛡️  SECURITY GUARDRAIL TRIGGERED\033[0m")
    print(f"\033[91mGuardrail: {guardrail_name}\033[0m")
    print(f"\033[91mReason: {reason}\033[0m")
    print(f"\033[93mThe agent's output was blocked for security reasons.\033[0m")
    print(f"\033[96mYou can continue the conversation with a different request.\033[0m\n")

    # Continue the conversation loop instead of crashing
    continue

This indicates that the guardrail is indeed triggered by the stream, but does not block the stream.

chrissRo • Nov 11 '25 10:11

ping @aliasrobotics-support

vmayoral • Nov 14 '25 11:11

We'll look at it as soon as we can.

aliasrobotics-support • Nov 17 '25 08:11