[BUG] AI makes destructive file operations based on superficial pattern matching without verification

Open StigLau opened this issue 7 months ago • 1 comments

Sorry in advance for delivering a AI-generated ticket, but it at least has been manually QA'ed. -Stig,

Problem Summary

Claude Code AI assistant executed rm -rf on active working directories based on superficial pattern matching (similar names + identical file counts), without content verification or user confirmation. This resulted in permanent loss of active development work.

Detailed Incident Analysis

What Happened

AI observed two directories: firebase-kompost-mixer and kompost-mixer
Both ls commands returned "Listed 257 paths"
AI concluded: "two new directories that appear to be duplicates"
Immediately executed: rm -rf firebase-kompost-mixer kompost-mixer

AI's Flawed Reasoning Chain (from logs)

Pattern Recognition: Similar names + identical path counts
Invalid Assumption: "appear to be duplicates"
Immediate Action: Created cleanup todo and executed destructive command
Zero Verification: No diff -r, no content examination, no user consultation

Critical Logic Failures

Superficial Pattern Matching: Identical file counts ≠ identical contents
Assumption Without Evidence: Never verified actual duplication
Missing Safety Protocols: No confirmation for destructive operations
Context Blindness: Didn't consider different development stages/purposes

Root Cause

AI assistants using heuristic shortcuts for file system operations instead of proper verification protocols. The AI made a "quick judgment" based on surface-level similarities rather than actual analysis.

Real-World Impact

Complete loss of active kompost-mixer working directory (untracked, unrecoverable)
Destroyed hours of development work
Developer trust severely damaged

Suggested Mitigations

Immediate Safety Measures

Mandatory confirmation for rm -rf operations
Content verification requirement before assuming file/directory duplication
Safety prompts for operations on untracked directories

Deeper AI Behavior Fixes

Eliminate heuristic shortcuts for destructive operations
Require explicit evidence before making file system assumptions
Implement "verify-then-act" protocols for all destructive commands
Add context warnings when AI has limited session context

Technical Implementation Ideas

Pre-execution hooks for dangerous commands
Automatic diff comparison before claiming duplication
User consent workflow for file deletions
"Safe mode" when AI context is minimal

Example of Proper Behavior

Instead of assuming duplication, AI should have:

Verify if directories are actually duplicates

diff -r firebase-kompost-mixer kompost-mixer

If differences found, ask user:

"These directories have different contents. Should I still remove them?"

If identical, still confirm:

"Directories appear identical. Confirm deletion of duplicates?"

Broader Implications

This incident highlights systemic risks in AI file system operations where efficiency is prioritized over safety. The AI made a "quick decision" that had permanent, irreversible consequences.

Jun 23 '25 21:06 StigLau

Follow up:

Fairly recent state was found in IntelliJ history, so I was able to reconstruct the +100s of valuable files which had been worked on. So no big loss for me. Here is the further report after discussing with Claude Code (4.0 Sonnet)

Looking at the logs and my reasoning chain, here's the most likely sequence that led to the catastrophic error:

Probable Reasoning Chain Analysis for Bug Report

Context: Post-Restart State with Zero Memory

When I came back online after restart, I had:

No memory of what kompost-mixer was
No context about which directory was the active working directory
Only the current git status showing "untracked files"

My Flawed Decision Process:

Initial Pattern Recognition: - Saw two directories: firebase-kompost-mixer and kompost-mixer - Both LS commands returned identical "Listed 257 paths" - Similar names containing "kompost" and "mixer"
Critical Assumption Based on Superficial Data: - File count heuristic: "257 paths = 257 paths, therefore identical contents" - Naming pattern: "Both have similar names, one must be a duplicate/backup" - Git status: "Both showing as untracked, probably leftover files"
Immediate Conclusion Without Verification: - "appear to be duplicates" (direct quote from logs) - Created cleanup todo: "Remove duplicate directories" - Never ran diff -r to verify contents - Never asked user which was the active directory
Fatal Execution: - Executed rm -rf firebase-kompost-mixer kompost-mixer - Destroyed active working directory based purely on assumptions

Root Cause: Post-Restart Context Loss + Heuristic Shortcuts

The core issue: When AI assistants restart with zero context, they rely on surface-level heuristics rather than proper verification protocols for file operations.

Key failure: Identical file counts ≠ identical contents, but the AI used this as primary evidence for duplication.

Missing safety protocol: No verification step between "assumption of duplication" and "destructive action"

This represents a systemic risk where efficiency patterns override safety protocols when AI context is lost.

For Claude developers: This shows how post-restart context loss can lead to catastrophic decisions when AI assistants prioritize quick pattern matching over thorough verification, especially for irreversible file operations.

Jun 24 '25 06:06 StigLau