[Bug] Edit tool corrupts Unicode typographic characters in non-ASCII text
Bug Description
Subject: Critical Issue - Claude Code Corrupts French Typography (Apostrophes) Even When Copy-Pasting
Problem: Claude Code systematically corrupts French typographic apostrophes ’ (U+2019 RIGHT SINGLE QUOTATION MARK) by replacing them with straight apostrophes ' (U+0027 APOSTROPHE) when using the Edit tool. This happens even when the text is explicitly copy-pasted from the output of the Read tool.
Critical Detail: The correct characters are already in the user's files. It's Claude Code that removes them and replaces them with the wrong characters during the Edit operation.
Impact:
- Every single Edit operation on French code corrupts the typography
- Makes code reviews extremely painful - developers must manually check every French string for corruption
- Breaks professional French typography standards (U+2019 is the correct apostrophe in French)
- Slows down development significantly - every commit requires manual verification and correction
- Creates frustration and makes Claude Code unreliable for non-English codebases
Example: "Tourner l’image vers la gauche" (U+2019) turns into "Tourner l'image vers la gauche" (U+0027), which is not the text the file contained.
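Because the two strings look almost identical in most fonts, it can help to print the code points that differ. The following is a minimal sketch, not part of Claude Code; the helper name describe is made up for illustration.

```python
import unicodedata

original = "Tourner l\u2019image vers la gauche"   # typographic apostrophe (U+2019)
corrupted = "Tourner l'image vers la gauche"       # straight apostrophe (U+0027)


def describe(ch: str) -> str:
    """Return a label such as 'U+2019 RIGHT SINGLE QUOTATION MARK'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Both strings have the same length, so they can be compared position by position.
differences = [
    (index, describe(a), describe(b))
    for index, (a, b) in enumerate(zip(original, corrupted))
    if a != b
]

for index, before, after in differences:
    print(f"index {index}: {before} -> {after}")
```

Only one position differs, and it is the apostrophe in "l’image".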
Attempted Workaround - FAILED: We tried having Claude:
- Use Read tool to see the original text with correct apostrophes
- Copy-paste the exact text (including French strings) into Edit tool
- Only modify the code logic, not the French strings
Result: The workaround doesn't work. Even when Claude copy-pastes text from Read into Edit, the Edit tool call automatically converts ’ (U+2019) to ' (U+0027) before sending it to the system.
Root Cause: The Edit tool appears to normalize/sanitize text input, converting Unicode typographic characters to ASCII equivalents. This is destructive for languages that use proper typography.
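To make the suspected behavior concrete, here is a sketch of what such a normalization step could look like. This is purely an assumption for illustration: the mapping table and the function fold_to_ascii_quotes are hypothetical, and we have not seen how the Edit tool is actually implemented.

```python
# Hypothetical mapping of typographic quotes to ASCII equivalents.
# This is NOT taken from Claude Code; it only illustrates the kind of
# lossy conversion the report suspects.
ASCII_FOLD = str.maketrans({
    "\u2018": "'",   # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",   # RIGHT SINGLE QUOTATION MARK
    "\u201C": '"',   # LEFT DOUBLE QUOTATION MARK
    "\u201D": '"',   # RIGHT DOUBLE QUOTATION MARK
})


def fold_to_ascii_quotes(text: str) -> str:
    """Replace typographic quotes with ASCII ones (lossy, not reversible)."""
    return text.translate(ASCII_FOLD)


source = "Tourner l\u2019image vers la gauche"
folded = fold_to_ascii_quotes(source)

print(folded == source)      # the text no longer matches the file content
print("\u2019" in folded)    # the typographic apostrophe is gone
```

Once a step like this has run, the original character cannot be recovered from the folded text alone, which would explain why copy-pasting from Read does not help.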
Reproduction:
- Create a file with French text: altinaeTooltip="Tourner l’image vers la gauche" (note the ’ is U+2019)
- Use Read tool - Claude sees the correct apostrophe
- Copy-paste that exact line into Edit tool to change nearby code: [disabled]="!loaded" → [disabled]="!loaded()"
- Result: The apostrophe in l'image is silently converted from U+2019 to U+0027
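The steps above can be checked mechanically by counting both apostrophe code points before and after the edit. The sketch below is an assumption-laden stand-in: the file name sample.html and the helper apostrophe_counts are hypothetical, and the second write_text call only simulates the corrupted result described in this report. In a real reproduction, the Edit tool would run at that point instead.

```python
import tempfile
from pathlib import Path

# The line from the reproduction steps, with a real U+2019 in "l’image".
LINE = 'altinaeTooltip="Tourner l\u2019image vers la gauche" [disabled]="!loaded"\n'


def apostrophe_counts(path: Path) -> dict:
    """Count typographic and straight apostrophes in a UTF-8 file."""
    text = path.read_text(encoding="utf-8")
    return {"U+2019": text.count("\u2019"), "U+0027": text.count("\u0027")}


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.html"   # hypothetical file name
    sample.write_text(LINE, encoding="utf-8")
    before = apostrophe_counts(sample)

    # Stand-in for the Edit tool: apply the intended code change and
    # simulate the reported corruption of the apostrophe.
    sample.write_text(
        LINE.replace("!loaded", "!loaded()").replace("\u2019", "\u0027"),
        encoding="utf-8",
    )
    after = apostrophe_counts(sample)

print("before:", before)
print("after: ", after)
```

If the U+2019 count drops while the U+0027 count rises by the same amount, the edit changed the apostrophes rather than only the code.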
Suggested Solutions:
Option 1 (Preferred): Add a setting/flag to preserve Unicode characters in Edit operations:
- preserveUnicode: true - Never normalize typographic characters
- Should be opt-in or per-workspace configurable
Option 2: Detect quoted strings and preserve their exact Unicode characters:
- Don't normalize text inside "..." or '...' strings
- Only normalize code/whitespace
Option 3: Provide character-level diff information to user:
- Show when Edit would change Unicode characters
- Let user approve/reject character normalization
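As a rough idea of what Option 3 could report, the sketch below compares the text before and after an edit and lists only the changed spans that lost or altered a non-ASCII character. It uses Python's standard difflib module; the function name non_ascii_changes is hypothetical, and this is one possible approach rather than a proposal for how Claude Code should implement it.

```python
import difflib


def non_ascii_changes(before: str, after: str) -> list:
    """List (index, old, new) for changed spans whose old text had non-ASCII characters."""
    changes = []
    matcher = difflib.SequenceMatcher(None, before, after, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        old, new = before[i1:i2], after[j1:j2]
        # Intentional code edits such as "!loaded" -> "!loaded()" are pure ASCII
        # and are skipped; only spans that touched non-ASCII text are reported.
        if any(ord(ch) > 127 for ch in old):
            changes.append((i1, old, new))
    return changes


before = 'altinaeTooltip="Tourner l\u2019image vers la gauche" [disabled]="!loaded"'
after = 'altinaeTooltip="Tourner l\'image vers la gauche" [disabled]="!loaded()"'

changes = non_ascii_changes(before, after)
for index, old, new in changes:
    print(f"index {index}: U+{ord(old[0]):04X} {old!r} -> {new!r}")
```

In this example the intended change to [disabled] is ignored and only the apostrophe substitution is surfaced, which is the kind of change a user would want to approve or reject.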
Affected Languages: This likely affects all languages with proper typography:
- French: ’ (apostrophe), «» (guillemets)
Not tested:
- Spanish: ¿¡
- German: „“ (quotes)
- And many others
Current Status: We've had to document this as "Rule #1" in our codebase guidelines, but Claude Code cannot follow the rule due to this technical limitation. The workaround is impossible to implement.
Frequency: This happens on literally every Edit operation involving French text (dozens of times per day in active development).
Environment Info
- Platform: darwin
- Terminal: iTerm.app
- Version: 2.0.76
- Feedback ID: