[Bug] Edit tool corrupts Unicode typographic characters in non-ASCII text

Open Rafarel opened this issue 1 month ago • 0 comments

Bug Description

Subject: Critical Issue - Claude Code Corrupts French Typography (Apostrophes) Even When Copy-Pasting

Problem: Claude Code systematically corrupts French typographic apostrophes ’ (U+2019 RIGHT SINGLE QUOTATION MARK) by replacing them with straight apostrophes ' (U+0027 APOSTROPHE) when using the Edit tool. This happens even when explicitly copy-pasting text from the Read tool.

Critical Detail: The correct characters are already in the user's files. It's Claude Code that removes them and replaces them with the wrong characters during the Edit operation.

Impact:

  • Every Edit operation on code containing French strings corrupts the typography
  • Makes code reviews extremely painful - developers must manually check every French string for corruption
  • Breaks professional French typography standards (U+2019 is the correct apostrophe in French)
  • Slows down development significantly - every commit requires manual verification and correction
  • Creates frustration and makes Claude Code unreliable for non-English codebases
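Until this is fixed, the manual check could be partly automated. The following is a minimal sketch of a heuristic scanner, not an official tool: it flags a straight apostrophe (U+0027) sitting directly between two letters, which in French text is usually an elision that should use U+2019. The function name `find_suspect_apostrophes` is made up for illustration, and the heuristic will also flag English contractions such as "don't".

```python
import re

# Heuristic: a straight apostrophe (U+0027) directly between two letters,
# as in l'image, is probably an elision that should use U+2019.
# [^\W\d_] matches any Unicode letter.
SUSPECT = re.compile(r"(?<=[^\W\d_])'(?=[^\W\d_])")


def find_suspect_apostrophes(text: str) -> list[tuple[int, int]]:
    """Return (line_number, column) pairs: 1-based lines, 0-based columns."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in SUSPECT.finditer(line):
            hits.append((line_number, match.start()))
    return hits


sample = (
    'altinaeTooltip="Tourner l\'image vers la gauche"\n'      # corrupted (U+0027)
    '[disabled]="!loaded()"\n'                                 # plain code, no hit
    'altinaeTooltip="Tourner l\u2019image vers la droite"\n'  # correct (U+2019)
)
print(find_suspect_apostrophes(sample))
```

Only the first line of the sample is reported; the line with U+2019 and the plain code line are left alone.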

Example: "Tourner l’image vers la gauche" turns into "Tourner l'image vers la gauche", which is not desirable.
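Because the two strings look almost identical on screen, printing the code points is a reliable way to tell them apart. A small Python sketch (the helper name `describe_apostrophes` is ours, chosen for illustration):

```python
import unicodedata


def describe_apostrophes(text: str) -> list[str]:
    """Return 'U+XXXX NAME' for every apostrophe-like character in text."""
    apostrophe_like = {"\u0027", "\u2019"}
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch)}"
        for ch in text
        if ch in apostrophe_like
    ]


original = "Tourner l\u2019image vers la gauche"   # typographic apostrophe
corrupted = "Tourner l'image vers la gauche"       # straight apostrophe

print(describe_apostrophes(original))   # ['U+2019 RIGHT SINGLE QUOTATION MARK']
print(describe_apostrophes(corrupted))  # ['U+0027 APOSTROPHE']
```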

Attempted Workaround - FAILED: We tried having Claude:

  1. Use Read tool to see the original text with correct apostrophes
  2. Copy-paste the exact text (including French strings) into Edit tool
  3. Only modify the code logic, not the French strings

Result: The workaround doesn't work. Even when Claude copy-pastes text from Read into Edit, the Edit tool call automatically converts ’ (U+2019) to ' (U+0027) before sending it to the system.

Root Cause: The Edit tool appears to normalize/sanitize text input, converting Unicode typographic characters to ASCII equivalents. This is destructive for languages that use proper typography.
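We cannot see the Edit tool's internals, so the following is only an illustration of the kind of normalization that would produce the observed symptom. The mapping table and the name `fold_quotes` are hypothetical and are not taken from Claude Code.

```python
# Hypothetical ASCII-folding table: NOT taken from Claude Code's source.
# It only illustrates how a "sanitize quotes" step would lose information.
ASCII_FOLD = str.maketrans({
    "\u2018": "'",   # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",   # RIGHT SINGLE QUOTATION MARK (French apostrophe)
    "\u201C": '"',   # LEFT DOUBLE QUOTATION MARK
    "\u201D": '"',   # RIGHT DOUBLE QUOTATION MARK
})


def fold_quotes(text: str) -> str:
    """Replace typographic quotes with their ASCII look-alikes."""
    return text.translate(ASCII_FOLD)


line = 'altinaeTooltip="Tourner l\u2019image vers la gauche"'
folded = fold_quotes(line)

print(folded)
print("U+2019 still present:", "\u2019" in folded)
```

A step like this is lossy: once U+2019 has been folded to U+0027, nothing in the output says which apostrophes were typographic in the original.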

Reproduction:

  1. Create a file with French text: altinaeTooltip="Tourner l’image vers la gauche" (note the ’ is U+2019)
  2. Use Read tool - Claude sees the correct apostrophe
  3. Copy-paste that exact line into Edit tool to change nearby code: [disabled]="!loaded" → [disabled]="!loaded()"
  4. Result: The apostrophe in l'image is silently converted from U+2019 to U+0027
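To make steps 1 and 4 unambiguous, the fixture can be written and checked by script rather than by eye. This is a sketch under assumptions: the two-line file layout and the helper names `write_fixture` and `apostrophe_survived` are ours. The demo below uses a temporary directory; in a real reproduction you would write the fixture into the workspace, ask Claude Code to make the edit, then run the check.

```python
import tempfile
from pathlib import Path

# Fixture content; \u2019 is written as an escape so that copying this
# script cannot itself replace the typographic apostrophe.
FIXTURE = (
    'altinaeTooltip="Tourner l\u2019image vers la gauche"\n'
    '[disabled]="!loaded"\n'
)


def write_fixture(path: Path) -> None:
    """Step 1: create the file with a U+2019 apostrophe, encoded as UTF-8."""
    path.write_text(FIXTURE, encoding="utf-8")


def apostrophe_survived(path: Path) -> bool:
    """Step 4: True if l’image still uses U+2019 and no l'image (U+0027) appeared."""
    text = path.read_text(encoding="utf-8")
    return "l\u2019image" in text and "l'image" not in text


with tempfile.TemporaryDirectory() as tmp:
    fixture_path = Path(tmp) / "repro.component.html"  # hypothetical file name
    write_fixture(fixture_path)
    # Steps 2-3 (Read, then Edit via Claude Code) would happen here.
    print("apostrophe intact:", apostrophe_survived(fixture_path))
```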

Suggested Solutions:

Option 1 (Preferred): Add a setting/flag to preserve Unicode characters in Edit operations:

  • preserveUnicode: true - Never normalize typographic characters
  • Should be opt-in or per-workspace configurable

Option 2: Detect quoted strings and preserve their exact Unicode characters:

  • Don't normalize text inside "..." or '...' strings
  • Only normalize code/whitespace

Option 3: Provide character-level diff information to user:

  • Show when Edit would change Unicode characters
  • Let user approve/reject character normalization
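As a rough idea of what Option 3 could report, here is a sketch built on Python's standard `difflib`. The function name `unicode_changes` is hypothetical, and this is not how Claude Code computes its diffs. It lists every position where a non-ASCII character in the old text was swapped one-for-one for a different character in the new text. Replacements of unequal length are skipped, which is a known limitation of this sketch.

```python
import difflib
import unicodedata


def unicode_changes(before: str, after: str) -> list[tuple[str, str]]:
    """Return (old, new) pairs where a non-ASCII character was replaced."""
    changes = []
    matcher = difflib.SequenceMatcher(None, before, after, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only look at same-length replacements so characters pair up 1:1.
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            for old, new in zip(before[i1:i2], after[j1:j2]):
                if old != new and ord(old) > 127:
                    changes.append((old, new))
    return changes


before = 'altinaeTooltip="Tourner l\u2019image" [disabled]="!loaded"'
after = 'altinaeTooltip="Tourner l\'image" [disabled]="!loaded()"'

for old, new in unicode_changes(before, after):
    print(
        f"U+{ord(old):04X} {unicodedata.name(old)} -> "
        f"U+{ord(new):04X} {unicodedata.name(new)}"
    )
```

The intended change from `!loaded` to `!loaded()` is an insertion and is not reported; only the unrequested apostrophe substitution is surfaced for approval.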

Affected Languages: This likely affects all languages with proper typography:

  • French: ’ (apostrophe), «» (guillemets)

Not tested:

  • Spanish: ¿¡
  • German: „“ (quotes)
  • And many others

Current Status: We've had to document this as "Rule #1" in our codebase guidelines, but Claude Code cannot follow the rule due to this technical limitation. The workaround is impossible to implement.

Frequency: This happens on literally every Edit operation involving French text (dozens of times per day in active development).

Environment Info

  • Platform: darwin
  • Terminal: iTerm.app
  • Version: 2.0.76
  • Feedback ID:

Rafarel avatar Dec 31 '25 18:12 Rafarel