node-sybase

Support for Windows-1252/CP1252 database encoding - Java and Node.js encoding mismatch

Open renanwilliam opened this issue 4 months ago • 0 comments

🚨 Problem

When connecting to databases that use Windows-1252 encoding (common in legacy systems), special characters (accents, cedillas, etc.) are not handled correctly. The library currently forces UTF-8 encoding on the Java bridge, but doesn't provide proper configuration for databases using different character encodings.

🔍 Root Cause

The issue occurs because there's an encoding mismatch between:

  1. Java side: Currently hardcoded or defaults to UTF-8
  2. Node.js side: Uses the encoding parameter for both reading stdout and writing to stdin

When the database uses Windows-1252 encoding, the Java bridge should use Cp1252 to communicate with the database, while Node.js should use latin1 (ISO-8859-1, the closest compatible encoding) for the IPC channel.
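To illustrate the mismatch, here is a minimal sketch that uses only Node's built-in Buffer, with no library code involved. The sample bytes are an assumption standing in for what a bridge running with Cp1252 would emit for the word "café".

```javascript
// "café" as Windows-1252 bytes: 0xE9 is "é" in both Cp1252 and ISO-8859-1.
const cp1252Bytes = Buffer.from([0x63, 0x61, 0x66, 0xe9]);

// Decoded as UTF-8, the lone 0xE9 is an incomplete sequence,
// so Node substitutes the replacement character U+FFFD.
const asUtf8 = cp1252Bytes.toString('utf8');

// Decoded as latin1, each byte maps to the code point with the same value.
const asLatin1 = cp1252Bytes.toString('latin1');

console.log(asUtf8);   // "caf" followed by U+FFFD
console.log(asLatin1); // "café"

// Caveat: latin1 and Cp1252 differ in the 0x80-0x9F range.
// 0x80 is the euro sign in Cp1252, but latin1 decodes it to the control character U+0080.
const euroAsLatin1 = Buffer.from([0x80]).toString('latin1');
console.log(euroAsLatin1 === '\u20ac'); // false
```

Accented Latin letters and cedillas sit above 0x9F, so they survive the latin1 round trip. Characters such as the euro sign or curly quotes may still need extra handling.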

✅ Current Workaround

We successfully resolved this by patching the library to:

  1. Set Java encoding: "-Dfile.encoding=Cp1252"
  2. Configure Node.js encoding: latin1

The data flow becomes: Database (Windows-1252) ↔ Java Bridge (Cp1252) ↔ Node.js (latin1)
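The workaround can be sketched roughly as follows. The helper names buildJavaArgs and startBridge are hypothetical and are not part of node-sybase. The real spawn() call in src/SybaseDB.js passes additional connection arguments that are omitted here.

```javascript
const { spawn } = require('node:child_process');

// Hypothetical helper: JVM arguments that set the bridge's default charset.
function buildJavaArgs(pathToJavaBridge, javaEncoding) {
  return [`-Dfile.encoding=${javaEncoding}`, '-jar', pathToJavaBridge];
}

// Hypothetical helper: starts the bridge and applies the Node.js-side
// encoding to both directions of the pipe.
function startBridge(pathToJavaBridge, { javaEncoding = 'Cp1252', encoding = 'latin1' } = {}) {
  const child = spawn('java', buildJavaArgs(pathToJavaBridge, javaEncoding));
  child.stdout.setEncoding(encoding);        // decode bridge output as latin1
  child.stdin.setDefaultEncoding(encoding);  // encode strings written to the bridge as latin1
  return child;
}
```

Nothing is spawned until startBridge is called, so the snippet can be loaded without a Java installation.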

💡 Proposed Solution

The library could be enhanced to support separate encoding configurations:

new Sybase(host, port, dbname, username, password, logTiming, pathToJavaBridge, {
    encoding: 'latin1',           // Node.js encoding for IPC
    javaEncoding: 'Cp1252',      // Java encoding for database communication
    extraLogs: false
});

This would allow proper handling of different database encodings while maintaining backward compatibility.
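One possible way to keep backward compatibility is to fall back to defaults whenever an option is omitted. In the sketch below, resolveOptions and DEFAULT_OPTIONS are hypothetical names, and the default values assume the current behaviour is UTF-8 on both sides, which should be checked against the library source.

```javascript
// Assumed defaults: UTF-8 on both sides, matching the behaviour described above.
const DEFAULT_OPTIONS = { encoding: 'utf8', javaEncoding: 'UTF-8', extraLogs: false };

// Hypothetical helper: fills in any option the caller left out (or set to
// undefined/null), so existing callers keep their current behaviour.
function resolveOptions(options = {}) {
  return {
    encoding: options.encoding ?? DEFAULT_OPTIONS.encoding,
    javaEncoding: options.javaEncoding ?? DEFAULT_OPTIONS.javaEncoding,
    extraLogs: options.extraLogs ?? DEFAULT_OPTIONS.extraLogs,
  };
}

console.log(resolveOptions());
console.log(resolveOptions({ encoding: 'latin1', javaEncoding: 'Cp1252' }));
```

Callers that pass only encoding would keep UTF-8 on the Java side, so the new option stays opt-in.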

🏢 Example Use Case

Legacy systems using Windows-1252 encoding (common in enterprise environments) where proper character encoding is critical for data integrity.

📁 Files Affected

  • src/SybaseDB.js: the lines around the spawn() call and the stdin.write() call

renanwilliam · Sep 08 '25 19:09