fix: fix Chinese character escaping issue
Add the ensure_ascii=False parameter to all json.dumps calls to ensure non-ASCII characters (such as Chinese) are not escaped into \uXXXX format.
Files to modify:
- langfuse/_client/attributes.py: Core serialization functions
- langfuse/_utils/request.py: API request data serialization
- langfuse/_client/utils.py: Debug output formatting
- langfuse/_task_manager/score_ingestion_consumer.py: Score processing serialization
Before Fix & After Fix:
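As a minimal illustration of the difference, using only the standard library (the `payload` dictionary is a made-up example, not data from the SDK):

```python
import json

# Hypothetical payload containing Chinese text.
payload = {"input": "你好"}

# Before the fix: the default ensure_ascii=True escapes non-ASCII characters.
before = json.dumps(payload)

# After the fix: characters are written out unchanged.
after = json.dumps(payload, ensure_ascii=False)

print(before)  # {"input": "\u4f60\u597d"}
print(after)   # {"input": "你好"}
```

Both strings decode to the same data with `json.loads`; only the textual representation differs.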
[!IMPORTANT] Add ensure_ascii=False to json.dumps calls in multiple files to prevent non-ASCII character escaping.
- Behavior:
  - Add ensure_ascii=False to json.dumps in attributes.py for core serialization functions.
  - Add ensure_ascii=False to json.dumps in request.py for API request data serialization.
  - Add ensure_ascii=False to json.dumps in utils.py for debug output formatting.
  - Add ensure_ascii=False to json.dumps in score_ingestion_consumer.py for score processing serialization.
This description was created for 843ed967a7c63db4e302a16ca04060b96856d465.
Disclaimer: Experimental PR review
Greptile Summary
Updated On: 2025-09-09 06:09:40 UTC
This PR addresses a localization issue where non-ASCII characters (specifically Chinese characters) were being escaped to \uXXXX format in JSON serialization throughout the Langfuse Python SDK. The fix adds the ensure_ascii=False parameter to all json.dumps() calls across four key files to preserve Unicode characters in their original readable form.
The changes affect multiple serialization touchpoints:
- langfuse/_client/attributes.py: Core serialization for OpenTelemetry span attributes
- langfuse/_utils/request.py: API request data serialization sent to Langfuse servers
- langfuse/_client/utils.py: Debug output formatting for span data
- langfuse/_task_manager/score_ingestion_consumer.py: Score processing serialization and size calculations
All modified files use the existing EventSerializer class but now explicitly disable ASCII-only encoding. This ensures that when users trace LLM applications containing Chinese text (or other Unicode content), the data remains human-readable in the Langfuse UI, logs, and debug output. The change maintains JSON validity while improving the user experience for international developers.
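The combination of a custom encoder class with ensure_ascii=False can be sketched as follows. `SketchSerializer` is a hypothetical stand-in and not the SDK's actual EventSerializer; the event contents are invented for illustration. The sketch also shows why size calculations are affected: once characters are no longer escaped, the character count and the UTF-8 byte count of the output diverge.

```python
import json
from datetime import datetime, timezone


class SketchSerializer(json.JSONEncoder):
    """Hypothetical stand-in for the SDK's EventSerializer (not the real class)."""

    def default(self, o):
        # Serialize datetimes as ISO 8601 strings; defer everything else.
        if isinstance(o, datetime):
            return o.isoformat()
        return super().default(o)


# Invented example event with a Chinese name.
event = {
    "name": "翻译",
    "timestamp": datetime(2025, 9, 9, 6, 9, 40, tzinfo=timezone.utc),
}

escaped = json.dumps(event, cls=SketchSerializer)
readable = json.dumps(event, cls=SketchSerializer, ensure_ascii=False)

# Both strings are valid JSON and decode to the same data.
assert json.loads(escaped) == json.loads(readable)

# With ensure_ascii=False, len() counts characters, not bytes, so a size
# check should measure the UTF-8 encoded form instead.
escaped_bytes = len(escaped.encode("utf-8"))
readable_bytes = len(readable.encode("utf-8"))
print(len(readable), readable_bytes, escaped_bytes)
```

Whether the SDK measures sizes in characters or bytes is not stated here, so treat the byte-count comparison as an assumption worth checking against the actual code.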
Confidence score: 5/5
- This PR is safe to merge with minimal risk as it only affects JSON output formatting without changing data structures or API compatibility
- Score reflects that the changes are straightforward, well-targeted, and follow a consistent pattern across all affected files
- No files require special attention as all changes follow the same simple pattern of adding the ensure_ascii=False parameter
Hi @RiviaAzusa, thank you for the contribution! If possible, could you add at least one test for this behavior so that we don't run into this problem again? Thanks!
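A regression test for this might look roughly like the sketch below. The `serialize` helper is hypothetical and stands in for whichever SDK function wraps json.dumps; a real test would call that function instead.

```python
import json


def serialize(value):
    # Hypothetical helper standing in for the SDK's serialization call.
    return json.dumps(value, ensure_ascii=False)


def test_non_ascii_is_not_escaped():
    text = "你好"
    result = serialize({"input": text})
    # The original characters appear verbatim in the output.
    assert text in result
    # No \uXXXX escape sequences were produced.
    assert "\\u" not in result
    # The output is still valid JSON that round-trips to the same data.
    assert json.loads(result) == {"input": text}
```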
I would really like to see this PR merged.