Wrong control characters escaping
Action. Save a configuration with a sting containing '\\r' or '\\n'.
Expected result. The saved configuration contains strings with '\\r' or '\\n'.
Actual result. The saved configuration contains strings with '\r' or '\n' instead.
Details. Save a configuration.
>>> import pyhocon
>>> config = pyhocon.ConfigTree()
>>> config['path'] = r'c:\some\path'
>>> config['path']
'c:\\some\\path'
>>> with open('test.conf', 'w') as f:
... f.write(pyhocon.HOCONConverter.to_hocon(config, indent=4))
...
21
>>> config = pyhocon.ConfigFactory.parse_file('test.conf')
>>> config['path']
'c:\\some\\path'
The result is as expected.
Now save a configuration with a string containing '\\r'.
>>> config['path'] = r'c:\result\dir'
>>> config['path']
'c:\\result\\dir'
>>> with open('test.conf', 'w') as f:
... f.write(pyhocon.HOCONConverter.to_hocon(config, indent=4))
...
22
>>> config = pyhocon.ConfigFactory.parse_file('test.conf')
>>> config['path']
'c:\result\\dir'
Expected c:\\result\\dir.
Save a configuration with a string containing '\\n'.
>>> config['path'] = r'c:\next\dir'
>>> config['path']
'c:\\next\\dir'
>>> with open('test.conf', 'w') as f:
... f.write(pyhocon.HOCONConverter.to_hocon(config, indent=4))
...
20
>>> config = pyhocon.ConfigFactory.parse_file('test.conf')
>>> config['path']
'c:\next\\dir'
Expected c:\\next\\dir.
Python string literals interpret the \ as the start of an escape sequence. Characters \n and \r represent newlines and carriage returns, respectively. It will not render the value without escaping it (hence why when you created config['path'] you had to use the r raw string literal prefix). You can use / instead (i.e. c:/result/dir which will still work on Windows).
I cannot agree with you. JSON escape sequences are intended exactly to take care of such cases. So if you put any value into a config tree, you must get exactly this value back.
Deserialized | Python | Serialized (expected) | Serialized (actual)
---------------------------------------------------------------------------------
c:\next\dir | 'c:\\next\\dir' or | c:\\next\\dir | c:\next\dir
| r'c:\next\dir' | |
---------------------------------------------------------------------------------
c: | 'c:\next\\dir' | c:\next\\dir | c:\next\dir
next\dir | | |
You can easily see that the serialized version must be exactly the same as Python non-raw representation.
In the converter.py not all escape sequences handled. The full list of short escape sequences is '"\/\b\f\n\r\t'.
There are more observations.
Every character in the range U+0000 ... U+001F must be escaped like "\u0000" ... "\u001f" except the mentioned special cases that may be escaped in the short form like "\n".
The test
# -*- encoding: utf-8 -*-
from pyhocon import ConfigTree
from pyhocon.converter import HOCONConverter
def to_hocon(obj):
return HOCONConverter.to_hocon(ConfigTree(obj), compact=True)
class TestConverter(object):
def test_escape_control_characters(self):
assert r'a = "\u0000"' == to_hocon({'a': '\x00'})
fails
test_converter.py F
tests\test_converter.py:11 (TestConverter.test_escape_control_characters)
a = "\x00" != a = "\\u0000"
Expected :a = "\\u0000"
Actual :a = "\x00"