MemoryFS doesn't support non-ascii encodings

Open GoogleCodeExporter opened this issue 10 years ago • 4 comments

What steps will reproduce the problem?
1. Instantiate a fs.memoryfs.MemoryFS
2. Open a file from the MemoryFS instance with a write mode
3. Write a unicode string containing characters that are not representable in 
ascii.

What is the expected output? What do you see instead?

The MemoryFS should allow arbitrary unicode strings in the body of files. 
Instead, it throws a UnicodeEncodeError attempting to encode the written 
unicode string to ascii.
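
The failure described above is the usual result of encoding non-ASCII text with the ascii codec. A minimal sketch in Python 3 syntax, independent of pyfilesystem (the sample string is the one used in the comments below; the variable names are illustrative only):

```python
# Sketch only: shows the codec behaviour, not MemoryFS internals.
text = "私は学生です"  # six non-ASCII characters

try:
    text.encode("ascii")       # ascii cannot represent these characters
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)   # same error class as in the report

utf8_bytes = text.encode("utf-8")  # an explicit UTF-8 encode succeeds

print(error_message)
print(len(text), "characters ->", len(utf8_bytes), "bytes")
```

Each of these characters takes three bytes in UTF-8, so the six characters become 18 bytes.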

What version of the product are you using? On what operating system?

On Mac OS X Lion, CPython 2.7.3, with a copy of pyfilesystem from July 2nd 
(forked to a private repository but not meaningfully changed).

Please provide any additional information below.

I was able to resolve my issue by simply placing

from __future__ import unicode_literals

at the top of memoryfs.py. I'm not sure if that has implications for other 
clients. 
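
A possible alternative that avoids patching memoryfs.py is to encode the text explicitly before writing, so no implicit ascii conversion is needed. The sketch below is an assumption, not something verified against pyfilesystem: `write_text` is a hypothetical helper, and `io.BytesIO` stands in for a file object opened from a MemoryFS in a binary write mode.

```python
import io


def write_text(binary_file, text, encoding="utf-8"):
    """Hypothetical helper: encode text explicitly, then write the bytes."""
    data = text.encode(encoding)
    binary_file.write(data)
    return len(data)


# io.BytesIO is a stand-in here; with pyfilesystem this would be a file
# opened from the filesystem object in a binary write mode (assumption).
buf = io.BytesIO()
written = write_text(buf, "私は学生です")

# Reading back requires decoding with the same encoding.
roundtrip = buf.getvalue().decode("utf-8")
print(written, roundtrip)
```

The trade-off is that every reader has to decode with the same encoding, so the choice of UTF-8 needs to be agreed on by all callers.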

Original issue reported on code.google.com by [email protected] on 10 Jul 2013 at 4:17

GoogleCodeExporter avatar Apr 11 '15 10:04 GoogleCodeExporter

Seems to work for me. Maybe it was fixed since you forked.

>>> from fs.memoryfs import *
>>> m=MemoryFS()
>>> f=m.open('jp.txt', 'w')
>>> f.write(u'私は学生です')
6L
>>> f.close()
>>> m.tree()
╰── jp.txt
>>> m.getcontents('jp.txt')
'\xe7\xa7\x81\xe3\x81\xaf\xe5\xad\xa6\xe7\x94\x9f\xe3\x81\xa7\xe3\x81\x99'
>>> m.getcontents('jp.txt', 'rt')
u'\u79c1\u306f\u5b66\u751f\u3067\u3059'
>>> print _
私は学生です

Can you still reproduce the error?

Original comment by willmcgugan on 3 Sep 2013 at 1:17

  • Changed state: Accepted

I copied in the most recent version of pyfilesystem and, using your script, got 
the same error (I moved the example script directly into a copy of the svn 
read-only pyfilesystem):

# -*- coding: utf-8 -*-

from memoryfs import *
m=MemoryFS()
f=m.open('jp.txt', 'w')
x = u'私は学生です'
f.write(x)

f.close()
m.tree()

--> UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-5: 
ordinal not in range(128)

Are you using a different encoding at the top of the file? I'm also on Python 
2.7.3, so I'm not getting unicode literals everywhere for free.

Original comment by [email protected] on 6 Sep 2013 at 3:30

That works fine for me. Only difference is I'm on Linux.

Running the script from inside the package directory won't work as a way of 
using the svn version, though. memoryfs has a bunch of "from fs." imports that 
will import the installed code instead. Best to run "python setup.py develop".
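
One way to check which copy of a package an import will resolve to is to ask the import system for the module's location. This is a generic sketch: `module_origin` is a hypothetical helper, and the standard-library `json` package is used only so the example runs anywhere; for this issue the name to check would be `fs`.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# Substitute "fs" to see whether the installed package or the svn
# checkout would be picked up by the "from fs." imports.
print(module_origin("json"))
print(module_origin("no_such_package_for_this_example"))  # not installed
```

If the reported path points into site-packages rather than the checkout, the script is exercising the installed code.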

Original comment by willmcgugan on 6 Sep 2013 at 4:00

OK, I'll keep digging on this. Thanks for helping, Will.

Original comment by [email protected] on 6 Sep 2013 at 4:05
