yara-python

Non-Unicode filenames cause UnicodeEncodeError on Python 3

Open binrush opened this issue 8 years ago • 0 comments

On Linux, file names are arbitrary byte strings, not Unicode. YARA cannot scan a file whose name contains bytes that are not valid UTF-8:

import os
import pathlib

import yara

# The last byte (0xf9) is not valid UTF-8, so os.fsdecode maps it to the lone surrogate '\udcf9'.
p = pathlib.Path(os.fsdecode(b'/tmp/\x44\xf9'))
p.write_text('malware')

rules = yara.compile('main.yara')
# Raises UnicodeEncodeError: 'utf-8' codec can't encode character '\udcf9' in position 6: surrogates not allowed
rules.match(str(p))
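
The error message can be reproduced with the standard library alone, which may help narrow down where the failure happens. The sketch below assumes a UTF-8 filesystem encoding (what `os.fsdecode` typically uses on Linux) and spells out the `surrogateescape` handler explicitly; the variable names are illustrative and nothing here calls yara-python.

```python
# Same bytes as b'/tmp/\x44\xf9' in the report above.
raw = b"/tmp/D\xf9"

# Assumption: UTF-8 filesystem encoding. With the surrogateescape handler,
# the undecodable byte 0xf9 becomes the lone surrogate U+DCF9.
name = raw.decode("utf-8", "surrogateescape")

# A strict UTF-8 encode of that string fails, as in the traceback.
try:
    name.encode("utf-8")
    failed_at = None
except UnicodeEncodeError as exc:
    failed_at = exc.start  # index of the offending character

# Encoding with the same handler recovers the original bytes.
round_trip = name.encode("utf-8", "surrogateescape")
```

The traceback suggests the path string is being encoded as strict UTF-8 somewhere on the way into `match()`, which is why the surrogate is rejected.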

How should I decode a bytes filename so that I can pass it to match()?
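
One possible workaround, offered as a sketch rather than a confirmed fix, is to avoid passing the path to yara-python at all: open the file in Python, which accepts both `bytes` and surrogate-escaped `str` paths, and hand the contents to `match()` through its `data=` keyword. The helper name `match_by_content` is hypothetical, and `rules` is assumed to be the object returned by `yara.compile()`.

```python
def match_by_content(rules, path):
    """Scan a file by content instead of by path.

    `path` may be bytes or a surrogate-escaped str; open() handles both.
    `rules` is assumed to expose match(data=...), as compiled
    yara-python rules do.
    """
    with open(path, "rb") as fh:
        # Caveat: this reads the whole file into memory.
        return rules.match(data=fh.read())
```

With the example above this would be called as `match_by_content(rules, str(p))` or `match_by_content(rules, b'/tmp/\x44\xf9')`. The trade-off is memory use on large files, and rules no longer have a file path to work with.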

binrush avatar Nov 01 '17 06:11 binrush