
charmap decode error - Whatsapp

Open strod opened this issue 6 years ago • 4 comments

After running "python parse.py whatsapp" I get:

Traceback (most recent call last):
  File "parse.py", line 83, in <module>
    ArgParse()
  File "parse.py", line 41, in __init__
    getattr(self, args.command)()
  File "parse.py", line 79, in whatsapp
    main(args.own_name, args.file_path, args.max, args.infer_datetime)
  File "C:\Users\rodrigo.teixeira\Documents\GitHub\Chatistics-master\parsers\whatsapp.py", line 62, in main
    data = parse_messages(files, own_name, infer_datetime)
  File "C:\Users\rodrigo.teixeira\Documents\GitHub\Chatistics-master\parsers\whatsapp.py", line 85, in parse_messages
    regex_message = infer_datetime_regex(f_path)
  File "C:\Users\rodrigo.teixeira\Documents\GitHub\Chatistics-master\parsers\whatsapp.py", line 24, in infer_datetime_regex
    for c, line in enumerate(f):
  File "C:\ProgramData\Anaconda3\envs\chatistics\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 4095: character maps to <undefined>

strod avatar Jan 17 '20 13:01 strod

I get a similar error.

(chatistics) C:\Users\User\Desktop\Chatistics-master>python parse.py whatsapp --own-name Owner
2020-01-17 17:18:49,410 [INFO ] [parsers.what]: Parsing Whatsapp data...
2020-01-17 17:18:49,410 [INFO ] [parsers.what]: Reading raw_data/whatsapp\whatsapp.txt
Traceback (most recent call last):
  File "parse.py", line 83, in <module>
    ArgParse()
  File "parse.py", line 41, in __init__
    getattr(self, args.command)()
  File "parse.py", line 79, in whatsapp
    main(args.own_name, args.file_path, args.max, args.infer_datetime)
  File "C:\Users\User\Desktop\Chatistics-master\parsers\whatsapp.py", line 62, in main
    data = parse_messages(files, own_name, infer_datetime)
  File "C:\Users\User\Desktop\Chatistics-master\parsers\whatsapp.py", line 85, in parse_messages
    regex_message = infer_datetime_regex(f_path)
  File "C:\Users\User\Desktop\Chatistics-master\parsers\whatsapp.py", line 24, in infer_datetime_regex
    for c, line in enumerate(f):
  File "C:\Users\User\Anaconda3\envs\chatistics\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 2485: character maps to <undefined>

hodanli avatar Jan 17 '20 14:01 hodanli
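
Both tracebacks end in cp1252.py, which suggests the export was decoded with the Windows code page rather than as UTF-8. A minimal sketch of how that can produce this exact error, assuming the chat contains an emoji (the sample line below is made up and is not taken from either reporter's file; `try_decode` is a hypothetical helper):

```python
# Made-up chat line. The emoji U+1F60D encodes to F0 9F 98 8D in UTF-8,
# and byte 0x8d has no mapping in cp1252.
sample = "17/01/20, 13:01 - Owner: hello \U0001F60D"
raw = sample.encode("utf-8")


def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"error: {exc}"


cp1252_result = try_decode(raw, "cp1252")
utf8_result = try_decode(raw, "utf-8")

# ascii() keeps the output printable on consoles that cannot render emoji.
print(ascii(cp1252_result))
print(ascii(utf8_result))
```

The same bytes decode cleanly as UTF-8, so the file itself is probably fine and only the decoding step is at fault.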

Is it possible that you are using Python 2? Python 3 should open text files in UTF-8 encoding by default.

mar-muel avatar Jan 18 '20 17:01 mar-muel
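
Note that the tracebacks above already go through Python 3's cp1252 codec: when `open()` is called without an `encoding` argument, Python 3 takes the default from the locale, which on these Windows setups is cp1252 rather than UTF-8. A hedged sketch of reading the export with an explicit encoding instead (`read_chat_lines` is a hypothetical helper for illustration, not the code in parsers/whatsapp.py, and the sample line is made up):

```python
import locale
import os
import tempfile


def read_chat_lines(path, encoding="utf-8"):
    """Hypothetical helper: read a chat export with an explicit encoding."""
    with open(path, "r", encoding=encoding) as f:
        return [line.rstrip("\n") for line in f]


# The encoding open() may fall back to when none is given.
print(locale.getpreferredencoding(False))

# Write a made-up sample export as UTF-8, then read it back.
sample = "17/01/20, 13:01 - Owner: hello \U0001F610\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "whatsapp.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    lines = read_chat_lines(path)

# ascii() keeps the output printable on consoles that cannot render emoji.
print(ascii(lines))
```

Passing the encoding explicitly makes the result independent of the machine's locale settings.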

I was getting the same error. I re-saved the .txt file in Notepad as "Unicode" instead of UTF-8. The error message goes away, but I think nothing is getting parsed.

omennemo avatar Jan 22 '20 04:01 omennemo
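
One possible explanation for the empty result: Notepad's "Unicode" option writes UTF-16 little-endian with a byte order mark. If the parser still decodes the file as cp1252, plain text may no longer raise an error, but every character is followed by a NUL, so a line-matching regex would find nothing. A rough sketch under those assumptions (the pattern is a simplified stand-in, not the regex used by Chatistics, and the sample line is made up):

```python
import codecs
import re

# Made-up chat line, encoded the way a UTF-16 LE file with a BOM would be.
line = "17/01/20, 13:01 - Owner: hello"
utf16_bytes = codecs.BOM_UTF16_LE + line.encode("utf-16-le")

# Decoding those bytes as cp1252 succeeds but yields BOM debris and NULs.
as_cp1252 = utf16_bytes.decode("cp1252")

# Simplified stand-in for a date-prefixed message pattern.
date_pattern = re.compile(r"\d{2}/\d{2}/\d{2}, \d{2}:\d{2} - ")

wrong = date_pattern.match(as_cp1252)
right = date_pattern.match(utf16_bytes.decode("utf-16"))

print(ascii(as_cp1252[:12]))
print(wrong is None, right is not None)
```

If this is what happened, keeping the file as UTF-8 and fixing the decoding step is likely the safer route than re-saving the export.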

@strod @hodanli @omennemo can you check if #50 solved the issue for you?

MasterScrat avatar Jan 22 '20 21:01 MasterScrat