outlook_msg
Some email attachments cannot be extracted
I have almost no familiarity with the .msg file format, but I noticed that some attachments weren't extracted at all from a .msg file I was processing. The cause appears to be in the module message_file_storage.py:
def numbered_storage_names(self, prefix):
    for number in itertools.count():
        proposed_name = prefix + f'{number:08d}'
        if proposed_name in self.storage:
            yield proposed_name
        else:
            break
There, a sequence of proposed names is looked up in self.storage; each name is formed by concatenating the prefix with a zero-padded, 8-digit integer in base 10. It turns out, though, that some attachment ids are expressed in hexadecimal, so they cannot be reached with base-10 names: after index 9 the decimal name ends in 00000010 where the hex name ends in 0000000a, the lookup fails, and the loop stops. Converting the id to hex fixed the issue for me:
def numbered_storage_names(self, prefix):
    for number in itertools.count():
        # Render the index in hex (dropping the '0x' prefix) instead of decimal.
        number = hex(number)[2:]
        proposed_name = prefix + f'{number:>08}'
        if proposed_name in self.storage:
            yield proposed_name
        else:
            break
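
To illustrate the difference outside the library, here is a minimal standalone sketch. It is not the module's actual code: `numbered_names`, the `attach_#` prefix, and the plain `set` standing in for `self.storage` are all hypothetical, and the lowercase hex digits are an assumption that mirrors the fix above. It uses the `08x` format spec, which I believe produces the same string as `hex(number)[2:]` padded to 8 characters.

```python
import itertools


def numbered_names(storage, prefix, fmt):
    """Yield prefix + formatted index for 0, 1, 2, ... until a name is missing."""
    for number in itertools.count():
        proposed_name = prefix + format(number, fmt)
        if proposed_name not in storage:
            break
        yield proposed_name


# Hypothetical storage: 12 entries whose indices are written in lowercase hex.
PREFIX = 'attach_#'
storage = {PREFIX + format(i, '08x') for i in range(12)}

# '08d' mimics the original decimal lookup, '08x' the hex lookup.
decimal_names = list(numbered_names(storage, PREFIX, '08d'))
hex_names = list(numbered_names(storage, PREFIX, '08x'))

print(len(decimal_names))  # 10: the decimal walk stops at '...00000010'
print(len(hex_names))      # 12: the hex walk reaches every entry
print(hex_names[-1])       # attach_#0000000b
```

With fewer than ten attachments both variants behave identically, which would explain why the problem only shows up on messages with many attachments.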