Fix issues with JSON format and handling of non-existing file
Hi, I stumbled upon three issues. They are independent, but I think they are too small, so making three separate PRs seems unnecessary to me.
- JSON can not be printed
process_onenote_filedoes not print document in JSON ifjson_outputparameter is set. Instead, it always returns it. This return value is not even used in the main. This fork fixes use ofjson_outputparameter. IfTrue,process_onenote_filewill print document in JSON form instead of the default text form. The default text form will be used otherwise. No value will be returned.$ git checkout 74d7b05 # upstream main $ pyonenote --json --file <any_onenote_file> $ # nothing $ git checkout e363ca6 # forked main $ pyonenote --json --file <any_onenote_file> {"headers": {"guidFileType": ... - Fix typo
OneDocment->OneDocumentI think it was not intended at least based on the module's name. - Fix handling of non-existing file
$ git checkout 74d7b05 # upstream main $ pyonenote --file nonexisting_file Traceback (most recent call last): File "/home/user/pyOneNote/virtualenv/bin/pyonenote", line 33, in <module> sys.exit(load_entry_point('pyOneNote', 'console_scripts', 'pyonenote')()) File "/home/user/pyOneNote/pyOneNote/Main.py", line 84, in main sys.exit("File: %s doesn't exist", args.file) TypeError: exit expected at most 1 argument, got 2 $ git checkout e363ca6 # forked main $ pyonenote --file nonexisting_file File: 'nonexisting_file' doesn't exist
This JSON object is meant to be used in another application in which pyonenote is called as a library. However, I agree because it is a switch, user should see something when they use it. I will try to think about this a little more to see how we can handle both scenarios.
If I may, I would suggest to always return either data (dictionary object) or event better document (OneDocument object). Then any user can easily convert it either into JSON (document.get_json()) or into string (json.dumps(document.get_json())). I think it is better to let an user decide.
Also, I think it could be better to have a separate function which will be expected to be used as API. Some function which never print anything. For example (only first ~11 lines are modified):
def process_onenote_file(file):
if not check_valid(file):
log.error("please provide valid One file")
exit()
file.seek(0)
return OneDocument(file)
def process_and_print_onenote_file(file, output_dir, extension, json_output):
document = process_onenote_file(file)
data = document.get_json()
if not json_output:
print('Headers\n####################################################################')
indent = '\t'
for key, header in data['headers'].items():
print('{}{}: {}'.format(indent, key, header))
print('\n\nProperties\n####################################################################')
indent = '\t'
for propertySet in data['properties']:
print('{}{}:'.format(indent, propertySet['type']))
for property_name, property_val in propertySet['val'].items():
print('{}{}: {}'.format(indent+'\t', property_name, str(property_val)))
print("")
print('\n\nEmbedded Files\n####################################################################')
indent = '\t'
for name, file in data['files'].items():
print('{}{}:'.format(indent, name))
print('\t{}Extension: {}'.format(indent, file['extension']))
print('{}'.format( get_hex_format(file['content'][:256], 16, indent+'\t')))
if extension and not extension.startswith("."):
extension = "." + extension
counter = 0
for file_guid, file in document.get_files().items():
with open(
os.path.join(output_dir,
"file_{}{}{}".format(counter, file["extension"], extension)), "wb"
) as output_file:
output_file.write(file["content"])
counter += 1
else:
print(json.dumps(data))
I think it is a good suggestion, will refactor the code based on your suggestion