IFIscripts
Adding some basic automated validations
It's worth performing a zip validation on .docx files, since a .docx is a zip container. It's also worth decoding AV files with ffmpeg and catching any errors, something like:
ffmpeg -v error -i corrupto.mov -f null -
[h264 @ 0x7fc9ac0b5200] Invalid NAL unit size (-1 > 6032).
[h264 @ 0x7fc9ac0b5200] Error splitting the input into NAL units.
Error while decoding stream #0:0: Invalid data found when processing input
[h264 @ 0x7fc9ac03ac00] Invalid NAL unit size (-1 > 20477).
[h264 @ 0x7fc9ac03ac00] Error splitting the input into NAL units.
Error while decoding stream #0:0: Invalid data found when processing input
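A rough sketch of how the ffmpeg check above could be wrapped in Python. The function names are made up for illustration and are not part of IFIscripts; it assumes `ffmpeg` is on the PATH. ffmpeg may still exit with status 0 after decode errors, so the sketch treats any output on stderr as a failure instead of relying on the return code.

```python
import subprocess
import sys


def build_decode_command(path, ffmpeg="ffmpeg"):
    """Return the same command as above: log errors only, decode, discard output."""
    return [ffmpeg, "-v", "error", "-i", str(path), "-f", "null", "-"]


def extract_errors(stderr_text):
    """Split ffmpeg's stderr into non-empty, stripped lines."""
    return [line.strip() for line in stderr_text.splitlines() if line.strip()]


def decode_errors(path, ffmpeg="ffmpeg"):
    """Fully decode the file and return every error line ffmpeg reported."""
    result = subprocess.run(
        build_decode_command(path, ffmpeg),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
        errors="replace",  # error messages are not guaranteed to be valid UTF-8
    )
    return extract_errors(result.stderr)


if __name__ == "__main__":
    # Hypothetical usage: python decode_check.py file1.mov file2.mkv
    for source in sys.argv[1:]:
        errors = decode_errors(source)
        status = "FAIL" if errors else "ok"
        print(f"{status}: {source} ({len(errors)} error lines)")
```

An empty list means ffmpeg decoded the whole file without logging anything at error level.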
And of course MediaConch, but possibly JHOVE as well for docs? That TIFF validator thingy too?
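For the .docx zip validation mentioned at the top, Python's standard `zipfile` module may be enough. This is only a sketch: `zip_problems` is a hypothetical name, and it checks container integrity (CRCs and headers) only, not whether the contents are a valid Word document.

```python
import zipfile
import zlib


def zip_problems(path):
    """Return a list of problems found in a zip-based file (empty if none)."""
    if not zipfile.is_zipfile(path):
        return ["not a zip archive"]
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() reads every member and returns the first bad name, or None.
            bad_member = archive.testzip()
    except (zipfile.BadZipFile, zlib.error, OSError) as exc:
        return [f"unreadable archive: {exc}"]
    if bad_member is not None:
        return [f"CRC or header error in member: {bad_member}"]
    return []
```

The same check should apply to other zip-based formats, since it never looks inside the members.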
How did I leave out the exiftool check for chunks of binary zeroes? Perhaps there's some sort of Python bitwise operator that can scan for anomalies that exiftool can't.
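A plain byte scan might do here without any bitwise operators. The sketch below reads the file in chunks and reports long runs of zero bytes, including runs that straddle a chunk boundary. The 4096-byte threshold and the name `find_zero_runs` are assumptions for illustration; some formats legitimately contain padding, so a sensible threshold would need testing against real files.

```python
import re

ZERO_RUN = re.compile(b"\\x00+")


def find_zero_runs(path, min_run=4096, chunk_size=1 << 20):
    """Return (offset, length) for each run of zero bytes at least min_run long."""
    runs = []
    pending = None  # [start, end) of the most recent run, which may continue
    offset = 0      # absolute file offset of the current chunk

    def flush(run):
        if run is not None and run[1] - run[0] >= min_run:
            runs.append((run[0], run[1] - run[0]))

    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            for match in ZERO_RUN.finditer(chunk):
                start = offset + match.start()
                end = offset + match.end()
                if pending is not None and pending[1] == start:
                    # Run continues across the chunk boundary.
                    pending[1] = end
                else:
                    flush(pending)
                    pending = [start, end]
            offset += len(chunk)
    flush(pending)
    return runs
```

An empty list means no run reached the threshold. Anything reported would still need a human to decide whether it is damage or legitimate padding.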
Might as well roll the filename character/length check in here too, as a sort of pre-ingest validation prior to sipcreator. Maybe it's the first thing sipcreator does?
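A sketch of what the filename check could look like. Both the allowed character set (ASCII letters, digits, dot, underscore, hyphen) and the 255-byte limit are assumptions for illustration, as the thread does not say which rules apply. `filename_problems` is a hypothetical name, not an existing IFIscripts function.

```python
import re

# Assumed policy: ASCII letters, digits, dot, underscore and hyphen only.
ALLOWED_CHAR = re.compile(r"[A-Za-z0-9._-]")


def filename_problems(name, max_bytes=255):
    """Return a list of problems with a single filename (no directory part)."""
    if not name:
        return ["empty filename"]
    problems = []
    size = len(name.encode("utf-8"))
    if size > max_bytes:
        problems.append(f"too long: {size} bytes (limit {max_bytes})")
    bad_chars = sorted({ch for ch in name if not ALLOWED_CHAR.fullmatch(ch)})
    if bad_chars:
        listed = ", ".join(repr(ch) for ch in bad_chars)
        problems.append(f"disallowed characters: {listed}")
    return problems
```

Run over every name under a source directory (for example via `os.walk`), this could act as a gate before anything else happens to the files.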