IFIscripts

Adding some basic automated validations

Open kieranjol opened this issue 7 years ago • 3 comments

It's worth performing a zip validation on .docx files. It's also worth decoding AV files with ffmpeg and catching errors, something like:

 ffmpeg -v error -i corrupto.mov -f null -
[h264 @ 0x7fc9ac0b5200] Invalid NAL unit size (-1 > 6032).
[h264 @ 0x7fc9ac0b5200] Error splitting the input into NAL units.
Error while decoding stream #0:0: Invalid data found when processing input
[h264 @ 0x7fc9ac03ac00] Invalid NAL unit size (-1 > 20477).
[h264 @ 0x7fc9ac03ac00] Error splitting the input into NAL units.
Error while decoding stream #0:0: Invalid data found when processing input
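A minimal sketch of how that decode pass could be wrapped in Python so the error lines can be collected per file. The function names `build_decode_command` and `decode_errors` are hypothetical and not part of IFIscripts; the ffmpeg arguments are the ones shown above. It assumes `ffmpeg` is on the PATH.

```python
import subprocess


def build_decode_command(path):
    """Return the ffmpeg argv that decodes `path` and discards the output."""
    return ["ffmpeg", "-v", "error", "-i", path, "-f", "null", "-"]


def decode_errors(path):
    """Decode `path` in full and return the non-empty lines ffmpeg wrote to stderr.

    An empty list means nothing was logged at the `error` level.
    """
    result = subprocess.run(
        build_decode_command(path),
        stdin=subprocess.DEVNULL,   # keep ffmpeg from reading the terminal
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
        errors="replace",           # tolerate odd bytes in the log output
    )
    return [line for line in result.stderr.splitlines() if line.strip()]
```

Calling `decode_errors("corrupto.mov")` on a file like the one above would be expected to return lines such as the `Invalid NAL unit size` messages. The sketch inspects stderr rather than relying only on the exit status, since the status alone may not reflect every decode error.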

And of course MediaConch, but possibly JHOVE as well for docs? That TIFF validator thingy too?
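For the .docx zip validation mentioned at the top, a rough sketch using only Python's standard `zipfile` module might look like this. The function name `zip_problems` is hypothetical. It only checks the zip container (structure and per-member CRC), not whether the contents form a valid Word document.

```python
import zipfile
import zlib


def zip_problems(path):
    """Return a list of problems found in the zip container at `path`."""
    if not zipfile.is_zipfile(path):
        return ["not a zip file"]
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() reads every member and checks CRCs and headers;
            # it returns the name of the first bad member, or None.
            bad_member = archive.testzip()
    except (zipfile.BadZipFile, zlib.error, OSError) as error:
        return [f"unreadable zip: {error}"]
    if bad_member is not None:
        return [f"corrupt member: {bad_member}"]
    return []
```

An empty list means the container passed; anything else could be logged alongside the ffmpeg results.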

kieranjol Aug 16 '18 15:08

How did I leave out the exiftool check for chunks of binary zeroes? Perhaps there's some sort of Python bitwise operator that can scan for anomalies that exiftool can't.
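One possible way to scan for zero-filled chunks in pure Python, sketched here with a bytes regex rather than bitwise operators. The name `find_zero_runs` and the default `min_length` of 4096 bytes are assumptions for illustration; a sensible threshold would depend on the format, since some files legitimately contain long runs of zeroes (padding, digital silence).

```python
import re

ZERO_RUN = re.compile(rb"\x00+")


def find_zero_runs(path, min_length=4096, chunk_size=1024 * 1024):
    """Return (offset, length) for each run of 0x00 bytes >= min_length."""
    runs = []
    carry = None  # run that reached the end of the previous chunk
    offset = 0    # file offset of the current chunk
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # A carried run ends here if this chunk does not start with zero.
            if carry is not None and chunk[0] != 0:
                if carry[1] >= min_length:
                    runs.append(carry)
                carry = None
            for match in ZERO_RUN.finditer(chunk):
                start, end = match.span()
                if start == 0 and carry is not None:
                    # Continue the run that began in an earlier chunk.
                    run = (carry[0], carry[1] + end)
                    carry = None
                else:
                    run = (offset + start, end - start)
                if end == len(chunk):
                    carry = run  # may continue into the next chunk
                elif run[1] >= min_length:
                    runs.append(run)
            offset += len(chunk)
    if carry is not None and carry[1] >= min_length:
        runs.append(carry)
    return runs
```

Reading in chunks keeps memory use flat on large AV files, and the `carry` value joins runs that straddle a chunk boundary.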

kieranjol Aug 16 '18 16:08

Might as well roll the filename character/length check in here too, as a sort of pre-ingest validation prior to sipcreator. Maybe it's the first thing sipcreator does?
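A filename check along those lines could be sketched as follows. The thread does not say which characters or lengths should be allowed, so the policy here (ASCII letters, digits, dot, underscore, hyphen; at most 255 characters per name) is purely an assumption, as are the names `name_problems` and `filename_problems`.

```python
import os
import re

# Assumed policy, not taken from IFIscripts.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9._-]+")
MAX_NAME_LENGTH = 255  # assumed; a common per-name limit on many filesystems


def name_problems(name):
    """Return a list of policy violations for a single file or folder name."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append("name too long")
    if not ALLOWED_NAME.fullmatch(name):
        problems.append("unexpected characters")
    return problems


def filename_problems(root):
    """Walk `root` and return (path, problem) pairs for every violation."""
    found = []
    for directory, subdirs, files in os.walk(root):
        for name in subdirs + files:
            for problem in name_problems(name):
                found.append((os.path.join(directory, name), problem))
    return found
```

Run as the first pre-ingest step, an empty result from `filename_problems` would let the rest of the validation proceed.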

kieranjol Aug 16 '18 16:08

👀

mcampos-quinn Aug 16 '18 16:08