Disk image processor won't process more than one img at a time

Open KyleDennis03 opened this issue 6 years ago • 4 comments

Hello, when I try to use Disk Image Processor to process a batch of .img files made from floppy disks, it only processes one image and makes one SIP, instead of making multiple individual SIPs. What I have tried so far is to isolate all of the .img files in their own folder (meaning no log files, no other folders, just the .img files) and select this folder as the directory to be processed. Before I process, I do not select any of the options offered at the bottom of the window (bag SIPs, run bulk_extractor, etc.). When I click Start scan, only the first .img file is processed (the first in the list when sorted by file name). Can anyone help me out with this please? Thank you!

KyleDennis03 avatar Mar 25 '19 15:03 KyleDennis03

Hi Kyle,

I think it might be your folder structure that's the issue. The Disk Image Processor expects that all of your disk images (as well as log files, photos, and any other files) are all in the same (flat) folder. It doesn't look into any subfolders, so if each .img file is in its own folder it won't pick them up.

If you put all of your disk images in the same folder, it should process each one of them in turn.
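If the images currently sit in per-disk subfolders, a small script can gather them into one flat folder first. This is only a sketch: `flatten_images` and its parameters are hypothetical names, not part of Disk Image Processor, and it assumes the target folder lives outside the nested source folder.

```python
import shutil
from pathlib import Path


def flatten_images(nested_dir, flat_dir, pattern="*.img"):
    """Copy every file matching `pattern` under nested_dir into flat_dir.

    Returns the copied file names. Raises FileExistsError rather than
    overwriting when two subfolders hold files with the same name.
    """
    flat = Path(flat_dir)
    flat.mkdir(parents=True, exist_ok=True)
    copied = []
    # sorted() collects all matches before any copying starts
    for src in sorted(Path(nested_dir).rglob(pattern)):
        if not src.is_file():
            continue
        target = flat / src.name
        if target.exists():
            raise FileExistsError(f"{target} already exists; rename {src} first")
        shutil.copy2(src, target)  # copy2 also preserves timestamps
        copied.append(target.name)
    return copied
```

Pass a different `pattern` (or call it a second time) if log files or photos need to travel with the images.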

tw4l avatar Apr 05 '19 13:04 tw4l

Hi Tim, thanks for your help! However, I just tried that, and what ended up happening is that Disk Image Processor processed the first 20 disk image/log file pairs out of the total 80 and then just quit processing. When I looked at the log file (see attached), it appeared as though Disk Image Processor either can't recognize or work with the FAT12 file system, or doesn't recognize that the file is a disk image at all, and skips it. Either way, no files are ever extracted. After that, I tried processing the unprocessed image files in a folder separate from the original one, and I wound up right back where I started, with Disk Image Processor processing just the first image and nothing else. Fortunately, the one SIP created that way did yield actual files from the image like it is supposed to. Do you have any ideas? I have tried pretty much everything that I can think of. Thanks again for your help though, I know you're busy and I appreciate it!

diskimageprocessor.log https://github.com/CCA-Public/diskimageprocessor/files/3233459/diskimageprocessor.log

KyleDennis03 avatar May 29 '19 16:05 KyleDennis03

Hi Kyle,

My best guess is that there's something about disk image ARCH281911.img (or the next one) that's causing the script to crash - otherwise, even if it was unable to carve files for a disk, it should still continue on to the next file. The Disk Image Processor should usually be able to export files from disk images with a FAT file system, as the Sleuth Kit's tsk_recover is quite good at that.

My next debugging step would be to call the script from the command line against ARCH281911.img and the image after it, so that you can see the terminal output and diagnose. So first isolate just those two disk image files in a new folder (let's call it "/home/bcadmin/Desktop/test_folder" for now).

The following terminal command should work:

python3 /usr/share/ccatools/diskimageprocessor/diskimageprocessor.py /home/bcadmin/Desktop/test_folder /home/bcadmin/Desktop/test_out

If my guess is correct, you'll receive an error message and the script will crash. If so, go ahead and copy the terminal output into a message and I'd be happy to take a look. In the past we ran into some issues with non-UTF8 characters in the Siegfried CSV which caused crashes - I tried fixing that, but it's possible that issue has resurfaced?
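To illustrate the kind of failure meant by non-UTF8 characters: opening a CSV as strict UTF-8 raises UnicodeDecodeError on the first stray byte, while a tolerant read substitutes a placeholder character and carries on. The sketch below shows the general technique only; it is not necessarily how diskimageprocessor.py reads the Siegfried output, and `read_csv_tolerant` is a hypothetical helper name.

```python
import csv


def read_csv_tolerant(path):
    """Read a CSV into a list of dicts without crashing on non-UTF8 bytes.

    errors="replace" swaps each undecodable byte for U+FFFD instead of
    raising UnicodeDecodeError part-way through the file.
    """
    with open(path, newline="", encoding="utf-8", errors="replace") as f:
        return list(csv.DictReader(f))
```

The trade-off is that affected file names come back with replacement characters in them, so they no longer match the names on disk exactly.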

tw4l avatar Jun 03 '19 16:06 tw4l

Hi Tim, thanks so much for your ideas. Stefana and I looked into the problem and, after discovering that the images cannot be mounted, we determined that the issue is simply that the disk images are unreadable because the floppy disks were in poor condition at the time of imaging. Since Disk Image Processor stops processing once it hits a disk image it cannot process, a single batch run can't tell me which images are readable and which ones aren't.

The solution we found was to just brute-force it and process each disk image individually to see which ones work. We won't bother ingesting the unreadable images into Archivematica, since we figure that if they're unusable now, they'll probably still be unusable years from now. However, as a last resort, I'll try to mount some of the disks directly to see if they're at all usable, as opposed to working from the disk image. I don't have high expectations for this to work, though.
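For anyone else who ends up here, the one-image-at-a-time approach could be scripted along these lines. This is a rough sketch under several assumptions: the script path is the one from the command earlier in this thread, the script is assumed to accept a destination folder that does not exist yet, and `process_individually` and `base_cmd` are hypothetical names. Only the image itself is copied into each working folder, so matching log files would need to be added by hand.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Path taken from the command earlier in this thread; adjust if installed elsewhere.
PROCESSOR = "/usr/share/ccatools/diskimageprocessor/diskimageprocessor.py"


def process_individually(source_dir, out_dir, base_cmd=None):
    """Run the processor once per .img file; return {image name: exit code}."""
    if base_cmd is None:
        base_cmd = ["python3", PROCESSOR]
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    results = {}
    for image in sorted(Path(source_dir).glob("*.img")):
        # Each image gets its own flat input folder, so a crash on one
        # image cannot stop the remaining images from being attempted.
        with tempfile.TemporaryDirectory() as work:
            shutil.copy2(image, work)
            dest = out / image.stem  # assumption: the script creates this folder
            proc = subprocess.run(base_cmd + [work, str(dest)])
            results[image.name] = proc.returncode
    return results
```

A non-zero exit code points to a crash on that image. An exit code of zero does not prove that files were extracted, so the per-image output folders and logs still need a look.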

In any case, thanks so much for taking the time to do this, we all really appreciate it!

KyleDennis03 avatar Jul 04 '19 18:07 KyleDennis03