
Reliability problems, super slow download

Open observeroftime02 opened this issue 5 years ago • 4 comments

I'm running 4scanner to monitor /vg/ and download threads with specific keywords.

At first it appeared to be working. The script creates the folders for the threads and starts downloading. It gets maybe 150 images into a thread, and then doesn't download anything else for a long time. There are no error messages or any other output on the console.

Once it gets to that point, maybe 2 or 3 new images appear every 5 minutes or so, even though there are hundreds of other images left to download. When a new thread appears, it adds that thread to the download list but doesn't download any images for it. From then on, no further images are downloaded, and even calling 4downloader with the direct thread URL doesn't download anything. At that point, deleting the .4scanner directory is the only thing that makes it download images again.

In general, downloading the actual images is excruciatingly slow, maybe 1 image every other second. I've tried something like the 4chan image scraper script, and it downloads an entire thread's worth of images in under a minute.

Python version is 3.6.9. I'm not sure if there are settings I can change? Any ideas? Here's the config I use:

{
    "searches": [
        {
            "imageboard": "4chan",
            "board": "vg",
            "keywords": [
                "fgog",
                "fgoalter"
            ],
            "folder_name": "dl_dir"
        }
    ]
}

observeroftime02 · Mar 07 '20 23:03

Hey! I have not seen that behaviour where it starts slowing down after some time. If deleting the .4scanner directory helps, it may be a bug in the duplicate image checking feature. For now, try turning it off in your searches with "check_duplicate": false while I take a look, and let me know if it helps. As for the 1 second per image, this is expected, as I am trying not to hammer imageboards with downloads. In theory it is 1 image/sec per thread, so if you are downloading 10 threads it should be 10 images/sec. If you want, I could make it configurable so that you could set it to something like 0.5 seconds (or even 0 seconds).
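To illustrate the per-thread pacing described above, here is a minimal Python sketch. It is not 4scanner's actual code: `download_thread`, `download_all`, `expected_rate`, and the `fetch` callback are hypothetical names, and running one worker per imageboard thread is an assumption made only to show why the aggregate rate grows with the number of threads.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def download_thread(image_urls, throttle=1.0, fetch=lambda url: None):
    """Fetch every image of one imageboard thread, pausing after each one."""
    downloaded = 0
    for url in image_urls:
        fetch(url)            # hypothetical stand-in for the real HTTP download
        downloaded += 1
        time.sleep(throttle)  # per-thread pause; 0 means no artificial delay
    return downloaded


def download_all(threads, throttle=1.0):
    """Run one worker per imageboard thread so the pauses overlap."""
    with ThreadPoolExecutor(max_workers=max(1, len(threads))) as pool:
        counts = pool.map(lambda urls: download_thread(urls, throttle), threads)
        return sum(counts)


def expected_rate(thread_count, throttle):
    """Theoretical images/sec, ignoring the time the downloads themselves take."""
    return float("inf") if throttle == 0 else thread_count / throttle
```

Under these assumptions, `expected_rate(10, 1.0)` gives 10 images/sec for ten threads, while a single thread stays at 1 image/sec no matter how many images it contains.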

pboardman · Mar 08 '20 05:03

Thank you for your reply, I'm going to try that out tomorrow and let you know how it goes 👍

observeroftime02 · Mar 08 '20 05:03

Perfect, just let me know when you test it out!

Also, I just added a throttle option that you can add to your searches. Setting it to "throttle": 0 should make downloads faster but put more strain on the imageboard. You can update 4scanner with: pip3 install -U 4scanner
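For reference, a search entry with both options mentioned in this thread might look like the sketch below. It is written in Python only so that the result can be dumped as JSON. The board, keywords, and folder name are copied from the config in the original post, and placing "check_duplicate" and "throttle" directly inside the search entry is an assumption based on the wording "add to your searches".

```python
import json

# Search entry copied from the original post, plus the two options
# discussed in this thread. Key placement is an assumption.
config = {
    "searches": [
        {
            "imageboard": "4chan",
            "board": "vg",
            "keywords": ["fgog", "fgoalter"],
            "folder_name": "dl_dir",
            "check_duplicate": False,  # disable duplicate image checking
            "throttle": 0,             # no pause between image downloads
        }
    ]
}

# json.dumps converts Python's False into JSON's lowercase false.
config_json = json.dumps(config, indent=4)
print(config_json)
```

A value of 0 removes the delay entirely, so something in between (such as the 0.5 suggested earlier) may be a gentler choice when many threads are being downloaded at once.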

pboardman · Mar 08 '20 20:03

The same thing happened to me. Is it possible that they put some restriction on my IP? I got a little greedy with the downloading (16 threads with "throttle": 0). The other 4chan single-thread downloader I am using works just fine, though. Neither deleting the .4scanner directory nor setting "check_duplicate": false resolved the issue.

Edit: a complete reinstall with temp file deletion got it working normally again.

th3illu · May 14 '20 16:05