warcbase
Memory Issues on Large WARC Files
I've been tinkering with @dportabella's issue #246, as we also have some very large WARCs in a collection (e.g. some of 7 GB, others of 40, 50, or 60 GB). We do run into Java heap space issues with large WARC files.
Most of our development has focused on standard-size Archive-It files, i.e. ~1 GB, but it looks like there are lots of larger ones out there.
Is there any tweak we can make to loadArchives to better parse large WARC files?