PoisonedRAG How long does it take to evaluate on MS-MARCO?

Evaluating on MS-MARCO seems to take significantly more time than NQ or HotpotQA; it just hangs at this point:

Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards: 50%|█████ | 1/2 [00:27<00:27, 27.73s/it]

I was wondering if the authors (or anyone else running the repo) encountered similar issues? Can this be resolved by waiting things out?

Sep 07 '24 05:09 c0ding4ever

Hi, thanks for pointing this out. How is everything going now? Has it been resolved? Based on the information you provided, I think you are stuck at loading the model checkpoint, which is not related to the MS-MARCO dataset.
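If it helps to confirm where the process is actually stuck, here is a minimal sketch using only Python's standard `faulthandler` module. The `watchdog` helper and the stage names are hypothetical and not part of the PoisonedRAG repo; you would wrap whichever calls in your script load the model and the dataset.

```python
import faulthandler
import sys
import time
from contextlib import contextmanager


@contextmanager
def watchdog(stage, timeout_s=600.0, out=sys.stderr):
    """Dump the tracebacks of all threads every `timeout_s` seconds
    for as long as the wrapped stage is still running."""
    print(f"[{stage}] started", file=out, flush=True)
    start = time.perf_counter()
    # `out` must be a real file with a file descriptor (e.g. stderr or a log file).
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=out)
    try:
        yield
    finally:
        faulthandler.cancel_dump_traceback_later()
        elapsed = time.perf_counter() - start
        print(f"[{stage}] finished in {elapsed:.1f}s", file=out, flush=True)


# Hypothetical usage; replace the bodies with the real calls from your script:
# with watchdog("load model"):
#     ...  # model loading goes here
# with watchdog("load MS-MARCO"):
#     ...  # dataset loading goes here
```

If a stage never prints "finished", the periodic tracebacks should show which call it is blocked in.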

Sep 09 '24 17:09 sleeepeer

No, unfortunately the problem persists. I made several attempts, and every time it ran for more than 10 hours on an A6000 and terminated while still loading (as shown above).

Sep 10 '24 00:09 c0ding4ever

That looks strange... Do NQ and HotpotQA work well? Do you only have this issue with MS-MARCO?

Sep 10 '24 17:09 sleeepeer

Yes, NQ and HotpotQA both finished running. Only MS-MARCO didn't work.

Sep 10 '24 22:09 c0ding4ever

Hi, I replicated the MS-MARCO experiments and they run fine. My device is a single A100 with 80 GB of VRAM and 1 TB of RAM. Maybe the issue comes from the larger memory usage of MS-MARCO.
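One rough way to check the memory hypothesis is to look at how much host RAM is still available while the script is running. The sketch below assumes a Linux machine with `/proc/meminfo`; the function names are made up for illustration and are not part of the repo.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number}.

    Sizes are reported by the kernel in kB; unitless counters
    (e.g. HugePages_Total) are kept as raw numbers.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def available_ram_gib(path="/proc/meminfo"):
    """Return the kernel's MemAvailable estimate in GiB (Linux only)."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return info["MemAvailable"] / (1024 ** 2)


# Hypothetical usage: print this before and after loading the dataset.
# print(f"available RAM: {available_ram_gib():.1f} GiB")
```

If the available RAM drops close to zero while MS-MARCO is loading, the machine is probably swapping or about to hit the OOM killer, which could look like a hang.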

Sep 13 '24 18:09 sleeepeer

Yes, that might be the case. Thank you for looking into this!

I do not have access to an A100, so I cannot test it. Maybe someone else working with MS-MARCO could verify?

Sep 17 '24 16:09 c0ding4ever

Yeah, I think we can leave this issue open to see if someone can help.

Sep 23 '24 03:09 sleeepeer