J Mackenzie
I usually use Anserini to build CIFF indexes that I import to other tools like PISA, but I almost always forget to use `-optimize` when I index. So this tool...
I'm not sure why these tests fail - I can't seem to generate any meaningful output. It seems the VM is crashing (I get "The forked VM terminated without properly...
I suspect `id` should be treated as a string, and that we should map them. I don't think we should assume that `id` comes as a sorted integer indexed from...
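To make the mapping idea concrete, here is a minimal sketch, assuming ids arrive as arbitrary strings in no particular order. `build_id_map` is a hypothetical helper written for illustration; it is not part of Anserini, CIFF, or PISA.

```python
from typing import Dict, Iterable, List, Tuple


def build_id_map(external_ids: Iterable[str]) -> Tuple[Dict[str, int], List[str]]:
    """Assign dense internal ids (0, 1, 2, ...) in first-seen order.

    Hypothetical helper: makes no assumption that ids are integers,
    sorted, or contiguous.
    """
    to_internal: Dict[str, int] = {}
    to_external: List[str] = []
    for ext in external_ids:
        key = str(ext)  # treat every id as an opaque string
        if key not in to_internal:
            to_internal[key] = len(to_external)
            to_external.append(key)
    return to_internal, to_external


# Ids that are unsorted, mixed in style, and contain a duplicate.
to_internal, to_external = build_id_map(["doc-b", "7", "doc-a", "7"])
```

Keeping both directions (`to_internal` for ingestion, `to_external` for reporting results) means the original string ids can always be recovered from the dense integers.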
I'm working on this now - Should I create a CIFF, or a PISA canonical? It might be easier to go right from canonical?
So basically we will do:
* Download CIFF
* CIFF2PISA (? right?) to get the canonical format files
* Index the canonical files

I am suggesting instead we:
* Download...
Neat, that's great. In that case, I'll send a link to a CIFF soon. I'm hosting it on UQ's cloud service, I just need to wait for it to propagate....
URL: https://cloud.rdm.uq.edu.au/index.php/s/kdq5Lts5RafBzqF Password: `sigirpadua2025` I need to figure out how we can download this directly via wget or similar - I had a quick tinker but couldn't get it to...
Here is the LSR (100 query sample): https://www.dropbox.com/scl/fi/fkqso0vv8mwq9yaajf4pj/lsr-small.zip?rlkey=ff212z3i0d0y0n45kd5wiekzc&st=1i9clk1h&dl=0
Leaving this issue open - although these are now complete, we need a more permanent home for them.
If there's interest, we can design a positional inverted index. Even easier, though, is to integrate a forward index and handle phrase matching as a two-stage retrieval process. I'm not...
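A rough sketch of the two-stage idea, assuming a plain non-positional inverted index (term to set of document ids) plus a forward index (document id to token list). All names here (`build_indexes`, `contains_phrase`, `phrase_search`) are hypothetical and only illustrate the approach; they are not taken from any existing tool.

```python
from typing import Dict, List, Set, Tuple


def build_indexes(docs: List[List[str]]) -> Tuple[Dict[str, Set[int]], List[List[str]]]:
    """Build a non-positional inverted index; the forward index is the token lists."""
    inverted: Dict[str, Set[int]] = {}
    for doc_id, tokens in enumerate(docs):
        for term in tokens:
            inverted.setdefault(term, set()).add(doc_id)
    return inverted, docs


def contains_phrase(tokens: List[str], phrase: List[str]) -> bool:
    """Check whether the phrase occurs as a contiguous run of tokens."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def phrase_search(phrase: List[str],
                  inverted: Dict[str, Set[int]],
                  forward: List[List[str]]) -> List[int]:
    if not phrase:
        return []
    # Stage 1: conjunctive (AND) retrieval, no positions needed.
    postings = [inverted.get(term, set()) for term in phrase]
    candidates = set.intersection(*postings)
    # Stage 2: verify the phrase against the forward index.
    return sorted(d for d in candidates if contains_phrase(forward[d], phrase))


docs = [["new", "york", "city"], ["york", "new"], ["new", "jersey"]]
inverted, forward = build_indexes(docs)
hits = phrase_search(["new", "york"], inverted, forward)
```

In this toy collection, documents 0 and 1 both survive the first stage because each contains both terms, but only document 0 has them adjacent and in order. The cost is a forward-index lookup per candidate, which is the trade-off against storing positions in the postings.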