Michał Siedlaczek

155 comments by Michał Siedlaczek

@JMMackenzie I'm wondering about the `id` field here. Do you know that this is in fact a number? I suppose this may be a format that Anserini uses internally, and...

Merged this so that I can move forward with other work. @amallia if you have any comments, please open a ticket (or tickets) and we can address them there.

That's a good question. It might actually be easier to have CIFF and create all of the metadata files with the new script. Otherwise we'll need to either use low-level...

The Docker image I'm creating for the tutorial has `ciff2pisa`, so that's not a problem. I've tested running it, which looks something like `pisa index ciff file.ciff ...`, which then does...

Hi, could you clarify in the PR description what this change is addressing?

@oxnz sounds great. We're stretched a bit thin, so these types of things are often left unresolved, and this kind of effort is always appreciated. Re: boost, I am aware...

Hi @jjsaas This is a good question. Unfortunately, I think there's no shortcut here.

> Would treating a phrase as a term be sufficient?

It's not that simple. In theory,...

I think the two-stage approach is neither interesting nor that useful, because it doesn't really do _the same thing_, does it? To clarify, you mean: get results, filter out based on...

What do you mean by "check documents via forward index"? I assumed you meant after retrieval, hence my point about filtering. On the other hand, if we do it _during_...

Ah, ok, I get it now; yes, I was thinking top-k, but if you do a full conjunction and then filter, that would work. The problem is that our non-ranked...
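To illustrate the "full conjunction, then filter" idea from the exchange above, here is a minimal sketch in plain Python. It is not PISA code and uses no PISA API: the dictionary-based `inverted_index` and `forward_index`, and the function names, are all hypothetical stand-ins chosen for the example. It assumes the forward index stores each document's terms in their original order.

```python
def build_inverted_index(forward_index):
    """Map each term to the set of document IDs that contain it (toy, in-memory)."""
    inverted = {}
    for doc_id, terms in forward_index.items():
        for term in terms:
            inverted.setdefault(term, set()).add(doc_id)
    return inverted


def conjunction(inverted_index, terms):
    """Return the sorted IDs of documents that contain every query term (unranked)."""
    postings = [inverted_index.get(term, set()) for term in terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


def contains_phrase(doc_terms, phrase):
    """Check whether `phrase` occurs as a contiguous run inside `doc_terms`."""
    phrase = list(phrase)
    n = len(phrase)
    return any(doc_terms[i:i + n] == phrase for i in range(len(doc_terms) - n + 1))


def phrase_search(inverted_index, forward_index, phrase):
    """Stage 1: full conjunction. Stage 2: filter candidates via the forward index."""
    candidates = conjunction(inverted_index, phrase)
    return [d for d in candidates if contains_phrase(forward_index[d], phrase)]


# Hypothetical toy collection: document ID -> terms in document order.
forward_index = {
    0: ["new", "york", "city"],
    1: ["york", "is", "new"],
    2: ["new", "jersey"],
}
inverted_index = build_inverted_index(forward_index)
matches = phrase_search(inverted_index, forward_index, ["new", "york"])
```

Here documents 0 and 1 both pass the conjunction, but only document 0 contains the terms adjacently and in order. Filtering a top-k result list instead of the full conjunction would not be equivalent, since documents containing the phrase could fall outside the top k.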