Mehrad Moradshahi

12 comments by Mehrad Moradshahi

Hi, are there any updates on this issue? I've implemented a similar beam search strategy that uses the _backtrack(...) function from this repo, but even with a beam_size of 1,...

@gcampax Also, is there anything else here? Can we merge it?

> Do we even need this PR, given we're not doing translation at test time?

I think having a translation endpoint wouldn't hurt even if we don't use it immediately. For...

From [this post](https://discuss.huggingface.co/t/how-to-use-bertmodel/6177), it seems passing `local_files_only=True` when loading the model works too.

Should we start saving the HF config files, then? We could then set `local_files_only` to True only if a config file is detected in `--path`, and to False otherwise (for backward compatibility).
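A minimal sketch of that check, assuming the saved config uses the Hugging Face default file name `config.json`. The helper name `resolve_local_files_only` and the variable `model_name_or_path` are hypothetical and not taken from the repo:

```python
import os

# Assumption: the config is saved under the Hugging Face default file name.
HF_CONFIG_NAME = "config.json"


def resolve_local_files_only(path: str) -> bool:
    """Return True only if an HF config file is found directly inside `path`.

    Older checkpoint directories without a saved config yield False, so
    loading can still fall back to downloading (backward compatibility).
    """
    return os.path.isfile(os.path.join(path, HF_CONFIG_NAME))


# Hypothetical usage (requires `transformers`; not executed here):
# from transformers import AutoModel
# model = AutoModel.from_pretrained(
#     model_name_or_path,
#     local_files_only=resolve_local_files_only(path),
# )
```

The flag is derived from the directory contents rather than exposed as a user option, so existing commands would keep working unchanged.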

`truncation` is used only for the token classification task (where input words and labels need to be aligned) but not for the general encoding that happens in the `encode_batch` method. I think...

Alternatively, we can make truncation optional and add a flag for it so the user can decide what to do. That said, I prefer the first approach, to avoid silent bugs.
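For illustration only, flag-controlled truncation could look roughly like the sketch below. The function `fit_to_length` and its `truncate` parameter are hypothetical names, not part of the project's code, and raising an error when the flag is off is an assumption about how to keep over-long inputs from passing unnoticed:

```python
from typing import List


def fit_to_length(tokens: List[str], max_len: int, truncate: bool = False) -> List[str]:
    """Return `tokens` limited to `max_len` items.

    With truncate=True the tail is dropped; with truncate=False an
    over-long input raises instead of being shortened silently.
    """
    if len(tokens) <= max_len:
        return tokens
    if truncate:
        return tokens[:max_len]
    raise ValueError(
        f"Input has {len(tokens)} tokens but max_len is {max_len}; "
        "enable truncation explicitly to shorten it."
    )
```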

I'm afraid we need to add more assignees then! Addressing disk space requires more than surface changes in the bootleg code.

Yes, I'm very much in favor of it. It's more intuitive to index, plus bootleg uses jsonl, so I can drop the conversion between jsonl and tsv.
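A small sketch of the indexing difference, using made-up records; the field names `id` and `sentence` are assumptions for illustration and not the project's actual schema:

```python
import json

# The same two hypothetical records in both formats.
jsonl_text = (
    '{"id": "1", "sentence": "hello world"}\n'
    '{"id": "2", "sentence": "good morning"}\n'
)
tsv_text = "1\thello world\n2\tgood morning\n"

# jsonl: one JSON object per line, fields accessed by name.
jsonl_rows = [json.loads(line) for line in jsonl_text.splitlines() if line]
jsonl_sentences = [row["sentence"] for row in jsonl_rows]

# tsv: fields accessed by column position, which the reader must remember.
tsv_rows = [line.split("\t") for line in tsv_text.splitlines() if line]
tsv_sentences = [row[1] for row in tsv_rows]
```

Both lists hold the same sentences; the jsonl version simply names the field it reads.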