OpenKiwi
Open-Source Machine Translation Quality Estimation in PyTorch
After training a model, a `train_config.yaml` is generated with the parameters used during that training. Under `data -> inputs` there are some references to the `source_pos` and `target_pos` fields....
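For orientation, a hypothetical sketch of what those fields could look like in the generated config. The key nesting and file paths below are illustrative assumptions, not copied from an actual OpenKiwi run:

```yaml
data:
  inputs:
    source: data/train.src          # source-side sentences
    target: data/train.tgt          # target-side sentences
    source_pos: data/train.src.pos  # POS tags for the source side (optional)
    target_pos: data/train.tgt.pos  # POS tags for the target side (optional)
```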
I was trying to pretrain a PredictorEstimator model on parallel data, as shown in your [documentation](https://unbabel.github.io/OpenKiwi/usage.html#training-and-pretraining), but I could not find any **data** YAML configuration file to use as...
Solves #104
Considering that these models have their own tokenization and BPE models, what is the format of the input files for training a QE model with any of these LMs? Should...
**Describe the bug** On Windows 10, when I try to `train_from_file` in my Jupyter notebook, it fails during the "Validation sanity check" with PicklingError: "Can't pickle : attribute lookup InputFields[PositiveInt] on kiwi.data.encoders.wmt_qe_data_encoder...
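A likely cause, sketched without the framework: Windows uses the `spawn` start method for DataLoader workers, which pickles everything handed to a worker, and classes created at runtime (such as a generic parameterized on the fly) have no importable module-level name, so pickling their instances fails with exactly this kind of error. A minimal, framework-free illustration:

```python
import pickle

def make_class():
    # A class defined inside a function has no importable module-level
    # name, so pickle cannot serialize its instances by reference -- the
    # same failure mode spawned DataLoader workers hit on Windows.
    class Dynamic:
        pass
    return Dynamic

obj = make_class()()
try:
    pickle.dumps(obj)
    result = "pickled ok"
except (pickle.PicklingError, AttributeError) as exc:
    result = "pickling failed: " + type(exc).__name__
print(result)
```

A common workaround in such cases is to run DataLoaders with zero worker processes so that nothing has to cross a process boundary.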
**Describe the bug** I had already installed OpenKiwi in a conda env. When I import kiwi, I get this exception: ImportError: cannot import name 'SAVE_STATE_WARNING' from 'torch.optim.lr_scheduler' **To...
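This ImportError arises because newer PyTorch releases removed the private `SAVE_STATE_WARNING` constant from `torch.optim.lr_scheduler`; pinning an older `torch` is the usual fix. As a stopgap, a guarded import of the same shape works. This is a sketch, and the fallback string is a placeholder, not PyTorch's original text:

```python
try:
    # Private constant removed from newer PyTorch releases.
    from torch.optim.lr_scheduler import SAVE_STATE_WARNING
except ImportError:
    # Also triggers when torch itself is absent; the value is only
    # ever interpolated into a warning message.
    SAVE_STATE_WARNING = ""
```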
**Describe the bug** `pkgutil.iter_modules` breaks if a `Path` is passed instead of a `str` on some versions of Python. This is due to a regression in Python. See the original...
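A minimal sketch of the workaround: converting the `Path` to `str` before handing it to `pkgutil.iter_modules` sidesteps the regression on affected Python versions.

```python
import pkgutil
import tempfile
from pathlib import Path

# Create a directory containing one module file.
tmp = Path(tempfile.mkdtemp())
(tmp / "example_mod.py").write_text("x = 1\n")

# Passing str(tmp) rather than the Path object itself avoids the
# regression in affected Python versions.
names = [m.name for m in pkgutil.iter_modules([str(tmp)])]
print(names)
```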
After I trained my model successfully, the results report shows an ensemble row. Does anyone know whether it actually performs ensembling? Thank you.
When training the XLM-Roberta-based QE system, I pre-downloaded the pretrained XLM-Roberta model from Hugging Face's library and changed the field `system.model.encoder.model_name` in `xlmroberta.yaml` from the default `xlm-roberta-base` to...
When creating an `XLMRobertaTextEncoder` object, the tokenizer name is rewritten to `xlm-roberta-base` whenever a local model path is configured, so the framework will always download the tokenizer files via...
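A hypothetical sketch of the desired behavior: reuse the configured local path for the tokenizer when it exists on disk, and fall back to the hub name only otherwise. The function name below is illustrative, not OpenKiwi's actual code:

```python
from pathlib import Path

def resolve_tokenizer_name(model_name: str,
                           default: str = "xlm-roberta-base") -> str:
    # If model_name points at a local directory (a pre-downloaded model),
    # reuse it for the tokenizer instead of forcing a hub download.
    if Path(model_name).is_dir():
        return model_name
    return default

print(resolve_tokenizer_name("xlm-roberta-base"))
```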