Error while creating predictions on heldout dataset

Open iamsimha opened this issue 3 years ago • 4 comments

Steps to reproduce:

  1. Create a new dataset using the create_hf_dataset.py script.
  2. In the config, point to your finetuned model and the new dataset. We are using an XLMR model.

Running torchrun --nproc_per_node=1 scripts/predict.py -c examples/xlmr_base_test_20220411.yml

throws the error below.

Traceback (most recent call last):
  File "/local/home/desktop/Experiments/massive/scripts/predict.py", line 112, in <module>
    main()
  File "/local/home/desktop/Experiments/massive/scripts/predict.py", line 102, in main
    outputs = trainer.predict(test_ds, tokenizer=tokenizer)
  File "/home/desktop/Experiments/massive/src/massive/utils/trainer.py", line 188, in predict
    output = self.evaluate(
  File "/home/desktop/Experiments/massive/src/massive/utils/trainer.py", line 142, in evaluate
    output = eval_loop(
  File "/home/desktop/anaconda3/envs/massive/lib/python3.9/site-packages/transformers/trainer.py", line 2314, in evaluation_loop
    for step, inputs in enumerate(dataloader):
  File "/home/desktop/anaconda3/envs/massive/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 652, in __next__
    data = self._next_data()
  File "/home/desktop/anaconda3/envs/massive/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 692, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/desktop/anaconda3/envs/massive/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    return self.collate_fn(data)
  File "/home/desktop/Experiments/massive/src/massive/loaders/collator_ic_sf.py", line 64, in __call__
    label = entry['slots_num']
KeyError: 'slots_num'
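
The last frame shows the collator reading entry['slots_num'] from each record, so one way to narrow this down might be to count how many records lack that key before calling predict. The following is only a sketch on plain dictionaries: count_missing_keys is a hypothetical helper, not part of the massive repo, and the toy records (including the id and utt field names) are made up for illustration.

```python
from collections import Counter


def count_missing_keys(records, required_keys):
    """Count, for each required key, how many records do not contain it."""
    missing = Counter()
    for record in records:
        for key in required_keys:
            if key not in record:
                missing[key] += 1
    return dict(missing)


# Toy records standing in for the held-out dataset (illustrative values only).
records = [
    {"id": "0", "utt": "wake me up at nine", "slots_num": [0, 0, 0, 0, 1]},
    {"id": "1", "utt": "play some jazz"},  # no 'slots_num', as in the KeyError
]

print(count_missing_keys(records, ["slots_num"]))
```

If the count is nonzero for a key the collator reads, the dataset on disk is probably the place to look rather than the model config.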

iamsimha avatar Jul 31 '22 10:07 iamsimha

~~Hi @iamsimha , greetings. To resolve this error, you must point to the numerical mapping for your slots. EX: https://github.com/alexa/massive/blob/0d474f326086d01fa320e081e12a7cea5950cfe3/examples/mt5_base_t2t_mmnlu_20220720.yml#L34~~

jgmf-amazon avatar Aug 04 '22 17:08 jgmf-amazon

~~Please let us know if that works. Thanks.~~

jgmf-amazon avatar Aug 04 '22 17:08 jgmf-amazon

Ah, wait, maybe I read your traceback too quickly. Let me check into this a little further.

jgmf-amazon avatar Aug 04 '22 17:08 jgmf-amazon

So in my local version of the huggingface-ified evaluation data, created using scripts/create_hf_dataset.py, each record has a slots_str key with an empty value. This must be absent in your version of the evaluation data, right? The options are to either (A) add it to yours or (B) make a code change so that the collator, etc., can work without it. Option B is the better long-term solution, but I'm not sure if we'll have bandwidth on our side in the near term. Please let us know if Option A is workable. Thanks!
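
For anyone trying Option A, a rough sketch of filling in absent keys with placeholder values could look like the following. It works on plain dictionaries; add_missing_fields and DEFAULTS are hypothetical names, not part of the massive repo. Which keys to add and what placeholder each should hold are assumptions to check against a dataset produced by scripts/create_hf_dataset.py (the traceback names slots_num, while the comment above mentions slots_str).

```python
import copy


def add_missing_fields(record, defaults):
    """Return a copy of `record` with any absent keys filled in from `defaults`."""
    patched = dict(record)
    for key, value in defaults.items():
        if key not in patched:
            # Deep-copy so records never share one mutable placeholder object.
            patched[key] = copy.deepcopy(value)
    return patched


# Assumption: an empty string placeholder, as described in the comment above.
DEFAULTS = {"slots_str": ""}

heldout = [{"id": "0", "utt": "play some jazz"}]  # illustrative record
patched = [add_missing_fields(record, DEFAULTS) for record in heldout]
print(patched[0])
```

With a Hugging Face datasets.Dataset, a per-record function of this shape can be applied through Dataset.map, for example by binding the defaults with a lambda. Existing values are left untouched, so running it on data that already has the key should be harmless.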

jgmf-amazon avatar Aug 04 '22 18:08 jgmf-amazon