Amol Lele
Amol Lele
> How is this logic being tested? This change we may not want to merge. It made to enable the spot training testing.
In the test script the following tokenizer function when invoked while mapping the dataset changes the datatype of 'os.environ' from 'os._Environ' to 'dict' tokenizer = AutoTokenizer.from_pretrained(model_checkpoint, use_fast=True) This causes get()...
IMO we should file an issue with transformers package.
The fix is checkedin in for 1.6 which avoids registering hook to non-training activities. It is currently under review.
@marcoabreu The blc_output.txt file was not needed. I have removed it. The url_list.txt contains the list of URLs that are publicly accessible. I have added the README.md file as well.