Results 20 comments of Richard Wang

Thanks for the reply @lhoestq! I succeeded in running `datasets-cli test ./datasets/super_glue --name record --save_infos`, but as you can see, the check ran into `FAILED tests/test_dataset_cards.py::test_changed_dataset_card[super_glue] - V...`. How...

That would be neat! Let me implement it.

Hi all, using the text mirror mentioned in [the comment](https://github.com/soskek/bookcorpus/issues/24#issuecomment-556024973) above, my PR that adds BookCorpus to HuggingFace/nlp has been merged. (The txt files have been copied to their own...

@shawwn This is exciting! But I also encountered a failed download.

@alexmathfb Sorry for the late reply. That would be great!! BTW, I recommend creating an issue or a draft PR on HF/datasets; people there are willing and able to provide...

Hi @SkafteNicki, I am not clear about how `ignore_index` is handled in https://github.com/PyTorchLightning/metrics/blob/7af6d13b3c2186aacf5594793317c15a199608bb/torchmetrics/functional/classification/stat_scores.py#L111-L113 and https://github.com/PyTorchLightning/metrics/blob/7af6d13b3c2186aacf5594793317c15a199608bb/torchmetrics/functional/classification/stat_scores.py#L24-L27. Would it still work when `ignore_index` is -100?

Sure, anyone who wants to do masked language modeling will often see something like this.

```python
import torch

batch_size, sequence_length = 4, 5
mlm_logits = torch.randn(4, 5, 128)
labels = torch.tensor([
    [1234, -100, -100, -100, -100],
    [-100, -100, -100, 7567, -100],
    [-100, -100, 8900, -100, -100]...
```

Hi all. I wrote a naive solution, which makes the ignored positions count as incorrect in the comparison and then divides by the number of unignored positions. There may be a better way; this is...
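
For illustration only, the naive idea above could be sketched roughly as follows. This is not the code from the comment (which is cut off) and not torchmetrics' implementation. It is written in plain Python so it runs without PyTorch, and the function name `masked_accuracy`, the constant `IGNORE_INDEX`, and the example values are all hypothetical. It assumes predictions have already been reduced to class ids (for example by an argmax over the logits).

```python
IGNORE_INDEX = -100  # assumed sentinel, matching the -100 discussed in the thread


def masked_accuracy(preds, labels, ignore_index=IGNORE_INDEX):
    """Accuracy over positions whose label is not ignore_index.

    preds and labels are nested lists of shape (batch, sequence) holding
    class ids. Ignored positions never count as correct, and the
    denominator is the number of unignored positions.
    """
    flat_preds = [p for row in preds for p in row]
    flat_labels = [t for row in labels for t in row]

    # Denominator: how many positions actually carry a label.
    kept = sum(1 for t in flat_labels if t != ignore_index)
    if kept == 0:
        return float("nan")  # nothing to score

    # Numerator: matches at unignored positions only.
    correct = sum(
        1
        for p, t in zip(flat_preds, flat_labels)
        if t != ignore_index and p == t
    )
    return correct / kept


# Hypothetical example: three labelled positions, two predicted correctly.
labels = [
    [1234, -100, -100, -100, -100],
    [-100, -100, -100, 7567, -100],
    [-100, -100, 8900, -100, -100],
]
preds = [
    [1234, 5, 6, 7, 8],
    [1, 2, 3, 7567, 4],
    [9, 9, 42, 9, 9],
]
accuracy = masked_accuracy(preds, labels)
```

With tensors, the same logic would typically be a boolean mask `labels != ignore_index` applied to both the comparison and the count, but that mapping is left as an assumption here.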

Try this https://github.com/ymcui/Chinese-ELECTRA

Is there any update? This should be marked as important because I believe most data science and ML people write long Jupyter notebooks. And we are used to...