Martin Fajčík

6 comments by Martin Fajčík

> I also think it would be extremely helpful if the API could provide the top-k log probabilities of each predicted token. Yes, this would allow evaluating Gemini with threshold-free...
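Threshold-free evaluation from top-k log probabilities could look roughly like the sketch below, assuming the API returned the raw per-step scores. Everything here (the `top_k_log_probs` helper, plain-list logits) is illustrative, not an actual API.

```python
import math

def top_k_log_probs(logits, k=5):
    """Return the top-k (token_id, log_prob) pairs for one prediction step.

    `logits` is a plain list of unnormalised scores; a real API would
    expose one such list per generated token. Hypothetical helper.
    """
    # log-softmax: log p_i = logit_i - log(sum_j exp(logit_j)),
    # with the max subtracted first for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    scored = [(i, x - log_z) for i, x in enumerate(logits)]
    scored.sort(key=lambda t: t[1], reverse=True)
    return scored[:k]

# Toy vocabulary of 4 tokens, keep the 2 most likely
print(top_k_log_probs([2.0, 0.5, 1.0, -1.0], k=2))
```

With scores like these available, a downstream evaluator can compare candidate answers by log probability directly instead of thresholding generated text.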

Agreed, this would be very useful. Would it be possible to implement sharding for `convert_dataset_json.py`? Simply add extra parameters to specify the number of shards and the index of the current shard. The script could...
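A minimal sketch of what such sharding could look like, assuming hypothetical `--num_shards` / `--shard_index` flags (the flag names and the round-robin assignment are my assumptions, not part of `convert_dataset_json.py`):

```python
import argparse
import json

def iter_shard(jsonl_path, num_shards, shard_index):
    """Yield only the records that belong to one shard.

    Records are assigned round-robin by line number, so N copies of the
    script can each convert their own slice of the same jsonl file in
    parallel. Illustrative sketch only.
    """
    with open(jsonl_path) as f:
        for i, line in enumerate(f):
            if i % num_shards == shard_index:
                yield json.loads(line)

# Hypothetical extra CLI flags for the conversion script
parser = argparse.ArgumentParser()
parser.add_argument("--num_shards", type=int, default=1)
parser.add_argument("--shard_index", type=int, default=0)
```

Each invocation would then write its MDS output to a per-shard directory, to be merged afterwards.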

Isn't it enough to just run the script in parallel and then merge the MDS shards with this method? https://github.com/mosaicml/llm-foundry/blob/f43d1cfb1ef8f38ca90fee68b0643f45d6d5b2da/llmfoundry/utils/data_prep_utils.py#L29 Currently, I am trying it like this. I have a large jsonl file....
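Conceptually, the merge step amounts to something like the following simplified sketch: each parallel run writes into its own subdirectory, and afterwards the shard files are renumbered into the parent directory and their `index.json` entries concatenated. This mirrors the idea of the linked llm-foundry helper but is not its actual implementation.

```python
import json
import os
import shutil

def merge_mds_shards(root):
    """Merge per-group MDS outputs under `root` into one dataset.

    Simplified sketch: each subdirectory holds shard files plus an
    index.json; we move the shard files up, renumber them, and write a
    combined index.json. The real helper lives in data_prep_utils.py.
    """
    shards = []
    n = 0
    for group in sorted(os.listdir(root)):
        group_dir = os.path.join(root, group)
        if not os.path.isdir(group_dir):
            continue
        with open(os.path.join(group_dir, "index.json")) as f:
            index = json.load(f)
        for shard in index["shards"]:
            old = os.path.join(group_dir, shard["raw_data"]["basename"])
            new_base = f"shard.{n:05}.mds"
            shutil.move(old, os.path.join(root, new_base))
            shard["raw_data"]["basename"] = new_base
            shards.append(shard)
            n += 1
        shutil.rmtree(group_dir)
    with open(os.path.join(root, "index.json"), "w") as f:
        json.dump({"version": 2, "shards": shards}, f)
```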

> Isn't it enough to just run the script in parallel and then merge the MDS shards with this method?
>
> https://github.com/mosaicml/llm-foundry/blob/f43d1cfb1ef8f38ca90fee68b0643f45d6d5b2da/llmfoundry/utils/data_prep_utils.py#L29
>
> Currently, I am trying it like this....

To add more context: with @mdocekal, we found that the harness truncates the task description when using an n-shot prompt, at least for GPT-2-XL with the accelerate model. We would like to truncate the...
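The behaviour we would prefer can be sketched as follows: keep the task description intact and instead drop whole few-shot examples from the left until the prompt fits the context window. All names here (`build_prompt`, the `tokenize` callable) are illustrative, not the harness API.

```python
def build_prompt(description, shots, query, max_tokens, tokenize):
    """Fit a prompt into `max_tokens` without losing the task description.

    Drops the oldest few-shot examples first; the description and the
    query are never truncated. Sketch only, not the harness behaviour.
    """
    kept = list(shots)

    def length():
        return len(tokenize(description + "".join(kept) + query))

    while kept and length() > max_tokens:
        kept.pop(0)  # drop the oldest shot, never the description
    return description + "".join(kept) + query
```

With a whitespace tokenizer for illustration, `build_prompt("desc ", ["a b ", "c d "], "q", 4, str.split)` drops the first shot and returns `"desc c d q"`.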

We found our own "hacky" way to do what we wanted [here](https://github.com/DCGM/lm-evaluation-harness/commit/9f43d701cdcf673202744b0a1987cb109221b116#diff-28ae7c27276e965d9e744d9a91db657af52f4b0a74c2baa3976b81511f8777a5R108). The question remains whether we should try to implement such a thing properly and submit a pull request to lm-harness. We are developing...