Guanqun Yang
@martiansideofthemoon Thank you for your prompt reply! I reconfigured the whole environment using `virtualenv` (rather than `conda` as in the question). I think the style transfer on Shakespeare is runnable...
@martiansideofthemoon Thanks for your reply! I removed all globally installed copies of `fairseq`, started afresh with a newly cloned repo, and configured the environment as below. **But it seems a `fairseq` will...
@martiansideofthemoon I managed to find a workaround after some attempts. I will post my solution after my experiments.
@martiansideofthemoon Just curious, how could I also load `gpt2-small` as you did? It seems that this is not offered in the [HuggingFace model hub](https://huggingface.co/models).
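(For later readers: the 124M "small" checkpoint is hosted on the hub under the bare id `gpt2` rather than `gpt2-small`, so a minimal load with the `transformers` library looks like this:)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 "small" (124M parameters) lives under the model id "gpt2";
# only the larger variants carry a size suffix (gpt2-medium, gpt2-large, gpt2-xl).
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
```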
Thank you for your prompt response @young-geng! But after correcting the said mistake to the expected `use_fast=False`, rerunning the entire evaluation gave me the same near-random results (around 25%),...
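For context, a minimal sketch of the corrected tokenizer load (assuming the `transformers` `AutoTokenizer` API; the OpenLLaMA model card asks for the slow tokenizer):

```python
from transformers import AutoTokenizer

# OpenLLaMA requires the slow SentencePiece tokenizer; the fast one
# can mis-tokenize and drive benchmark scores toward random chance.
tokenizer = AutoTokenizer.from_pretrained(
    "openlm-research/open_llama_7b", use_fast=False
)
```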
@chi2liu I am trying to reproduce the numbers of [Open LLM Benchmark](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard), which specifies the number of shots for each task.
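For reference, a sketch of the few-shot settings the leaderboard documented for its v1 task suite (values taken from the leaderboard's About page; double-check them against the current version and your harness's task names):

```python
# Few-shot counts used by the Open LLM Leaderboard (v1 task suite).
NUM_FEWSHOT = {
    "arc_challenge": 25,
    "hellaswag": 10,
    "hendrycksTest-*": 5,   # MMLU subtasks in the old harness naming
    "truthfulqa_mc": 0,
}
```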
@young-geng Thank you for reproducing the results! Did you try to obtain the write-out `.json` files? I computed the metrics based on those files.
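In case it helps the comparison, a minimal sketch of how those files can be aggregated (assuming each record carries a per-example correctness field named `acc`; the exact schema varies across harness versions):

```python
import json
from pathlib import Path

def aggregate_accuracy(write_out_dir: str) -> dict:
    """Average a per-example `acc` field over each write-out JSON file."""
    results = {}
    for path in Path(write_out_dir).glob("*.json"):
        records = json.loads(path.read_text())
        accs = [r["acc"] for r in records if "acc" in r]
        if accs:
            results[path.stem] = sum(accs) / len(accs)
    return results
```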
@young-geng Thank you for reproducing the evaluations! It could be something subtle that causes the issues. Let me double-check and report back here. Also, are you using the same...
@young-geng It seems that I have located the issue. I am able to reproduce the reported number using the command:

```bash
python ../lm-evaluation-harness/main.py \
    --model hf-causal-experimental \
    --model_args pretrained=openlm-research/open_llama_7b,use_accelerate=True,dtype=half \
    ...
```
@kvtoraman The answer posted [here](https://stackoverflow.com/a/25027182/7784797) could serve as a workaround by skipping cases where the runtime is too long. For example, for the edge case

```
http://google.com/..........................
```

The...
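A minimal sketch of that timeout idea (Unix-only and main-thread-only, since it relies on `signal.alarm`; the function name and one-second budget here are illustrative):

```python
import re
import signal

def _on_timeout(signum, frame):
    raise TimeoutError("regex took too long")

def match_with_timeout(pattern: str, text: str, seconds: int = 1):
    """Return the match, or None if the regex exceeds the time budget."""
    signal.signal(signal.SIGALRM, _on_timeout)
    signal.alarm(seconds)          # deliver SIGALRM after `seconds`
    try:
        return re.match(pattern, text)
    except TimeoutError:
        return None                # skip pathological inputs
    finally:
        signal.alarm(0)            # always cancel any pending alarm

# Without the guard, a catastrophically backtracking URL pattern can hang on
# inputs like "http://google.com/..........................".
```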