[BUG] git clone issues due to filenames with special characters
Describe the bug
When cloning the repo, I'm encountering issues where certain test files cannot be created. Here's the error trace.
Cloning into 'lighteval'...
Warning: Permanently added 'github.com' (ED25519) to the list of known hosts.
remote: Enumerating objects: 6529, done.
remote: Counting objects: 100% (2557/2557), done.
remote: Compressing objects: 100% (1076/1076), done.
remote: Total 6529 (delta 1983), reused 1481 (delta 1481), pack-reused 3972 (from 3)
Receiving objects: 100% (6529/6529), 3.07 MiB | 8.07 MiB/s, done.
Resolving deltas: 100% (4034/4034), done.
Updating files: 100% (533/533), done.
error: unable to create file tests/reference_details/SmolLM2-1.7B-Instruct-vllm/details_agieval:lsat-rc|0_2025-11-05T14-52-08.352779.parquet: Invalid argument
error: unable to create file tests/reference_details/SmolLM2-1.7B-Instruct-transformers/details_arc:challenge|25_2025-11-05T14-43-47.148527.parquet: Invalid argument
error: unable to create file tests/reference_details/SmolLM2-1.7B-Instruct-vllm/details_arc:challenge|25_2025-11-05T14-52-08.352779.parquet: Invalid argument
error: unable to create file tests/reference_details/SmolLM2-1.7B-Instruct-vllm/details_hellaswag|10_2025-11-05T14-52-08.352779.parquet: Invalid argument
error: unable to create file tests/reference_details/SmolLM2-1.7B-Instruct-vllm/details_agieval:sat-en|0_2025-11-05T14-52-08.352779.parquet: Invalid argument
...
error: unable to create file tests/reference_details/SmolLM2-1.7B-Instruct-transformers/details_agieval:aqua-rat|0_2025-11-05T14-43-47.148527.parquet: Invalid argument
Filtering content: 100% (61/61), 65.52 MiB | 21.98 MiB/s, done.
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'
I believe this is because the filenames contain special characters like "|" and ":". For example, creating a directory with either character gives a similar error.
mkdir "example_|"
mkdir: cannot create directory ‘example_|’: Invalid argument
mkdir "example_:"
mkdir: cannot create directory ‘example_:’: Invalid argument
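To find every affected file without attempting a checkout, the tracked paths could be scanned for such characters. The following is a minimal sketch, not part of lighteval: the function name `find_unportable_paths` is hypothetical, and the character set is an assumption based on what Windows-style filesystems reject (the errors above only confirm "|" and ":").

```python
from pathlib import PurePosixPath

# Assumption: characters commonly rejected by Windows-style filesystems.
# The errors in this thread only confirm "|" and ":".
RESERVED_CHARS = set('<>:"|?*')


def find_unportable_paths(paths):
    """Return the paths that have a reserved character in any component."""
    flagged = []
    for path in paths:
        # Split on "/" so each directory and file name is checked separately.
        parts = PurePosixPath(path).parts
        if any(ch in RESERVED_CHARS for part in parts for ch in part):
            flagged.append(path)
    return flagged


if __name__ == "__main__":
    # Example input; in practice the list could come from `git ls-files`.
    sample = [
        "tests/reference_details/SmolLM2-1.7B-Instruct-vllm/"
        "details_hellaswag|10_2025-11-05T14-52-08.352779.parquet",
        "README.md",
    ]
    for path in find_unportable_paths(sample):
        print(path)
```

Feeding it the output of `git ls-files` from a machine where the checkout works would list every path that needs renaming.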
To Reproduce
git clone [email protected]:huggingface/lighteval.git
Expected behavior
git clone should work.
Version info
Linux (amd64)
hey! thanks for raising the issue. weird that this happens on Linux. do you have git lfs installed? that might be why it's not working.
hi! I do have git-lfs installed. Aren't the tests/reference_details files committed with git-lfs?
I'm only trying to download one of the files here, and it's showing the same error. Would it not be feasible to change the names of these files? In general, it's bad practice to have | or : in filenames.
wget https://github.com/huggingface/lighteval/raw/refs/heads/main/tests/reference_details/Qwen2.5-VL-3B-Instruct-vlm/details_mmmu_pro:standard-4%7C0_2025-11-05T15-23-34.026089.parquet
--2025-11-13 18:15:55-- https://github.com/huggingface/lighteval/raw/refs/heads/main/tests/reference_details/Qwen2.5-VL-3B-Instruct-vlm/details_mmmu_pro:standard-4%7C0_2025-11-05T15-23-34.026089.parquet
Resolving github.com (github.com)... 140.82.116.4
Connecting to github.com (github.com)|140.82.116.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://media.githubusercontent.com/media/huggingface/lighteval/refs/heads/main/tests/reference_details/Qwen2.5-VL-3B-Instruct-vlm/details_mmmu_pro%3Astandard-4%7C0_2025-11-05T15-23-34.026089.parquet [following]
--2025-11-13 18:15:55-- https://media.githubusercontent.com/media/huggingface/lighteval/refs/heads/main/tests/reference_details/Qwen2.5-VL-3B-Instruct-vlm/details_mmmu_pro%3Astandard-4%7C0_2025-11-05T15-23-34.026089.parquet
Resolving media.githubusercontent.com (media.githubusercontent.com)... 185.199.111.133, 185.199.110.133, 185.199.108.133, ...
Connecting to media.githubusercontent.com (media.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 11538690 (11M) [application/octet-stream]
details_mmmu_pro:standard-4|0_2025-11-05T15-23-34.026089.parquet: Invalid argument
Cannot write to ‘details_mmmu_pro:standard-4|0_2025-11-05T15-23-34.026089.parquet’ (Success).
It is ! I was just trying to find out why it does not work for you
I will open a PR to rename those and tag you to try it out :)
thanks! yeah, in my setup I'm using NFS for storage, so special characters are not allowed.
I tried running some evals, and caching fails because the task name is formatted as {task_name}|{num_fewshots} in lighteval_task.py and registry.py. Changing it to {task_name}-{num_fewshots} seems to solve the problem. Is there other logic somewhere that splits on | as a delimiter?
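One alternative to changing the task identifier itself would be to sanitize it only at the point where a file or directory name is built. The sketch below is an assumption about how that could look, not lighteval's actual code: `cache_safe_name` is a hypothetical helper, and the replacement character "-" simply mirrors the workaround described above.

```python
import re

# Assumption: characters to strip from names that end up on disk.
# Only "|" and ":" are confirmed as problematic in this thread.
_UNSAFE = re.compile(r'[<>:"|?*]')


def cache_safe_name(task_name: str, num_fewshots: int) -> str:
    """Build a filesystem-friendly name from a task name and few-shot count."""
    # Keep the original "{task_name}|{num_fewshots}" form as the logical id,
    # and replace unsafe characters only for the on-disk name.
    full_name = f"{task_name}|{num_fewshots}"
    return _UNSAFE.sub("-", full_name)


if __name__ == "__main__":
    print(cache_safe_name("arc:challenge", 25))  # arc-challenge-25
```

Note that this mapping is lossy: once ":" and "|" both become "-", the task name and few-shot count can no longer be recovered by splitting the file name, so any code that does split on "|" would need to keep working from the original identifier.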
hey! Can you try this and see if it works for you? :) #1062
was able to git checkout that branch. thanks!