LLMeBench
Benchmarking Large Language Models
Added Zero-Shot assets for dialect identification on the SHAMI corpus
Add tests for the new model VLLM.
- We need to upload the datasets for download. They are currently on the main server at 'data_for_download'.
- The datasets come with four different task definitions (task 1 with...
File: llmebench/datasets/OSACT4SubtaskB.py, Line: 44. If we take the first two as text and label, the labels are ["HS", "NOT_HS", "OFF", "NOT_OFF"]. The "OFF" and "NOT_OFF" labels are for subtaskA. The...