LLMeBench
Extended experiment configuration backup
As part of the output of a benchmarking experiment, we should write the full configuration used to a file. This is useful for versioning experiments (for reproducibility). Example configuration details to record:
- Running time and date of the experiment
- Train and test dataset file names, and the name of the dataset script used to load them
- Task script name
- Model
- Base Prompt used
- Learning setup (0-shot, few-shot, etc.) and details about the setup (e.g., the method used to select few-shot examples)
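
As a rough illustration of what such a backup could look like, here is a minimal Python sketch. It is not LLMeBench's actual implementation: the `ExperimentConfig` class, its field names, the `save_config` helper, and the output file names are all hypothetical. It writes the record both as a pickle and as JSON; the JSON copy is an addition here, included because it is human-readable and easy to diff between runs.

```python
import json
import pickle
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


@dataclass
class ExperimentConfig:
    # Hypothetical field names mirroring the items listed above.
    dataset_script: str
    train_file: str
    test_file: str
    task_script: str
    model: str
    base_prompt: str
    learning_setup: str                        # e.g. "0-shot" or "few-shot"
    few_shot_selection: Optional[str] = None   # e.g. "random"
    # Running time and date of the experiment, recorded in UTC.
    run_timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def save_config(config: ExperimentConfig, out_dir):
    """Write the configuration to out_dir as config.pkl and config.json."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    record = asdict(config)

    # Pickle a plain dict rather than the dataclass instance, so the file
    # can be loaded later without importing this class.
    pickle_path = out_dir / "config.pkl"
    with pickle_path.open("wb") as f:
        pickle.dump(record, f)

    json_path = out_dir / "config.json"
    json_path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return pickle_path, json_path
```

Calling `save_config(config, "results/my_run")` (a placeholder path) at the end of a run would leave both files next to the experiment's other outputs.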
"Might save all of this as a pickled object"