Question about eval parameters.
- What is the difference between --llm_base_dir and --model_dir?
https://github.com/microsoft/KBLaM/blob/65262680386c5e0218928ed6dd7fcf4736fc8043/experiments/eval.py#L337C6-L337C20
https://github.com/microsoft/KBLaM/blob/65262680386c5e0218928ed6dd7fcf4736fc8043/experiments/eval.py#L349C5-L349C48
Is llm_base_dir the directory of the original pre-trained model, and model_dir the directory of the saved training output, like ...\stage1_lr_0.0001_KBTokenLayerFreq3_UseOutlier1_SepQueryHead_UseDataAug__KeyFromkey_all-MiniLM-L6-v2_enron_llama3_step_16000?
- Where does --query_head_path come from, and how do I obtain it? (My current invocation attempt is sketched below.)
https://github.com/microsoft/KBLaM/blob/65262680386c5e0218928ed6dd7fcf4736fc8043/experiments/eval.py#L365
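For context, this is roughly the invocation I have in mind. It is only a sketch based on my guesses above, not a documented command; all paths are placeholders and eval.py may require further arguments.

```python
# Sketch only: wiring the three flags from the question into an eval.py call.
# The comments state my assumptions about each flag, which is exactly what I am
# asking to confirm.
import subprocess

subprocess.run(
    [
        "python", "experiments/eval.py",
        "--llm_base_dir", "/path/to/original/llama3",      # presumably the pre-trained base LLM
        "--model_dir", "/path/to/stage1_..._step_16000",   # presumably the KBLaM training output
        "--query_head_path", "/path/to/query_head.pth",    # presumably the saved query-head weights
    ],
    check=True,
)
```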
Hello, I am also reproducing this work. Have you run into the situation where the KV weights are not loaded during eval, so the inference results are the same as zero-shot? @shiwanghua
@Chloe-mxxxxc If the KV weights are not loaded, then the model is exactly the pre-trained model.
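One way to confirm whether anything was actually loaded is to compare the eval model's parameters against the base checkpoint. This is only a debugging sketch under the assumption that both checkpoints load as the same architecture via transformers; adapt it to however eval.py actually constructs the model.

```python
# Debugging sketch (assumption, not repo code): count how many parameter tensors
# differ between the base LLM and the checkpoint used for eval. Zero differences
# means the fine-tuned / KV weights were never applied and eval is effectively zero-shot.
import torch
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("/path/to/llm_base_dir")   # placeholder path
tuned = AutoModelForCausalLM.from_pretrained("/path/to/model_dir")     # placeholder path

changed = [
    name
    for (name, p_base), (_, p_tuned) in zip(base.named_parameters(), tuned.named_parameters())
    if not torch.equal(p_base, p_tuned)
]
print(f"{len(changed)} parameter tensors differ from the base model")
```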
So how should the parameters of eval be passed? And how do I generate the query_head file (the .pth file)? (A sketch of what I assume the save/load looks like is below.)
https://github.com/microsoft/KBLaM/blob/65262680386c5e0218928ed6dd7fcf4736fc8043/experiments/Makefile#L33
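I haven't confirmed where the repo writes this file, but separately saved head weights are normally just a torch state dict, so the round trip would look something like this (module names and file paths are illustrative, not the repo's actual identifiers):

```python
# Illustrative sketch only: how a separately trained query-head module is typically
# saved to a .pth file during training and read back at eval time.
import torch

# At the end of training (inside whatever script trains the query head):
# torch.save(query_head_module.state_dict(), "query_head.pth")

# At eval time, --query_head_path would point at that file:
state_dict = torch.load("query_head.pth", map_location="cpu")
print(list(state_dict.keys())[:5])  # inspect which tensors the file actually contains
# query_head_module.load_state_dict(state_dict)  # then load it into the query-head module
```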