Error in `_init_rope` of KblamLlamaAttention
It seems that the configuration in 'meta-llama/Llama-3.2-1B-Instruct/resolve/main/config.json' has changed since the code was last used.
Running the training on the enron dataset gives:
File "/home/fokus/Thomas/KBLaM/src/kblam/models/llama3_model.py", line 118, in __init__ self._init_rope() ~~~~~~~~~~~~~~~^^ File "/home/fokus/Thomas/KBLaM/src/kblam/models/llama3_model.py", line 128, in _init_rope scaling_type = self.config.rope_scaling["type"] ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^ KeyError: 'type'
Printing out `self.config.rope_scaling` gives:
```python
{'factor': 32.0, 'high_freq_factor': 4.0, 'low_freq_factor': 1.0, 'original_max_position_embeddings': 8192, 'rope_type': 'llama3'}
```
I assume this `rope_scaling` dict is fetched from https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct/resolve/main/config.json.
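The dict can be inspected independently of the training code; a minimal sketch, assuming network access to the Hub and, since the model is gated, a logged-in `huggingface_hub` token:

```python
from transformers import AutoConfig

# Load only the configuration, not the model weights.
cfg = AutoConfig.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")

# On recent transformers versions (>= 4.43) this prints a dict keyed
# by "rope_type"; there is no "type" key, matching the KeyError above.
print(cfg.rope_scaling)
```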
Changing `self.config.rope_scaling["type"]` to `self.config.rope_scaling["rope_type"]` now gives:
File "/home/fokus/Thomas/KBLaM/src/kblam/models/llama3_model.py", line 146, in _init_rope raise ValueError(f"Unknown RoPE scaling type {scaling_type}") ValueError: Unknown RoPE scaling type llama3
since only the values 'linear' and 'dynamic' are handled in `_init_rope()`.
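For reference, a sketch of what a version-tolerant `_init_rope` could look like, assuming transformers >= 4.43, where `LlamaRotaryEmbedding` accepts the model config and dispatches on the rope type internally; `resolve_scaling_type` is a hypothetical helper, not existing KBLaM code:

```python
from typing import Optional

from transformers.models.llama.modeling_llama import LlamaRotaryEmbedding


def resolve_scaling_type(rope_scaling: Optional[dict]) -> Optional[str]:
    """Read the scaling type under either config layout."""
    if rope_scaling is None:
        return None
    # transformers >= 4.43 writes "rope_type"; older versions wrote "type".
    return rope_scaling.get("rope_type", rope_scaling.get("type"))


def _init_rope(self):
    scaling_type = resolve_scaling_type(self.config.rope_scaling)
    if scaling_type in (None, "linear", "dynamic", "llama3"):
        # LlamaRotaryEmbedding reads rope_scaling from the config and
        # dispatches on the rope type itself, so these cases all
        # collapse into a single construction.
        self.rotary_emb = LlamaRotaryEmbedding(config=self.config)
    else:
        raise ValueError(f"Unknown RoPE scaling type {scaling_type}")
```

Dispatching through the config would also keep the KBLaM attention class insulated from future renames of the scaling keys.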
I believe this is solved by sticking with transformers 4.46.0 in #40. Could you let me know if that fixes things?
Well, I am not sure. I was experimenting with training on enron with the Llama model first, unaware that the key-value embeddings should be generated beforehand, so I think it is not related to the `transformers==4.46.0` issue. In any case, the mismatch between the dictionary key `"type"` used in the code and the key `"rope_type"` actually present in the config still remains ...
I think the problem is merely hidden if the key-value embeddings are generated first.