aegisgpt

3 comments by aegisgpt

> The paper says it only needs 350 GB of VRAM to train the 175B GPT-3 with rank = 4. Can you elaborate more on how this is done? Like, do you use...

> @aegisgpt
> > > having no v_proj and q_proj in the base model
> >
> > Per https://huggingface.co/smangrul/twitter_complaints_bigscience_bloomz-7b1_LORA_CAUSAL_LM/blob/main/adapter_config.json, the target module needs to be changed to `query_key_value` for BLOOM models. Let me know...
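
For context, here is a minimal sketch of what that change looks like in a PEFT `LoraConfig` for a BLOOM model. BLOOM fuses the query/key/value projections into a single `query_key_value` linear layer, so LoRA has to target that module name instead of the `q_proj`/`v_proj` names used by LLaMA-style models; the checkpoint name and hyperparameters below are illustrative assumptions, not values from the quoted thread.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType

# Illustrative BLOOM checkpoint; any BLOOM model works the same way.
model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-7b1")

# BLOOM has no q_proj/v_proj modules: the attention projections are fused
# into a single `query_key_value` linear layer, so that is what LoRA targets.
config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=4,                      # low rank, as in the quoted question
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query_key_value"],
)

model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the LoRA adapter weights are trainable
```
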

> This may be useful: https://github.com/huggingface/peft/blob/main/src/peft/mapping.py
>
> Thank you! That helps!
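
The file referenced there is part of the machinery PEFT uses to pick sensible per-architecture defaults, including the default LoRA target modules, so you can often omit `target_modules` entirely. Below is a small sketch of inspecting that default mapping; the exact import path of the constant is an assumption and has moved between PEFT versions.

```python
# A minimal sketch: look up PEFT's built-in default LoRA target modules for a
# given transformers model type. The import path is an assumption; the constant
# has lived in different modules under peft/utils across PEFT versions.
from peft.utils import TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING

# For BLOOM this should resolve to the fused attention projection.
print(TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING["bloom"])
# expected: ['query_key_value']
```
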