eschaffn

Results 28 comments of eschaffn

Maybe this comes from re-running the representation model after it merges?

Function used for merging:

```python
def merge_topics(distance_threshold, hierarchy_var, topic_model, data):
    topics_to_merge = []
    for merge_candidate in hierarchy_var.iterrows():
        distance = merge_candidate[1][-1]
        if distance
```
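The listing cuts the snippet off mid-condition. A complete sketch of the same idea, assuming `hierarchy_var` is the DataFrame returned by `topic_model.hierarchical_topics(data)` (with `Child_Left_ID`/`Child_Right_ID` columns and the linkage distance in the last column) and that the merge itself goes through BERTopic's `merge_topics()`:

```python
def merge_topics(distance_threshold, hierarchy_var, topic_model, data):
    """Merge every topic pair whose linkage distance falls below the threshold."""
    topics_to_merge = []
    for _, merge_candidate in hierarchy_var.iterrows():
        distance = merge_candidate.iloc[-1]  # linkage distance is the last column
        if distance < distance_threshold:
            # the two topics joined at this node of the hierarchy
            topics_to_merge.append(
                [int(merge_candidate["Child_Left_ID"]), int(merge_candidate["Child_Right_ID"])]
            )
    if topics_to_merge:
        topic_model.merge_topics(data, topics_to_merge)
    return topic_model
```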

> Hmmm, that might be the `LangChain` backend that you are using but I'm not sure. Do you also get this progress bar when you run `.fit`? Also, I'm not...

I found it. It's `.hierarchical_topics()`. Line 1003 in [_bertopic.py](bertopic/_bertopic.py).
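For reference, a minimal way to reproduce the extra progress bar, assuming `docs` is the corpus the model was fitted on:

```python
from bertopic import BERTopic

topic_model = BERTopic(verbose=False)
topics, probs = topic_model.fit_transform(docs)

# The second progress bar appears during this call, not during .fit()
hierarchical_topics = topic_model.hierarchical_topics(docs)
```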

I'm also having a hard time reproducing models that use Facebook's LLaMA as the base model. I'm using a slightly modified version of the [Qlora](https://github.com/artidoro/qlora) code and the dataset [here](https://huggingface.co/datasets/ehartford/WizardLM_alpaca_evol_instruct_70k_unfiltered)....
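Roughly, the 4-bit base-model load looks like the following; the checkpoint name and quantization settings here are my assumptions about a typical QLoRA setup, not the repo's exact config:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Sketch of a QLoRA-style 4-bit load (assumed settings, not verbatim from my run)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-13b",
    quantization_config=bnb_config,
    device_map="auto",
)
```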

Sorry, I don't have evaluation or loss curves to share, but in general the OpenLLaMA models I train have repetition problems, repeating the same or a very similar sequence until they reach the...

> @eschaffn
>
> ```
> tokenizer = LlamaTokenizer.from_pretrained(
>     "openlm-research/open_llama_7b",
>     add_eos_token=True,
>     add_bos_token=False,
>     use_fast=False
> )
> ```
>
> The `add_bos_token=False` was actually an accident. I'd...
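A quick sanity check for what those flags actually do, in case it helps; the prompt is arbitrary, and the flag values below are just the configuration I'd expect after fixing the accidental `add_bos_token=False`, not a confirmed fix:

```python
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained(
    "openlm-research/open_llama_7b",
    add_bos_token=True,   # prepend <s>; set False to reproduce the accidental config
    add_eos_token=True,   # append </s>
)

ids = tokenizer("Hello world").input_ids
print(ids)
print(tokenizer.convert_ids_to_tokens(ids))  # check for <s> at the start and </s> at the end
```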

I've re-run that run with eval_steps = 20: [trainer_state.txt](https://github.com/openlm-research/open_llama/files/11962014/trainer_state.txt). I'm currently running with https://huggingface.co/huggyllama/llama-13b since I ran into the same CUDA errors you mentioned, and this should be done...
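For context, the evaluation cadence of that re-run looks roughly like this; only `eval_steps=20` is the real setting from the run, the other arguments are placeholders:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./output",
    evaluation_strategy="steps",
    eval_steps=20,                   # the setting changed for this re-run
    logging_steps=20,
    per_device_train_batch_size=1,   # placeholder values below
    gradient_accumulation_steps=16,
)
```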

[trainer_state.txt](https://github.com/openlm-research/open_llama/files/11968411/trainer_state.txt) Using [FB Llama](https://huggingface.co/huggyllama/llama-13b) I see the same overfitting issues. The default LoRA r value is 64 for QLoRA; I've been running with r=32, but maybe this is causing the overfitting. The...
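As a sketch of the adapter config in question (everything except `r` is illustrative, not the exact values from my run):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,                      # QLoRA's default is 64; I've been training with 32
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
```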