eschaffn

Results 28 comments of eschaffn

Maybe this comes from re-running the representation model after it merges?

Function used for merging:

```python
def merge_topics(distance_threshold, hierarchy_var, topic_model, data):
    topics_to_merge = []
    for merge_candidate in hierarchy_var.iterrows():
        distance = merge_candidate[1][-1]
        if distance
```
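The listing cuts the snippet off mid-condition. A complete sketch of the same idea, assuming `hierarchy_var` is the DataFrame returned by `topic_model.hierarchical_topics(data)` (with `Child_Left_ID`/`Child_Right_ID` columns and the linkage distance in the last column) and that the merge itself goes through BERTopic's `merge_topics()`:

```python
def merge_topics(distance_threshold, hierarchy_var, topic_model, data):
    """Merge every topic pair whose linkage distance falls below the threshold."""
    topics_to_merge = []
    for _, merge_candidate in hierarchy_var.iterrows():
        distance = merge_candidate.iloc[-1]  # linkage distance is the last column
        if distance < distance_threshold:
            # the two topics joined at this node of the hierarchy
            topics_to_merge.append(
                [int(merge_candidate["Child_Left_ID"]), int(merge_candidate["Child_Right_ID"])]
            )
    if topics_to_merge:
        topic_model.merge_topics(data, topics_to_merge)
    return topic_model
```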

> Hmmm, that might be the `LangChain` backend that you are using but I'm not sure. Do you also get this progress bar when you run `.fit`? Also, I'm not...

I found it. It's `.hierarchical_topics()`. Line 1003 in [_bertopic.py](bertopic/_bertopic.py).
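For reference, a minimal way to reproduce the extra progress bar, assuming `docs` is the corpus the model was fitted on:

```python
from bertopic import BERTopic

topic_model = BERTopic(verbose=False)
topics, probs = topic_model.fit_transform(docs)

# The second progress bar appears during this call, not during .fit()
hierarchical_topics = topic_model.hierarchical_topics(docs)
```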

I'm also having a hard time reproducing models that use Facebook's LLaMA as the base model. I'm using a slightly modified version of the [Qlora](https://github.com/artidoro/qlora) code and the dataset [here](https://huggingface.co/datasets/ehartford/WizardLM_alpaca_evol_instruct_70k_unfiltered)....
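Roughly, the 4-bit base-model load looks like the following; the checkpoint name and quantization settings here are my assumptions about a typical QLoRA setup, not the repo's exact config:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Sketch of a QLoRA-style 4-bit load (assumed settings, not verbatim from my run)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-13b",
    quantization_config=bnb_config,
    device_map="auto",
)
```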

Sorry, I don't have evaluation or loss curves to share, but in general the OpenLLaMA models I train have repetition problems, repeating the same or a very similar sequence until they reach the...

> @eschaffn
>
> ```
> tokenizer = LlamaTokenizer.from_pretrained(
>     "openlm-research/open_llama_7b",
>     add_eos_token=True,
>     add_bos_token=False,
>     use_fast=False
> )
> ```
>
> The `add_bos_token=False` was actually an accident. I'd...
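A quick sanity check for what those flags actually do, in case it helps; the prompt is arbitrary, and the flag values below are just the configuration I'd expect after fixing the accidental `add_bos_token=False`, not a confirmed fix:

```python
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained(
    "openlm-research/open_llama_7b",
    add_bos_token=True,   # prepend <s>; set False to reproduce the accidental config
    add_eos_token=True,   # append </s>
)

ids = tokenizer("Hello world").input_ids
print(ids)
print(tokenizer.convert_ids_to_tokens(ids))  # check for <s> at the start and </s> at the end
```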

I've re-run that run with eval_steps = 20: [trainer_state.txt](https://github.com/openlm-research/open_llama/files/11962014/trainer_state.txt). I'm currently running with https://huggingface.co/huggyllama/llama-13b since I ran into the same CUDA errors you mentioned, and this should be done...
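For context, the evaluation cadence of that re-run looks roughly like this; only `eval_steps=20` is the real setting from the run, the other arguments are placeholders:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./output",
    evaluation_strategy="steps",
    eval_steps=20,                   # the setting changed for this re-run
    logging_steps=20,
    per_device_train_batch_size=1,   # placeholder values below
    gradient_accumulation_steps=16,
)
```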

[trainer_state.txt](https://github.com/openlm-research/open_llama/files/11968411/trainer_state.txt) Using [FB Llama](https://huggingface.co/huggyllama/llama-13b) I see the same overfitting issues. The default LoRA r value is 64 for QLoRA; I've been running with r=32, but maybe this is causing the overfitting. The...
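As a sketch of the adapter config in question (everything except `r` is illustrative, not the exact values from my run):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,                      # QLoRA's default is 64; I've been training with 32
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
```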