`seed_topic_list` throwing error related to inhomogeneous shape after 1 dimensions

Open Adi-ds opened this issue 2 years ago • 2 comments

I am using the following code with `seed_topic_list`:

from bertopic import BERTopic
from sentence_transformers import SentenceTransformer

embedding_model = 'all-mpnet-base-v2'
word_list = ["Bullion",'market','price','commodity',"precious", "metal",'gilt','carat','aurum','world', 'gold', 'council','mine','mining','bitcoin','forecast','bank','liquidity','ingot','stocks','delivery','settlement','ETF']
word_lists = [[word.lower()] for word in word_list]

model = BERTopic(
    verbose=True,
    min_topic_size=5,
    language="english",
    seed_topic_list=word_list,
    embedding_model=SentenceTransformer(embedding_model),
)
topics, probs = model.fit_transform(df['news_article'])

I am getting the following error!

[Image: Screenshot from 2024-02-16 17-19-37]

Can somebody tell me what the correct way to use `seed_topic_list` is?

Adi-ds, Feb 16 '24 11:02
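On the question of the expected format: BERTopic documents `seed_topic_list` as a list of lists of strings, with one inner list per seed topic. The snippet in the question builds `word_lists` but passes the flat `word_list` to the model, which may or may not be connected to the error shown. Below is a minimal sketch of a normaliser. `as_seed_topic_list` is a hypothetical helper, not part of BERTopic, and treating a flat list as one single topic is an assumption made only for illustration.

```python
from typing import List, Sequence, Union


def as_seed_topic_list(seeds: Sequence[Union[str, Sequence[str]]]) -> List[List[str]]:
    """Hypothetical helper: return seeds as a list of lists of lowercase words.

    Assumption: a flat list of words is treated as ONE seed topic.
    """
    seeds = list(seeds)
    if not seeds:
        return []
    if all(isinstance(item, str) for item in seeds):
        # Flat list of words: wrap it so it becomes a single seed topic.
        return [[word.lower() for word in seeds]]
    topics = []
    for topic in seeds:
        # Reject a mix of bare strings and lists, or non-string entries.
        if isinstance(topic, str) or not all(isinstance(w, str) for w in topic):
            raise TypeError("each seed topic must be a list of strings")
        topics.append([word.lower() for word in topic])
    return topics


word_list = ["Bullion", "market", "price", "ETF"]
seed_topic_list = as_seed_topic_list(word_list)
print(seed_topic_list)
```

The result can then be passed as `seed_topic_list=seed_topic_list` when constructing `BERTopic`. Grouping related words into a few inner lists (for example, one for mining terms and one for market terms) is usually closer to the intent of guided topic modeling than one word per topic.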

I believe this is a known issue. Could you check the existing issues for a solution? I believe it had something to do with specific versions of numpy.

MaartenGr, Feb 18 '24 18:02

@Adi-ds I downgraded my version of numpy to 1.22.4 and it worked for me. This link might help you.

cayaluke, Mar 23 '24 01:03
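For anyone checking whether the numpy version is the relevant factor: starting with NumPy 1.24, building an array from ragged nested sequences raises a `ValueError` mentioning an "inhomogeneous shape", where earlier releases only emitted a deprecation warning. The sketch below uses only the standard library to report which side of that boundary the installed version is on. `ragged_arrays_raise` is a hypothetical helper name, and the check says nothing about which BERTopic versions are affected.

```python
import re
from importlib import metadata


def ragged_arrays_raise(numpy_version: str) -> bool:
    """Hypothetical helper: True if this NumPy version is 1.24 or newer."""
    match = re.match(r"(\d+)\.(\d+)", numpy_version)
    if match is None:
        raise ValueError(f"unrecognised version string: {numpy_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (1, 24)


try:
    installed = metadata.version("numpy")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("numpy is not installed in this environment")
elif ragged_arrays_raise(installed):
    print(f"numpy {installed}: ragged array creation raises ValueError")
else:
    print(f"numpy {installed}: ragged array creation only warns")
```

If the installed version is 1.24 or newer, pinning an older release such as the 1.22.4 mentioned above is one workaround.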