How to get more than 3 representative docs per topic via get_topic_info()?
Hey, thank you so much for making this library! Super awesome.
I've seen several issues here requesting this, but I haven't found a straightforward way to specify it via get_topic_info(), which already contains a lot of the information I need. I wish there were a parameter such as get_topic_info(number_of_representative_documents=3) that I could modify.
I'm not sure that _extract_representative_docs will work in my context, as I'm using UMAP, HDBSCAN, and GPT for topic labels, with no TF-IDF or anything similar, and that seems to be a required parameter.
The documents themselves are not saved within BERTopic, in part to reduce memory requirements, so it would not be possible to run something like .get_topic_info(number_of_representative_documents=3). You are always using c-TF-IDF since it is part of the default pipeline, so ._extract_representative_docs should work.
@MaartenGr I couldn't find an example in the BERTopic documentation. Could you please provide one?
I think there is a nice example here. It might be nice to have an additional function that re-calculates the representative documents since this question seems to appear frequently.
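To make the idea of re-calculating representative documents a bit more concrete, here is a minimal, library-independent sketch. It is not BERTopic's internal code and does not call _extract_representative_docs: the helpers tokenize, cosine, and representative_docs are hypothetical names, and the weighting is only a simplified stand-in for c-TF-IDF. Because the documents are not stored in the model, the sketch assumes you pass them in again together with the per-document topic assignments (for example, the topics returned by fit_transform).

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Deliberately naive tokenizer (assumption): lowercase, split on whitespace.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def representative_docs(docs, topics, nr_repr_docs=3):
    """Return up to nr_repr_docs documents per topic, most similar first.

    Hypothetical helper: a simplified stand-in for c-TF-IDF based selection,
    not the library's implementation.
    """
    # Term counts per topic, treating all documents of a topic as one class.
    topic_counts = defaultdict(Counter)
    for doc, topic in zip(docs, topics):
        topic_counts[topic].update(tokenize(doc))

    # Down-weight terms that occur in many topics (simplified class-level IDF).
    n_topics = len(topic_counts)
    topic_freq = Counter(term for counts in topic_counts.values() for term in counts)
    idf = {term: math.log(1 + n_topics / df) for term, df in topic_freq.items()}

    topic_vectors = {
        topic: {term: tf * idf[term] for term, tf in counts.items()}
        for topic, counts in topic_counts.items()
    }

    # Score every document against the vector of the topic it was assigned to.
    scored = defaultdict(list)
    for doc, topic in zip(docs, topics):
        doc_counts = Counter(tokenize(doc))
        doc_vector = {term: tf * idf[term] for term, tf in doc_counts.items()}
        scored[topic].append((cosine(doc_vector, topic_vectors[topic]), doc))

    # Keep the top-scoring documents per topic.
    return {
        topic: [
            doc
            for _, doc in sorted(pairs, key=lambda p: p[0], reverse=True)[:nr_repr_docs]
        ]
        for topic, pairs in scored.items()
    }


if __name__ == "__main__":
    example_docs = ["cats purr", "cats sleep all day", "dogs bark", "dogs fetch sticks"]
    example_topics = [0, 0, 1, 1]
    for topic, reps in representative_docs(example_docs, example_topics, nr_repr_docs=2).items():
        print(topic, reps)
```

The number of documents per topic is just the nr_repr_docs argument here, so asking for more than 3 is a matter of passing a larger value. Topics with fewer documents than requested simply return all of their documents.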