There are Chinese characters in my project, but after calling the visualize_document_datamap() method, the characters appear as garbled text.
Have you searched existing issues? 🔎
- [X] I have searched and found no existing issues
Describe the bug
fig = topic_model.visualize_document_datamap(
    sentences,
    topics=topics,
    reduced_embeddings=reduced_embeddings,
    # custom_labels=custom_labels,
    title='文档和主题的分布',
    sub_title='基于 BERTopic 的主题建模',
    width=1200,
    height=1200,
)

Even after setting plt.rcParams['font.sans-serif'] = ['SimHei'], I still can't see the characters.
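One thing worth ruling out first is whether matplotlib can see a CJK-capable font at all, since setting rcParams to a family that is not registered has no visible effect. The sketch below is only a diagnostic: it lists which of a few candidate families matplotlib has registered. The helper name `available_fonts` and the candidate list are illustrative assumptions, not part of BERTopic or DataMapPlot.

```python
from matplotlib import font_manager


def available_fonts(candidates):
    """Return the candidate font family names that matplotlib has registered."""
    # fontManager.ttflist holds one entry per font file matplotlib discovered.
    installed = {font.name for font in font_manager.fontManager.ttflist}
    return [name for name in candidates if name in installed]


if __name__ == "__main__":
    # Illustrative list of families that cover Chinese glyphs; not exhaustive.
    cjk_candidates = ["SimHei", "Noto Sans CJK SC", "Microsoft YaHei", "PingFang SC"]
    print(available_fonts(cjk_candidates))
```

An empty list would suggest the problem is font installation or matplotlib's font cache rather than the plotting call. A non-empty list would suggest the plotting library is choosing its own font and not reading rcParams.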
Reproduction
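from bertopic import BERTopic
from umap import UMAP

# Assumes `sentences` (list of str), `embeddings` (document embeddings),
# `topics`, and a fitted `topic_model` already exist; the report does not show them.

# Reduce the embeddings to 2D for plotting
reduced_embeddings = UMAP(n_neighbors=15, n_components=2, min_dist=0.0, metric='cosine').fit_transform(embeddings)

fig = topic_model.visualize_document_datamap(
    sentences,
    topics=topics,
    reduced_embeddings=reduced_embeddings,
    # custom_labels=custom_labels,
    title='文档和主题的分布',  # "Distribution of documents and topics"
    sub_title='基于 BERTopic 的主题建模',  # "Topic modeling based on BERTopic"
    width=1200,
    height=1200,
)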
BERTopic Version
0.16.4
Hmmm, I'm not entirely sure what is needed here. Have you tried posting an issue on the DataMapPlot repository? I think there isn't much to do from my end since I'm just calling that package and passing the data.
Can font display parameters be set through the visualize_document_datamap() method?
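A possible way to try this, sketched under two assumptions that should be checked against the installed versions: that visualize_document_datamap() forwards extra keyword arguments to datamapplot.create_plot, and that create_plot accepts a `font_family` argument. The wrapper name `datamap_with_font` and the default font are hypothetical and only used for illustration.

```python
def datamap_with_font(topic_model, docs, topics, reduced_embeddings, font_family="SimHei"):
    """Call visualize_document_datamap and pass a font family along with it."""
    return topic_model.visualize_document_datamap(
        docs,
        topics=topics,
        reduced_embeddings=reduced_embeddings,
        title="文档和主题的分布",
        sub_title="基于 BERTopic 的主题建模",
        width=1200,
        height=1200,
        # Assumption: unknown keyword arguments are handed to
        # datamapplot.create_plot, which is assumed to take `font_family`.
        font_family=font_family,
    )
```

Usage would look like `fig = datamap_with_font(topic_model, sentences, topics, reduced_embeddings, font_family="SimHei")`. The chosen family still has to be installed and cover Chinese glyphs for the text to render.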
For future people, please see my reply here: https://github.com/TutteInstitute/datamapplot/issues/50