STREAM
ACL Python package engineered for seamless topic modeling, topic evaluation, and topic visualization. Ideal for text analysis, natural language processing (NLP), and research in the social sciences, S...
Importing anything from stream_topic takes quite some time. I guess due to huggingface_hub and some downloads that are done in the background? I am not sure whether this really is...
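One way to check where the import time goes is to measure it directly. The helper below is a minimal sketch, not part of `stream_topic`: `time_import` is a hypothetical name, and `json` stands in for `stream_topic` so the snippet runs anywhere. It only removes the target package from `sys.modules`, so dependencies that are already loaded (for example `huggingface_hub`) stay cached; run it in a fresh interpreter for a cold-start figure.

```python
import importlib
import sys
import time


def time_import(module_name: str) -> float:
    """Return the wall-clock seconds taken to import `module_name` once."""
    # Drop cached copies of the package and its submodules so the import
    # is executed again instead of being served from sys.modules.
    for name in list(sys.modules):
        if name == module_name or name.startswith(module_name + "."):
            del sys.modules[name]

    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # "json" is a stand-in; use "stream_topic" where the package is installed.
    print(f"import took {time_import('json'):.3f} s")
```

For a per-module breakdown, CPython's built-in flag `python -X importtime -c "import stream_topic"` prints the cumulative time of every nested import to stderr, which should show whether `huggingface_hub` or something else dominates.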
- move preprocess word embeddings to dataset class
- make word embeddings attribute of dataset class
- decouple from datamodule logic
- Should be identical to sentence embeddings logic
- ...
- Adapt Readme to new package structure
- To do just before first release after development is finished
- Check current preprocessing implementation
- Include further preprocessing steps if necessary
1. Added additional Chinese corpus to the CEDC model
2. Made minor modifications to other models to adapt to Chinese datasets
**When I tried to use the TNTM model, I got the following error.**

Code:
```
from stream_topic.models import TNTM
from stream_topic.utils import TMDataset

dataset = TMDataset()
dataset.fetch_dataset("BBC_News")
dataset.preprocess(model_type="TNTM")
model = TNTM()
...
```
**When I tried to use the DCTE model (here I used a local model):**
```
from stream_topic.models import DCTE
from stream_topic.utils import TMDataset

dataset = TMDataset()
dataset.fetch_dataset(
    name="BBC_News",
    dataset_path="/hongyi/STREAM/stream_topic/stream_topic_data/preprocessed_datasets/BBC_News",
    source="local",
)
dataset.preprocess(model_type="DCTE")
...
```