STREAM issues

Import time and huggingface hub

1

Importing anything from stream_topic takes quite some time, presumably because of huggingface_hub and some downloads that run in the background. I am not sure whether this really is...
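To check whether the import itself is the slow part, the import can be timed directly. The snippet below is a minimal sketch, not part of stream_topic: `time_import` is a hypothetical helper name, and it only measures wall-clock time for a single import in the current process.

```python
import importlib
import time


def time_import(module_name: str) -> float:
    """Return the wall-clock seconds spent importing module_name.

    The result is close to zero if the module is already cached in
    sys.modules, so run this in a fresh interpreter for a real measurement.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


# Example (assumes stream_topic is installed; run in a fresh interpreter):
#   print(f"stream_topic: {time_import('stream_topic'):.2f} s")
```

For a per-module breakdown, CPython's `python -X importtime -c "import stream_topic"` prints the cumulative import time of every module that gets loaded, which should show whether huggingface_hub accounts for most of the delay.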

AnFreTh

Include real word embeddings for TNTM, ETM

- move preprocess word embeddings to dataset class
- make word embeddings attribute of dataset class
- decouple from datamodule logic
- Should be identical to sentence embeddings logic
- ...

AnFreTh

ReadMe

- Adapt Readme to new package structure
- To do just before first release after development is finished

AnFreTh

documentation

Additional Preprocessing

- Check current preprocessing implementation
- Include further preprocessing steps if necessary

AnFreTh

help wanted

new languages

- make preprocessing possible for other languages -> see langdetect

AnFreTh

enhancement

Make models accept Chinese datasets

1. Added additional Chinese corpus to the CEDC model
2. Made minor modifications to other models to adapt to Chinese datasets

williamlhy

Issues in TNTM model debugging

2

**When I tried to use the TNTM model, I got the following error.**

Code:

```
from stream_topic.models import TNTM
from stream_topic.utils import TMDataset

dataset = TMDataset()
dataset.fetch_dataset("BBC_News")
dataset.preprocess(model_type="TNTM")

model = TNTM()
...
```

williamlhy

arabic1

ZakAlreich

arabic

ZakAlreich

Issues in DCTE model training

**When I tried to use the DCTE model (here I used a local model):**

```
from stream_topic.models import DCTE
from stream_topic.utils import TMDataset

dataset = TMDataset()
dataset.fetch_dataset(
    name="BBC_News",
    dataset_path="/hongyi/STREAM/stream_topic/stream_topic_data/preprocessed_datasets/BBC_News",
    source="local",
)
dataset.preprocess(model_type="DCTE")
...
```

williamlhy
