ETM
Topic Modeling in Embedding Spaces
Hi, I cannot understand the expression `recon_loss = -(preds * bows).sum(1)` in the `forward()` function of etm.py. Could you help explain it? The loss function seems to be different from the...
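A minimal sketch of one way to read that expression, assuming `preds` holds per-document log-probabilities over the vocabulary and `bows` holds word counts (neither is stated in the question). Under that assumption the expression is the negative log-likelihood of each document's bag of words. The helper below and the toy data are hypothetical and use plain lists instead of tensors.

```python
import math

def recon_loss(log_probs, bows):
    """Per-document equivalent of -(preds * bows).sum(1), using plain lists."""
    return [
        -sum(lp * count for lp, count in zip(doc_log_probs, doc_counts))
        for doc_log_probs, doc_counts in zip(log_probs, bows)
    ]

# Hypothetical toy data: one document, vocabulary of three words.
word_probs = [[0.5, 0.25, 0.25]]  # assumed model distribution over words
preds = [[math.log(p) for p in row] for row in word_probs]  # assumed log-probabilities
bows = [[2, 1, 0]]  # word 0 occurs twice, word 1 once, word 2 never

loss = recon_loss(preds, bows)

# The same quantity computed token by token: sum of -log p(w) over the tokens.
tokens = [0, 0, 1]
token_by_token = -sum(math.log(word_probs[0][w]) for w in tokens)
```

Weighting each log-probability by its count gives the same value as summing `-log p(w)` over every token in the document, which is why the expression needs no explicit loop over tokens.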
Bumps [numpy](https://github.com/numpy/numpy) from 1.19.5 to 1.22.0. Release notes Sourced from numpy's releases. v1.22.0 NumPy 1.22.0 Release Notes NumPy 1.22.0 is a big release featuring the work of 153 contributors spread...
Hi, Thanks for your interesting paper and this repository! I tried training ETM on both 20ng and my own dataset with num_topics = 50. Among the 50 topics I found...
Hello, Is it possible to know what this data looks like: "raw/new_york_times_text/nyt_docs.txt"? I am trying to fit my own dataset but don't know which type I should transform ...
Hi, I tried to run this ETM on my own dataset. The embedding part is quite hard to understand. I managed to use skipgram.py to train a word2vec model and...
Hi, thanks for your wonderful work. I am confused about the data loader function. Details below:
```python
parser.add_argument('--data_path', type=str, default='data/20ng', help='directory containing data')
```
1. I can't...
Hi all. I have an issue understanding the `read_embedding_matrix` [used in](https://github.com/adjidieng/ETM/blob/81b839f16d3c168ac2c5c22daff18d924962c075/main.py#L100) main.py. The `model_path` [here](https://github.com/adjidieng/ETM/blob/81b839f16d3c168ac2c5c22daff18d924962c075/data.py#L77) is specific to the local path organization from the authors but it is not clear...
Bumps [pillow](https://github.com/python-pillow/Pillow) from 8.1.0 to 9.0.1. Release notes Sourced from pillow's releases. 9.0.1 https://pillow.readthedocs.io/en/stable/releasenotes/9.0.1.html Changes In show_file, use os.remove to remove temporary images. CVE-2022-24303 #6010 [@radarhere, @hugovk] Restrict builtins within...
Why is the topic coherence based on normalised PMI divided by 45? The paper describes it, but the computation in the code looks different to me.
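One possible reading of the constant, offered as a guess rather than a confirmed answer: if coherence is averaged over every unordered pair among the top 10 words of a topic, there are exactly 45 such pairs. The value `top_n = 10` is an assumption; the question does not state it.

```python
import math
from itertools import combinations

top_n = 10  # assumed number of top words per topic (not stated in the question)

# Number of unordered word pairs (i, j) with i < j among the top words.
num_pairs = math.comb(top_n, 2)

# The same count obtained by enumerating the pairs explicitly.
pairs = list(combinations(range(top_n), 2))
```

If the code uses a different number of top words, the divisor would change accordingly, which is one place the paper and the code could disagree.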
Hi, I tried to run this ETM on my own dataset. I managed to use data_nyt.py to generate a number of .mat files but failed to use them in the...