
`python` and `scipy` version compatibility and `KLDivergence()` need attention!

Open prikarsartam opened this issue 1 year ago • 1 comments

  • OCTIS version: 1.13.1
  • Python version: 3.12.0
  • Operating System: Linux-64

I am not an expert in package management, so I do not fully understand all the details. `octis` installs properly in Google Colab, but installing in Kaggle requires `pip install octis --use-pep517`.

When installing locally on my system, I ran into the following issues, both with `pip install octis` and with `pip install -e .` from the downloaded repository. This is my main concern.

Description

  1. Installing with the latest Python 3.12 on my Linux system never completes successfully, as far as I can tell because the `zipimport` methods that were deprecated in Python 3.10 are no longer available in 3.12.
  2. Since this repo requires `gensim==4.2.0`, it has a line (screenshot omitted) inside `gensim/matutils.py` that uses `triu`, but to the best of my knowledge `triu` has been deprecated and is no longer available from `scipy==1.13.0` onwards.
  3. Also, `KLDivergence` in `octis.evaluation_metrics.diversity_metrics` raises `RuntimeWarning: invalid value encountered in log` at the line `divergence = np.sum(P*np.log(P/Q))`.
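For item 2, a guarded import along the following lines is one way code can cope with `triu` being absent from `scipy.linalg`. This is only a sketch: treating `numpy.triu` as a stand-in is an assumption on my part, and nothing here is taken from the gensim or OCTIS sources.

```python
import numpy as np

try:
    # Works on older SciPy releases that still expose triu in scipy.linalg.
    from scipy.linalg import triu
except ImportError:
    # Assumption: numpy.triu is an acceptable replacement for this use.
    from numpy import triu

# Small check: keep only the upper triangle of a 3x3 matrix.
matrix = np.arange(9).reshape(3, 3)
upper = triu(matrix)
print(upper)
```

Because the failing import lives inside gensim itself, this pattern would have to be applied there (or avoided by pinning scipy), not in user code.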

What I Did

I created a conda virtual environment with Python 3.10 and downgraded to `scipy==1.12`, so problems 1 and 2 are solved.
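A quick way to check whether an environment matches the combination that worked here is sketched below. The thresholds come only from this report (Python 3.10 and scipy 1.12 worked, Python 3.12 and scipy 1.13 did not); they are not official OCTIS requirements, and the helper names are hypothetical.

```python
import sys
from importlib import metadata

# Thresholds taken from this report, not from official OCTIS requirements.
MAX_PYTHON = (3, 12)  # Python 3.12 failed to install; 3.10 worked
MAX_SCIPY = (1, 13)   # scipy 1.12 worked; 1.13 did not


def major_minor(version):
    """Return (major, minor) as integers from a version string like '1.12.0'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def python_ok(version_info=sys.version_info):
    """True if the interpreter is older than the first failing Python version."""
    return tuple(version_info[:2]) < MAX_PYTHON


def scipy_ok(version):
    """True if the given scipy version is older than the first failing one."""
    return major_minor(version) < MAX_SCIPY


def installed_scipy_version():
    """Return the installed scipy version string, or None if scipy is absent."""
    try:
        return metadata.version("scipy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", python_ok())
    found = installed_scipy_version()
    print("scipy:", found, "OK:", scipy_ok(found) if found else "not installed")
```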

For problem 3: the `model_output['topic-word-matrix']` for ProdLDA is not normalized to [0, 1], so its rows cannot be interpreted as probabilities. The matrix contains negative entries, which lead to `nan` in `np.log()`.
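The effect can be reproduced with a small sketch. The matrix below is made-up data standing in for unnormalized topic-word weights, `kl_divergence` mirrors the expression from the warning, and the row-wise softmax is just one possible normalization that I am assuming for illustration, not the fix adopted by OCTIS.

```python
import numpy as np


def kl_divergence(p, q):
    """KL(P || Q) using the same expression as in the warning message."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))


def softmax_rows(matrix):
    """Turn each row of real-valued weights into a probability distribution."""
    matrix = np.asarray(matrix, dtype=float)
    shifted = matrix - matrix.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Hypothetical unnormalized topic-word weights (2 topics, 3 words).
raw = np.array([[0.5, -1.2, 2.0],
                [1.5, 0.3, -0.7]])

# Negative ratios reach np.log, so the result is nan
# (the warning is silenced here only to keep the output clean).
with np.errstate(invalid="ignore", divide="ignore"):
    broken = kl_divergence(raw[0], raw[1])

# After normalization every entry is positive and each row sums to 1.
probs = softmax_rows(raw)
fixed = kl_divergence(probs[0], probs[1])

print("raw weights:", broken)
print("normalized: ", fixed)
```

Whether softmax is the right normalization depends on how the model defines its topic-word weights, so treat the choice as a placeholder.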

prikarsartam, Jun 08 '24 09:06

You can try the new topic modeling toolkit TopMost.

nicepool6, Aug 27 '24 00:08