topicModels
multiprocessing does not work properly
I have observed that multiprocessing does not work properly if the corpus is huge. Right now I'm dealing with around 10k documents, and setting n_jobs = 1 works fine. However, n_jobs = 4 makes Python force close. I'm working on Mac OS X Mavericks.
Hey, can you tell me your feature size and number of topics? I tested it on a MacBook Pro running OS X Mavericks with 4 GB of RAM, and it works fine for me.
Here is my test setting:
- dataset: 20 newsgroups (same as the one I use in lda_example.py)
- Parameters:
  - n_topics = 100
  - n_docs = 11k
  - n_feature = 35K
  - n_jobs = 4
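
To compare settings, it may help to estimate how much memory the topic-word matrix needs, since that depends on exactly the two numbers asked about (feature size and number of topics). This is only a rough sketch: it assumes a dense float64 matrix of shape n_topics x n_feature and, as a worst case, one full copy per worker process. Neither assumption is confirmed for this library, and `dense_matrix_mib` is a hypothetical helper, not part of topicModels.

```python
def dense_matrix_mib(n_rows, n_cols, bytes_per_value=8):
    """Return the size in MiB of a dense n_rows x n_cols matrix.

    bytes_per_value defaults to 8, the size of one float64 value.
    """
    return n_rows * n_cols * bytes_per_value / (1024 ** 2)


# Values taken from the test setting above.
n_topics = 100
n_features = 35000
n_jobs = 4

topic_word_mib = dense_matrix_mib(n_topics, n_features)

# Assumption (not confirmed for this library): every worker process
# holds its own full copy of the topic-word matrix.
worst_case_mib = topic_word_mib * n_jobs

print("topic-word matrix: {:.1f} MiB".format(topic_word_mib))
print("with {} copies: {:.1f} MiB".format(n_jobs, worst_case_mib))
```

With these numbers the matrix is about 26.7 MiB, or roughly 107 MiB for four copies, which fits easily in 4 GB. Plugging in your own n_topics and n_feature would show whether your setup is much larger than this one.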