estimating Bayesian MSM very slow
Hi,
I'm trying to follow tutorial 6 (http://emma-project.org/latest/tutorials/notebooks/06-expectations-and-observables.html#Dynamic/kinetic-experimental-observables) to calculate the Trp fluorescence autocorrelation.
When I run this code:
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np
import mdtraj as md
import pyemma
import pyemma.coordinates as coor
from pyemma.util.contexts import settings
from mdtraj import shrake_rupley, compute_rg

# Define the featurizer and load the trajectory data
# (pdb and xtc are presumably the topology and trajectory file paths, defined elsewhere)
torsions_feat = pyemma.coordinates.featurizer(pdb)
torsions_feat.add_backbone_torsions(cossin=True, periodic=False)
torsions_data = pyemma.coordinates.load(xtc, features=torsions_feat)

# Discretize the torsion features with k-means
cluster = pyemma.coordinates.cluster_kmeans(torsions_data, k=50, max_iter=50)
dtrajs_concatenated = cluster.dtrajs[0]
print(dtrajs_concatenated)

# Implied timescales with Bayesian error estimates (this is the call that stalls)
its = pyemma.msm.its(
    cluster.dtrajs, lags=[1, 2, 3, 5, 7, 10], nits=3, errors='bayes')
My Jupyter notebook sits for a long time at "estimating BayesianMSM 0%" with no progress.
Is there anything I missed in my code?
Output of pip list:
Package Version
--------------------------------- -----------
anyio 3.6.2
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
asttokens 2.1.0
astunparse 1.6.3
attrs 22.1.0
backcall 0.2.0
backports.functools-lru-cache 1.6.4
beautifulsoup4 4.11.1
bleach 5.0.1
brotlipy 0.7.0
cached-property 1.5.2
certifi 2022.9.24
cffi 1.15.1
charset-normalizer 2.1.1
colorama 0.4.6
contourpy 1.0.6
cryptography 38.0.3
cycler 0.11.0
debugpy 1.6.3
decorator 5.1.1
deeptime 0.4.3
defusedxml 0.7.1
dill 0.3.6
entrypoints 0.4
executing 1.2.0
fastjsonschema 2.16.2
flit_core 3.8.0
fonttools 4.38.0
h5py 3.7.0
humanfriendly 10.0
idna 3.4
importlib-metadata 5.0.0
importlib-resources 5.10.0
ipykernel 6.14.0
ipython 8.4.0
ipython-genutils 0.2.0
ipywidgets 8.0.2
jedi 0.18.1
Jinja2 3.1.2
joblib 1.2.0
jsonschema 4.17.0
jupyter_client 7.4.7
jupyter-contrib-core 0.4.0
jupyter-contrib-nbextensions 0.5.1
jupyter_core 5.0.0
jupyter-highlight-selected-word 0.2.0
jupyter-latex-envs 1.4.6
jupyter-nbextensions-configurator 0.4.1
jupyter-server 1.23.2
jupyterlab-pygments 0.2.2
jupyterlab-widgets 3.0.3
kiwisolver 1.4.4
lxml 4.9.1
MarkupSafe 2.1.1
matplotlib 3.6.2
matplotlib-inline 0.1.6
mdshare 0.4.2
mdtraj 1.9.7
mistune 2.0.4
multiprocess 0.70.14
munkres 1.1.4
nbclassic 0.4.8
nbclient 0.7.0
nbconvert 7.2.5
nbexamples 0.3.1
nbformat 5.7.0
nest-asyncio 1.5.6
nglview 3.0.3
notebook 6.5.2
notebook_shim 0.2.2
numexpr 2.8.3
numpy 1.23.4
packaging 21.3
pandas 1.5.1
pandocfilters 1.5.0
parso 0.8.3
pathos 0.3.0
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.2.0
pip 22.3.1
pkgutil_resolve_name 1.3.10
platformdirs 2.5.2
pox 0.3.2
ppft 1.7.6.6
progress-reporter 2.0
prometheus-client 0.15.0
prompt-toolkit 3.0.32
psutil 5.9.4
ptyprocess 0.7.0
pure-eval 0.2.2
pycparser 2.21
pyEMMA 2.5.12
Pygments 2.13.0
pyOpenSSL 22.1.0
pyparsing 3.0.9
pyrsistent 0.19.2
PySocks 1.7.1
python-dateutil 2.8.2
pytz 2022.6
PyYAML 6.0
pyzmq 24.0.1
requests 2.28.1
scikit-learn 1.1.3
scipy 1.9.3
Send2Trash 1.8.0
setuptools 65.5.1
six 1.16.0
sniffio 1.3.0
soupsieve 2.3.2.post1
stack-data 0.6.1
tables 3.7.0
terminado 0.15.0
threadpoolctl 3.1.0
tinycss2 1.2.1
tornado 6.2
tqdm 4.64.1
traitlets 5.5.0
typing_extensions 4.4.0
unicodedata2 15.0.0
urllib3 1.26.11
wcwidth 0.2.5
webencodings 0.5.1
websocket-client 1.4.2
wheel 0.38.4
widgetsnbextension 4.0.3
zipp 3.10.0
Thanks in advance
Hi H-EKE,
I was struggling with the same issue. According to #1590, #1576, #1582 and especially #1553, this is a frequently encountered error which will probably be resolved out-of-the-box in future releases. For now, the solution from #1553 should resolve the problem. This means that you will have to add the n_jobs=1 argument to:
its = pyemma.msm.its(cluster.dtrajs, lags=50, nits=10, errors='bayes')
pyemma.plots.plot_implied_timescales(its, units='ns', dt=0.1);
The above lines of code belong to the notebook 00-pentapeptide-showcase.ipynb. With the argument added, they become:
its = pyemma.msm.its(cluster.dtrajs, lags=50, nits=10, errors='bayes', n_jobs=1)
pyemma.plots.plot_implied_timescales(its, units='ns', dt=0.1);
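Applied to the call from the original question, the same workaround might look like the sketch below. The helper name estimate_its_serial is hypothetical, pyemma is imported inside the function only so that the definition runs without it, and the default lag times and nits are simply the ones from the question.

```python
def estimate_its_serial(dtrajs, lags=(1, 2, 3, 5, 7, 10), nits=3):
    """Hypothetical helper: Bayesian implied timescales in a single process."""
    # Imported here so that defining the helper does not itself require pyemma.
    import pyemma

    # n_jobs=1 is the workaround described above; the remaining arguments
    # mirror the call in the original question.
    return pyemma.msm.its(
        dtrajs, lags=list(lags), nits=nits, errors='bayes', n_jobs=1)


# Usage in the notebook, assuming `cluster` is the k-means result from the question:
# its = estimate_its_serial(cluster.dtrajs)
```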
I hope this also resolves your issue. All credit is due to the contributors of the previous topics.
UPDATE
An even more elegant solution would be to follow yuxuanzhuang's solution from #1590. According to that post, you should edit estimator.py at line 346. estimator.py is located in the _base directory of your pyemma installation. There, the line:
pool = get_context("spawn").Pool(processes=n_jobs)
should be replaced with:
pool = get_context().Pool(processes=n_jobs)
I have tested this change with multiple notebooks and so far it seems to work excellently. For pyemma.msm.its() calls, the n_jobs=1 workaround frequently has to be used as well. On its own, the n_jobs=1 workaround still gave me freezing issues in notebook 03 (MSM estimation and validation).
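The thread does not say why this edit helps. One way to see what it changes on your machine is to check which start method each form of the call selects. The sketch below uses the standard library's multiprocessing module as a stand-in; it is an assumption that the get_context used in estimator.py behaves the same way.

```python
import multiprocessing as mp

# Start method behind get_context() with no argument: the platform default.
default_method = mp.get_context().get_start_method()

# Start method behind get_context("spawn"): explicitly requested.
spawn_method = mp.get_context("spawn").get_start_method()

print("available start methods:", mp.get_all_start_methods())
print("get_context() uses:", default_method)
print('get_context("spawn") uses:', spawn_method)
print("edit changes the start method:", default_method != spawn_method)
```

If the last line prints False, both forms select the same start method on your platform, so the edit would likely make no difference there.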
Kind regards, and all credit is due to the contributors of the previous topics.
Dear @D16ERG
Thank you so much for your help and tips!
Glad to help! I know how frustrating unresolved issues can be if such steps are part of a larger project! Please do note however that there still seems to be an unresolved issue in notebook 07. These issues are described in #1604 and #1610.
Update: (Could you also please close this issue?)