
Error: [E050] when trying to load sample annotate entity project

Open yzfr6 opened this issue 2 years ago • 2 comments

I have a fresh install on an Ubuntu 22.04 server. After opening an example project (or one I have created), clicking an entity on the left displays the following error:

Error: [E050] Can't find model 'en_core_sci_lg'. It doesn't seem to be a Python package or a valid path to a data directory.

Full Error:

Traceback (most recent call last):
  File "/home/api/api/views.py", line 259, in prepare_documents
    cat = get_medcat(CDB_MAP=CDB_MAP, VOCAB_MAP=VOCAB_MAP,
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/api/api/utils.py", line 303, in get_medcat
    cat = CAT(cdb=cdb, config=cdb.config, vocab=vocab)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/medcat/cat.py", line 104, in __init__
    self._create_pipeline(self.config)
  File "/usr/local/lib/python3.11/site-packages/medcat/cat.py", line 111, in _create_pipeline
    self.pipe = Pipe(tokenizer=spacy_split_all, config=config)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/medcat/pipe.py", line 41, in __init__
    self._nlp = spacy.load(config.general.spacy_model, disable=config.general.spacy_disabled_components)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/spacy/__init__.py", line 51, in load
    return util.load_model(
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/spacy/util.py", line 472, in load_model
    raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'en_core_sci_lg'. It doesn't seem to be a Python package or a valid path to a data directory.

Anyone know what the problem might be?

yzfr6 avatar Dec 06 '23 18:12 yzfr6

Hi @yzfr6 - by default, later versions of Trainer don't have en_core_sci_lg installed. You can either adapt the model to use en_core_web_md, which is installed, or drop into the running container and install the missing dependency, after which the model will load. Which model are you trying to load?
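
To see which of the two models is actually available inside the running container, a check along these lines may help. It is a minimal sketch that mirrors the wording of the E050 message (the name must be an importable Python package or an existing data directory); it is not spaCy's own lookup code, and the helper name `spacy_model_available` is made up for illustration.

```python
import importlib.util
from pathlib import Path


def spacy_model_available(name: str) -> bool:
    """Return True if `name` is an existing directory or an importable package.

    Hypothetical helper: it approximates the two cases named in the E050
    message and does not call spaCy itself.
    """
    # Case 1: the name is a path to a data directory.
    if Path(name).is_dir():
        return True
    # Case 2: the name is an installed Python package.
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Path-like or malformed names cannot be resolved as packages.
        return False


if __name__ == "__main__":
    for model in ("en_core_sci_lg", "en_core_web_md"):
        print(model, spacy_model_available(model))
```

Run it with the same Python interpreter the API uses (the traceback points at `/usr/local/lib/python3.11`), otherwise the result may not reflect what the webapp sees.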

tomolopolis avatar Dec 08 '23 16:12 tomolopolis

Hi @tomolopolis - thanks for the info. At present I am just trying to load one of the example projects to see how the tool works, so I wanted to use whatever was already installed in the package.

Would you be able to provide instructions on how to adapt the model to use en_core_web_md?

I added the following to the install section of webapp/Dockerfile and re-ran docker compose up, but I was still met with the same error, so I'm not sure how to do it

yzfr6 avatar Dec 10 '23 12:12 yzfr6
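
One possible way to adapt a model, sketched under assumptions: the traceback shows that the model name is read from `config.general.spacy_model` and that the config comes from `cdb.config`, so changing that field before the `CAT` object is built should change which spaCy model is loaded. The helper `switch_spacy_model` below is hypothetical, and the `SimpleNamespace` object is only a stand-in for a real MedCAT config so the example runs without MedCAT installed.

```python
from types import SimpleNamespace


def switch_spacy_model(config, model_name: str = "en_core_web_md") -> str:
    """Point config.general.spacy_model at `model_name`.

    Returns the previous value so the change can be logged or reverted.
    `config` is anything exposing `.general.spacy_model`, as `cdb.config`
    does in the traceback above.
    """
    previous = config.general.spacy_model
    config.general.spacy_model = model_name
    return previous


if __name__ == "__main__":
    # Stand-in for cdb.config; a real CDB would be loaded with MedCAT.
    demo = SimpleNamespace(general=SimpleNamespace(spacy_model="en_core_sci_lg"))
    old = switch_spacy_model(demo)
    print(old, "->", demo.general.spacy_model)
```

With a real concept database, the same assignment would be applied to `cdb.config` and the CDB saved again before uploading it to the trainer. The exact load and save calls depend on the MedCAT version in use, so they are left out here.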