Broken import for Icelandic language data
How to reproduce the behaviour
Importing the language data for Icelandic is broken: the package name "is" is a reserved Python keyword, so it cannot appear in an import statement. For example, to get the stop words:
# This works
import spacy.lang.en.stop_words
spacy.lang.en.stop_words.STOP_WORDS
# Syntax error in import statement
import spacy.lang.is.stop_words
spacy.lang.is.stop_words.STOP_WORDS
Error:
>>> import spacy.lang.is.stop_words
  File "<stdin>", line 1
    import spacy.lang.is.stop_words
                       ^
SyntaxError: invalid syntax
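The error comes from Python's parser rather than from spaCy: "is" is a reserved keyword, so it is rejected as a name in a dotted import path. As a minimal sketch using only the standard library keyword module (the list of language codes here is just an illustrative sample, not spaCy's full list), one can check which two-letter codes would collide:

```python
import keyword

# Illustrative sample of two-letter language codes (not an exhaustive list).
sample_codes = ["en", "is", "de"]

for code in sample_codes:
    # keyword.iskeyword() reports whether the string is a reserved word,
    # which makes it unusable as a name in an import statement.
    print(code, keyword.iskeyword(code))

# All reserved keywords that are exactly two characters long.
two_letter_keywords = [kw for kw in keyword.kwlist if len(kw) == 2]
print(two_letter_keywords)
```

Any language package whose directory name shows up in that last list would presumably hit the same SyntaxError with a plain import statement.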
I have not yet tested this on spaCy v3.0.
Your Environment
- spaCy version: 2.3.5
- Platform: Linux-5.8.0-53-generic-x86_64-with-glibc2.29
- Python version: 3.8.5
Could this be resolved by naming the language data directory after the three-letter ISO 639-3 language code instead?
spacy/lang/is -> spacy/lang/isl
Thanks for the report! We'll have to find a workaround, indeed.
I'm a little surprised nobody's run into this before!
Another workaround for this case is to use importlib:
import importlib
# "is" is a reserved keyword, so pass the module path as a string instead
lang_is = importlib.import_module("spacy.lang.is")
lang_is.stop_words.STOP_WORDS
In my project, I need to fetch the stop words of every language spaCy provides, so I already build the module names with an f-string and load them through importlib, which is why I never ran into this issue. Using a three-letter code only for Icelandic (which has a two-letter ISO 639-1 code) would be inconsistent; the alternative would be for spaCy to use three-letter codes for all languages.
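The f-string approach described above might look roughly like the following sketch. The helper name load_stop_words and the sample list of codes are hypothetical, and it assumes each language package exposes a stop_words module with a STOP_WORDS attribute, as the English and Icelandic ones in this thread do. Languages that cannot be imported (or a missing spaCy install) simply yield None.

```python
import importlib


def load_stop_words(lang_code, package="spacy.lang"):
    """Return STOP_WORDS for lang_code, or None if it cannot be loaded.

    Hypothetical helper: the module path is built as a string, so codes
    that are Python keywords (such as "is") do not cause a SyntaxError.
    """
    module_name = f"{package}.{lang_code}.stop_words"
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers a missing spaCy install as well as an unknown language code.
        return None
    return getattr(module, "STOP_WORDS", None)


# Illustrative sample of codes; a real project would supply its own list.
for code in ["en", "is", "de"]:
    words = load_stop_words(code)
    print(code, len(words) if words is not None else "not available")
```

Because the language code only ever appears inside a string, this works the same way for every language and needs no special case for Icelandic.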