"seek" error in calculating distances?
Hi, I'm getting the following error when I try to calculate distances. Not sure if this is a library compatibility problem?
$ python
Python 3.6.10 |Anaconda, Inc.| (default, Jan 7 2020, 15:01:53)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import lang2vec.lang2vec as l2v
>>> l2v.distance('syntactic', 'deu', 'eng')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/lang2vec-1.1.6-py3.6.egg/lang2vec/lang2vec.py", line 401, in distance
  File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/scipy/sparse/_matrix_io.py", line 131, in load_npz
    with np.load(file, **PICKLE_KWARGS) as loaded:
  File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/numpy/lib/npyio.py", line 439, in load
    fid.seek(-min(N, len(magic)), 1) # back-up
io.UnsupportedOperation: seek
If I recall correctly, that's the error you get when trying to read a sparse matrix from a file that doesn't exist.
Have you downloaded the distances file?
wget http://www.cs.cmu.edu/~aanastas/files/distances.zip
and move it to lang2vec/data.
(Also, using the distances requires installing lang2vec from source rather than from pip.)
Ahh, thanks! Seems like it worked. I'll keep the issue open because it might be nice to have a less opaque error message and/or automatic download of the file.
Sounds good -- thanks!
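For reference, the less opaque error message suggested above could look roughly like the sketch below. `require_data_file` is a hypothetical helper, not something that exists in lang2vec, and the message text is only an illustration. The URL is the one quoted earlier in this thread.

```python
import os

# Download location quoted earlier in this thread.
DISTANCES_URL = "http://www.cs.cmu.edu/~aanastas/files/distances.zip"


def require_data_file(path):
    """Return `path` if the file exists; otherwise raise a descriptive error.

    Hypothetical helper for illustration only, not part of lang2vec.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(
            "Distances file not found at {!r}. Download {} and move it to "
            "lang2vec/data.".format(path, DISTANCES_URL)
        )
    return path
```

Calling something like this before the archive is opened would turn a missing file into a message that says where to get it, instead of a failure deep inside the loading code.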
Hello, I am encountering this same error, but this fix has not worked for me. I seem to have the required csv files. Has the distances code been updated since 2020?
The contents of my lang2vec/data directory are below.
distances2.zip FEATURAL.csv features.npz GEOGRAPHIC.csv learned.npy phonological_upper_sparse.npz
distances_languages.txt featural_upper_round1_sparse.npz GENETIC.csv geographic_upper_round1_sparse.npz letter_codes.json SYNTACTIC.csv
distances.zip feature_averages.npz genetic_upper_sparse.npz INVENTORY.csv __MACOSX syntactic_upper_round2_sparse.npz
family_features.npz feature_predictions.npz geocoord_features.npz inventory_upper_sparse.npz PHONOLOGICAL.csv
To resolve this I had to replace
data = sparse.load_npz(zp.open(map_distance_to_filename(dist)))
with
data_dir = '/'.join(DISTANCES_FILE.split('/')[:-1]) + '/'
data = sparse.load_npz(data_dir + map_distance_to_filename(dist))
on line 401 of lang2vec/lang2vec.py after unzipping lang2vec/data/distances2.zip. It seems to be working now.
It looks like this was a compatibility issue. When I upgraded Python to 3.9, neither of these fixes was necessary.
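That would be consistent with the traceback: `np.load` calls `fid.seek`, and the file objects returned by `ZipFile.open` only gained `seek` support in Python 3.7. For anyone stuck on 3.6, a possible alternative to unzipping the archive is to read the member into an in-memory buffer first. This is a minimal sketch; `open_zip_member_seekable` is a hypothetical helper, not part of lang2vec, and I have not tested it against the lang2vec code itself.

```python
import io
import zipfile


def open_zip_member_seekable(zip_path, member):
    """Read one zip member fully into memory and return a seekable buffer.

    Hypothetical helper for illustration only, not part of lang2vec.
    `zip_path` may be a filename or an open binary file object.
    """
    with zipfile.ZipFile(zip_path) as zp:
        # zp.read() returns the decompressed bytes; BytesIO supports seek().
        return io.BytesIO(zp.read(member))
```

Since `sparse.load_npz` accepts a file-like object, the call on line 401 could presumably become `sparse.load_npz(open_zip_member_seekable(DISTANCES_FILE, map_distance_to_filename(dist)))`, at the cost of holding the whole member in memory.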