vecino docker run seems to fail before finishing

Open billmetangmo opened this issue 8 years ago • 2 comments

Expected behavior

I expect the output of the vecino Docker similarity search for Levis0045/MetaLex to be:

docker run -it --rm srcd/vecino https://github.com/Levis0045/MetaLex
                                    github-repo1	x.XX
                                    github-repo2	x.XX
                                    github-repo3	x.XX

Actual behavior

It seems to work fine at the beginning:

INFO:bblfsh:Detected bblfsh server: 172.17.0.1:9432
INFO:enry:Fetching https://api.github.com/repos/src-d/enry/releases/latest
INFO:enry:Latest release resolved to enry_v1.6.3_linux_amd64.tar.gz
INFO:enry:Fetching https://github.com/src-d/enry/releases/download/v1.6.3/enry_v1.6.3_linux_amd64.tar.gz
INFO:enry:Extracting the binary
INFO:enry:Downloaded /enry
INFO:gcs-backend:Fetching https://storage.googleapis.com/models.cdn.sourced.tech/index.json?ignoreCache=1...
INFO:gcs-backend:Fetching https://storage.googleapis.com/models.cdn.sourced.tech/models%2Fid2vec%2F92609e70-f79c-46b5-8419-55726e873cfc.asdf...
[################################] 17044/17044 - 00:12:06
INFO:id2vec:Reading /root/.source{d}/id2vec/default.asdf...
INFO:id2vec:Building the token index...
INFO:similar_repos:Loaded id2vec model: {'created_at': datetime.datetime(2017, 6, 18, 17, 37, 6, 255615),
 'dependencies': [],
 'model': 'id2vec',
 'uuid': '92609e70-f79c-46b5-8419-55726e873cfc',
 'version': [1, 0, 0]}
Shape: (999424, 300)
First 10 words: ['get', 'name', 'type', 'string', 'class', 'set', 'data', 'value', 'self', 'test']
INFO:gcs-backend:Fetching https://storage.googleapis.com/models.cdn.sourced.tech/index.json?ignoreCache=1...
INFO:gcs-backend:Fetching https://storage.googleapis.com/models.cdn.sourced.tech/models%2Fdocfreq%2Ff64bacd4-67fb-4c64-8382-399a8e7db52a.asdf...
[################################] 372/372 - 00:00:17
INFO:docfreq:Reading /root/.source{d}/docfreq/default.asdf...
INFO:docfreq:Building the docfreq dictionary...
INFO:docfreq:Pruning to min 20 occurrences
INFO:similar_repos:Loaded document frequencies: {'created_at': datetime.datetime(2017, 6, 19, 9, 59, 14, 766638),
 'dependencies': [],
 'model': 'docfreq',
 'uuid': 'f64bacd4-67fb-4c64-8382-399a8e7db52a',
 'version': [1, 0, 0]}
Number of words: 416370
First 10 words: ['aaa', 'aaaa', 'aaaaa', 'aaaaaa', 'aaaaaaa', 'aaaaaaaa', 'aaaaaaaaa', 'aaaaaaaaaa', 'aaaaaaaaaaa', 'aaaaaaaaaaaa']
Number of documents: 112273
INFO:gcs-backend:Fetching https://storage.googleapis.com/models.cdn.sourced.tech/index.json?ignoreCache=1...
INFO:gcs-backend:Fetching https://storage.googleapis.com/models.cdn.sourced.tech/models%2Fnbow%2F1e3da42a-28b6-4b33-94a2-a5671f4102f4.asdf...
[################################] 5672/5672 - 00:05:20
INFO:nbow:Reading /root/.source{d}/nbow/default.asdf...
INFO:nbow:Building the repository names mapping...
INFO:similar_repos:Loaded nBOW model: {'created_at': datetime.datetime(2017, 6, 19, 9, 16, 8, 942880),
 'dependencies': [{'created_at': datetime.datetime(2017, 6, 18, 17, 37, 6, 255615),
                   'dependencies': [],
                   'model': 'id2vec',
                   'uuid': '92609e70-f79c-46b5-8419-55726e873cfc',
                   'version': [1, 0, 0]},
                  {'created_at': datetime.datetime(2017, 6, 19, 9, 59, 14, 766638),
                   'dependencies': [],
                   'model': 'docfreq',
                   'uuid': 'f64bacd4-67fb-4c64-8382-399a8e7db52a',
                   'version': [1, 0, 0]}],
 'model': 'nbow',
 'uuid': '1e3da42a-28b6-4b33-94a2-a5671f4102f4',
 'version': [1, 0, 0]}
Shape: (112273, 999424)
First 10 repos: ['ikizir/HohhaDynamicXOR', 'ditesh/node-poplib', 'Code52/MarkPadRT', 'wp-shortcake/shortcake', 'capaj/Moonridge', 'HugoGiraudel/hugogiraudel.github.com', 'crosswalk-project/crosswalk-website', 'apache/parquet-mr', 'dciccale/kimbo.js', 'processone/oneteam']
INFO:bblfsh:Detected bblfsh server: 172.17.0.1:9432
INFO:similar_repos:Creating the WMD engine...
INFO:repo_cloner:Cloning from https://github.com/Levis0045/MetaLex...
INFO:repo_cloner:Finished cloning https://github.com/Levis0045/MetaLex
INFO:repo_cloner:Classifying the files...
INFO:repo_cloner:Result: {'HTML': 1, 'CSS': 1, 'Shell': 1, 'Python': 20, 'Text': 5}
INFO:repo2nbow:Fetching and processing UASTs...

Then it starts to fail:

ERROR:repo2nbow:bblfsh: RpcError on /tmp/repo2-vb2b74e1/Levis0045&MetaLex_github.com/metalex/api.py: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, Connect Failed)>
WARNING:repo2nbow:/tmp/repo2-vb2b74e1/Levis0045&MetaLex_github.com/metalex/api.py was skipped
ERROR:repo2nbow:bblfsh: RpcError on /tmp/repo2-vb2b74e1/Levis0045&MetaLex_github.com/metalex/logs/__init__.py: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, Connect Failed)>
WARNING:repo2nbow:/tmp/repo2-vb2b74e1/Levis0045&MetaLex_github.com/metalex/logs/__init__.py was skipped
INFO:repo2nbow:https://github.com/Levis0045/MetaLex pending tasks: 19
.........
INFO:repo2nbow:https://github.com/Levis0045/MetaLex pending tasks: 0
Traceback (most recent call last):
  File "/usr/local/bin/vecino", line 11, in <module>
    load_entry_point('vecino==0.1.6a0', 'console_scripts', 'vecino')()
  File "/usr/local/lib/python3.5/dist-packages/vecino/__main__.py", line 76, in main
    max_time=args.max_time, skipped_stop=args.skipped_stop)
  File "/usr/local/lib/python3.5/dist-packages/vecino/similar_repositories.py", line 80, in query
    neighbours = self._query_foreign(url_or_path_or_name, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/vecino/similar_repositories.py", line 108, in _query_foreign
    return self._wmd.nearest_neighbors((words, weights), **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/wmd/__init__.py", line 507, in nearest_neighbors
    "Too little vocabulary for %s: %d" % (index, len(words)))
ValueError: Too little vocabulary for None: 0

Steps to reproduce the behavior

docker build -t srcd/vecino .
docker run -d --privileged -p 9432:9432 --name bblfshd bblfsh/bblfshd
docker exec -it bblfshd bblfshctl driver install --all
docker run -it --rm srcd/vecino https://github.com/Levis0045/MetaLex

Any advice?

billmetangmo, Jan 16 '18 09:01

Did each "pending tasks" message take a few seconds to appear? It looks like vecino cannot connect to Babelfish. The exception at the end means that nothing was extracted. Does this command work?

docker run -it --rm --entrypoint bash srcd/vecino
python3 -m bblfsh -f /usr/local/lib/python3.5/dist-packages/vecino/__main__.py
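
A quicker check that separates a network problem from a parsing problem is to test whether the Babelfish port is reachable at all from inside the container. The following is a minimal sketch, not part of vecino or the bblfsh client: `can_connect` is a hypothetical helper name, and the address 172.17.0.1:9432 is simply the one reported in the log above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


if __name__ == "__main__":
    # Assumption: this is the endpoint printed by "Detected bblfsh server".
    print(can_connect("172.17.0.1", 9432))
```

If this prints False from inside the vecino container, the "Connect Failed" errors are probably a reachability issue rather than a missing driver.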

vmarkovtsev, Jan 16 '18 14:01

No, the command doesn't work; I got the same error, plus this line:

Failed to connect to the Docker daemon and ensure that the Babelfish server is running. Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

Is the connection to the Babelfish server made over the network or through a Unix socket? I ask because I thought I would get a "Connection aborted" error from the NIC.
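
For what it is worth, a FileNotFoundError alongside "Connection aborted" usually points at a missing file path rather than a network interface, which would fit a Unix socket that does not exist inside the container. A minimal sketch to check that, assuming the Docker daemon's default socket path `/var/run/docker.sock` is the one being looked up; `is_unix_socket` is a hypothetical helper name, not something from vecino or bblfsh.

```python
import os
import stat


def is_unix_socket(path: str) -> bool:
    """Return True if path exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing path or no permission to stat it.
        return False


if __name__ == "__main__":
    # Assumption: the default Docker daemon socket location.
    print(is_unix_socket("/var/run/docker.sock"))
```

If this prints False inside the container, the Docker daemon socket is not available there, which would be a separate matter from the TCP connection to 172.17.0.1:9432 shown in the log.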

billmetangmo, Jan 16 '18 16:01