
tutorial 1 reset needed on fresh install

Open msonderegger opened this issue 8 months ago • 1 comments

After a fresh install (conda-forge + pip), I downloaded the test corpus, put it in data/, and ran tutorial_1.py. The "Parsing types from file" steps completed successfully, but then I got this error:


Parsing files...
0 166
Parsing file 1 of 166 (61-70970-0007)...
0
Traceback (most recent call last):
  File "/Users/morgan/polyglotdb_new/tutorial_1.py", line 16, in <module>
    c.load(parser, corpus_root)
  File "/Users/morgan/miniconda3/envs/polyglotdb/lib/python3.12/site-packages/polyglotdb/corpus/importable.py", line 160, in load
    could_not_parse = self.load_directory(parser, path)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/morgan/miniconda3/envs/polyglotdb/lib/python3.12/site-packages/polyglotdb/corpus/importable.py", line 287, in load_directory
    self.add_discourse(data)
  File "/Users/morgan/miniconda3/envs/polyglotdb/lib/python3.12/site-packages/polyglotdb/corpus/importable.py", line 112, in add_discourse
    raise (ParseError('The discourse \'{}\' already exists in this corpus.'.format(data.name)))
polyglotdb.exceptions.ParseError: The discourse '61-70970-0007' already exists in this corpus.

To get the tutorial to work, I had to reset the corpus using tutorial_1_reset.py, after which this ran without error:

python tutorial_1.py

There might be a bug here -- can anyone replicate?

Regardless, Tutorial 1 should state that if you hit an error like this you may need to reset the corpus, and that you can do so by running tutorial_1_reset.py. Resetting is currently mentioned, but in a confusing way that doesn't make clear that you can just run a script.
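
For the tutorial text, something like the following could illustrate the workaround. This is only a rough sketch, not necessarily what tutorial_1_reset.py actually does: `load_fresh` is a hypothetical helper, and the corpus name `pg_tutorial` and the data path are assumptions to adjust for your setup. It relies on `CorpusContext.reset()`, `CorpusContext.load()` and `polyglotdb.io.inspect_mfa()`. Note that resetting deletes everything previously imported for that corpus.

```python
def load_fresh(corpus, parser, corpus_root, reset_first=True):
    """Hypothetical helper: optionally wipe the corpus, then import it.

    `corpus` is expected to be an open polyglotdb CorpusContext (or any
    object exposing reset() and load(parser, path)).
    """
    if reset_first:
        # Removes previously imported data, so leftover discourses cannot
        # trigger "already exists in this corpus".
        corpus.reset()
    return corpus.load(parser, corpus_root)


def main():
    # Imported here so the helper above can be used without a PolyglotDB install.
    from polyglotdb import CorpusContext
    import polyglotdb.io as pgio

    corpus_root = "./data/LibriSpeech-aligned-subset"  # assumption: your data path
    corpus_name = "pg_tutorial"                        # assumption: your corpus name

    parser = pgio.inspect_mfa(corpus_root)
    parser.call_back = print  # print progress while parsing

    with CorpusContext(corpus_name) as c:
        load_fresh(c, parser, corpus_root)
```

Calling `main()` from a script would then reset and reload in one step. Passing `reset_first=False` keeps the current behaviour of loading on top of whatever is already in the database.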

msonderegger, May 29 '25 15:05

PS: this is on an Apple M3 Pro running macOS 14.1 (23B2073).

msonderegger, May 29 '25 15:05