Where can I get valid JSON?

Open AlexanderChaptykov opened this issue 6 years ago • 1 comments

I tried to run this on the JSONL files:

python -m language.question_answering.bert_joint.prepare_nq_data \
  --logtostderr \
  --input_jsonl ~/data/nq-train-??.jsonl.gz \
  --output_tfrecord ~/output_dir/nq-train.tfrecords-00000-of-00001 \
  --max_seq_length=512 \
  --include_unknowns=0.02 \
  --vocab_file=bert-joint-baseline/vocab-nq.txt

but I get this error:

Traceback (most recent call last):
  File "/home/achaptykov/anaconda3/envs/py36-test/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/achaptykov/anaconda3/envs/py36-test/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/data1/achaptykov/model/googlebert/language/question_answering/bert_joint/prepare_nq_data.py", line 90, in <module>
    tf.app.run()
  File "/data1/achaptykov/model/googlebert/env/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/data1/achaptykov/model/googlebert/language/question_answering/bert_joint/prepare_nq_data.py", line 69, in main
    for example in get_examples(FLAGS.input_jsonl):
  File "/data1/achaptykov/model/googlebert/language/question_answering/bert_joint/prepare_nq_data.py", line 59, in get_examples
    for line in input_file:
  File "/home/achaptykov/anaconda3/envs/py36-test/lib/python3.6/gzip.py", line 374, in readline
    return self._buffer.readline(size)
  File "/home/achaptykov/anaconda3/envs/py36-test/lib/python3.6/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/home/achaptykov/anaconda3/envs/py36-test/lib/python3.6/gzip.py", line 463, in read
    if not self._read_gzip_header():
  File "/home/achaptykov/anaconda3/envs/py36-test/lib/python3.6/gzip.py", line 406, in _read_gzip_header
    magic = self._fp.read(2)
  File "/home/achaptykov/anaconda3/envs/py36-test/lib/python3.6/gzip.py", line 91, in read
    self.file.read(size-self._length+read)
TypeError: can't concat str to bytes

AlexanderChaptykov, Dec 15 '19 00:12

I think you can get the data here: gsutil -m cp -R gs://natural_questions/v1.0

wmmxk, Dec 30 '19 19:12
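
A note on the traceback: the last frame fails while concatenating gzip's internal bytes buffer with the result of self.file.read(...), which suggests the underlying file object returned str, i.e. it was opened in text mode rather than binary mode. If so, the .jsonl.gz files themselves may be perfectly valid. Below is a minimal sketch for checking that a downloaded file reads cleanly, assuming Python 3 and only the standard library. The helper name read_jsonl_gz is hypothetical and is not part of prepare_nq_data.py.

```python
import gzip
import json


def read_jsonl_gz(path):
    """Yield one parsed JSON object per line of a gzip-compressed JSONL file.

    Hypothetical helper for illustration; not the repository's own reader.
    """
    # Open the raw file in binary mode ("rb"): gzip expects bytes, and a
    # text-mode handle returns str, which leads to errors like
    # "TypeError: can't concat str to bytes".
    with open(path, "rb") as raw_file:
        with gzip.GzipFile(fileobj=raw_file) as gz_file:
            for line in gz_file:
                line = line.strip()
                if line:  # skip blank lines
                    yield json.loads(line.decode("utf-8"))
```

If iterating over read_jsonl_gz(path) for one of the nq-train files yields dictionaries without raising, the data is fine and the problem is more likely in how the script opens the file under Python 3.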