
I can't download datasets in China

Open tengwang0318 opened this issue 4 years ago • 5 comments

import tensorflow_datasets as tfds
examples, metadata = tfds.load(name='ted_hrlr_translate/pt_to_en',
                               with_info=True, as_supervised=True)

error report:

2021-03-28 10:47:44.478212: W tensorflow/core/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "Not found: Could not locate the credentials file.". Retrieving token from GCE failed with "Failed precondition: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Couldn't resolve host 'metadata'".
2021-03-28 10:48:45.489205: E tensorflow/core/platform/cloud/curl_http_request.cc:614] The transmission  of request 0x112492b40 (URI: https://www.googleapis.com/storage/v1/b/tfds-data/o/dataset_info%2Fted_hrlr_translate%2Fpt_to_en%2F1.0.0?fields=size%2Cgeneration%2Cupdated) has been stuck at 0 of 0 bytes for 61 seconds and will be aborted. CURL timing information: lookup time: 0.001735 (No error), connect time: 0 (No error), pre-transfer time: 0 (No error), start-transfer time: 0 (No error)

json file:

{
    "error": {
        "code": 404,
        "message": "No such object: tfds-data/dataset_info/ted_hrlr_translate/pt_to_en/1.0.0",
        "errors": [
            {
                "message": "No such object: tfds-data/dataset_info/ted_hrlr_translate/pt_to_en/1.0.0",
                "domain": "global",
                "reason": "notFound"
            }
        ]
    }
}

I tried using a VPN, but it didn't help.

tengwang0318 avatar Mar 28 '21 02:03 tengwang0318

I have the same problem!

michael-wzhu avatar Apr 02 '21 02:04 michael-wzhu

What is the full stack trace?

Conchylicultor avatar Apr 06 '21 08:04 Conchylicultor

What is the full stack trace?

There is no stack trace. The error comes from Google authentication. Is Google authentication required when loading the data? Many users are hitting the same problem.

ross-Hr avatar Oct 04 '21 10:10 ross-Hr

there is no stacktrace.

Does that mean the dataset is correctly generated and loaded? TFDS first tries a lookup on GCS to see whether the dataset can be downloaded directly. If it can't (for example, because of a permission issue), it just downloads the dataset from its original source (no GCS).
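
To make the lookup-then-fallback behaviour described above concrete, here is a minimal, generic sketch. It is not TFDS's actual implementation: `load_with_fallback`, `unreachable_mirror`, and `original_source` are hypothetical names invented for illustration, and the two fetchers are stand-ins rather than real network calls.

```python
from typing import Callable, Optional


def load_with_fallback(
    fetch_from_mirror: Callable[[], Optional[bytes]],
    fetch_from_source: Callable[[], bytes],
) -> bytes:
    """Try the mirror first; on a failure or a miss, use the original source."""
    try:
        data = fetch_from_mirror()
        if data is not None:
            return data
    except OSError as err:
        # Network and permission failures are treated as "mirror unavailable".
        print(f"mirror lookup failed ({err}); falling back to original source")
    return fetch_from_source()


def unreachable_mirror() -> Optional[bytes]:
    # Hypothetical stand-in for a mirror that cannot be reached.
    raise TimeoutError("request stuck at 0 of 0 bytes")


def original_source() -> bytes:
    # Hypothetical stand-in for downloading from the dataset's original host.
    return b"dataset bytes"


result = load_with_fallback(unreachable_mirror, original_source)
```

In a flow like this, a warning about the mirror does not by itself mean the load failed; what matters is whether the fallback fetch succeeds.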

If there is no error, then I'm not sure what the issue is. What is not working? Could you share the full logs when running:

import tensorflow_datasets as tfds

ds, info = tfds.load('replace_by_your_dataset', with_info=True)
print(info)

Conchylicultor avatar Oct 04 '21 16:10 Conchylicultor

I could, but the download process is extremely slow!

cockroachzl avatar Mar 25 '22 04:03 cockroachzl

Hey, I'm not sure whether your problem has been solved, but you can run the following to set up a temporary proxy for your current Python environment:

import os

os.environ['HTTP_PROXY'] = 'http://IP:PORT'
os.environ['HTTPS_PROXY'] = 'http://IP:PORT'
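
The same idea can be wrapped in a small helper, sketched below. The function name `configure_proxy` and the address `http://127.0.0.1:7890` are placeholders chosen for illustration (assumptions, not values from this thread); substitute your own proxy's IP and port. Whether a given download path honours these variables depends on the HTTP client in use, so treat this as something to try rather than a guaranteed fix.

```python
import os


def configure_proxy(proxy_url: str) -> dict:
    """Set the standard proxy environment variables for this process."""
    applied = {}
    # Set both upper- and lower-case spellings, since tools differ in
    # which one they read.
    for name in ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy"):
        os.environ[name] = proxy_url
        applied[name] = proxy_url
    return applied


# Placeholder address (assumption): replace with your own proxy.
configure_proxy("http://127.0.0.1:7890")

# Set the variables before loading the dataset, for example:
# import tensorflow_datasets as tfds
# ds, info = tfds.load("ted_hrlr_translate/pt_to_en",
#                      with_info=True, as_supervised=True)
```

The setting only lasts for the current Python process, which matches the "temporary" proxy described above.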

Gary-Zhang1104 avatar Nov 26 '22 12:11 Gary-Zhang1104

https://github.com/tensorflow/datasets/issues/3132#issuecomment-1328041858 Thanks a lot! Very useful to me!

JulietGZHU20230316 avatar Mar 14 '24 01:03 JulietGZHU20230316