Load Sentiment140 failed with HTTP 404

Open rayk opened this issue 2 years ago • 1 comments

/!\ PLEASE INCLUDE THE FULL STACKTRACE AND CODE SNIPPET

Short description Loading the Stanford Sentiment140 dataset (via TensorFlow Datasets) fails with an HTTP 404; the archive is no longer at the download URL.

Environment information

  • Operating System: macOS
  • Python version: 3.11
tensorboard-data-server      0.7.2
tensorflow                   2.15.0
tensorflow-datasets          4.9.4
  • Does the issue still exist with the latest tfds-nightly package (pip install --upgrade tfds-nightly)? Appears so

Reproduction instructions

import tensorflow_datasets as tds  # the snippet uses the alias `tds`
ds_builder = tds.builder('sentiment140')

Stack

File ~/Projects/rmap_repo/botty/.venv/lib/python3.11/site-packages/tensorflow_datasets/core/download/downloader.py:331, in _assert_status(response)
    329 """Ensure the URL response is 200."""
    330 if response.status_code != 200:
--> 331   raise download_utils_lib.DownloadError(
    332       'Failed to get url {}. HTTP code: {}.'.format(
    333           response.url, response.status_code
    334       )
    335   )

DownloadError: Failed to get url https://www.cs.stanford.edu/people/alecmgo/trainingandtestdata.zip. HTTP code: 404.

Expected behaviour What you expected to happen.

Additional context It looks like a bad URL; who would move this without a redirect?

rayk Feb 13 '24 23:02

Hi @rayk, thank you for contacting us regarding the Sentiment140 dataset.

I was wondering why TFDS is trying to download https://www.cs.stanford.edu/people/alecmgo/trainingandtestdata.zip while the URL in the builder code is http://cs.stanford.edu/people/alecmgo/trainingandtestdata.zip (without www)?

Have you by any chance modified your local copy of the code?

ccl-core Feb 14 '24 15:02
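
To make the mismatch between the two URLs quoted in this thread concrete, here is a minimal sketch that compares them component by component using only the standard library. The helper name `url_differences` is hypothetical and not part of TFDS. One possible explanation, offered as an assumption rather than a confirmed diagnosis: the error message in the traceback is built from `response.url`, and if the HTTP client follows redirects (as `requests` does), that value is the final URL after any redirect, not necessarily the URL in the builder code.

```python
from urllib.parse import urlsplit


def url_differences(expected: str, actual: str) -> dict:
    """Return the URL components that differ, as {name: (expected, actual)}."""
    exp, act = urlsplit(expected), urlsplit(actual)
    fields = ("scheme", "netloc", "path", "query")
    return {
        name: (getattr(exp, name), getattr(act, name))
        for name in fields
        if getattr(exp, name) != getattr(act, name)
    }


# URL quoted from the builder code in the reply above.
builder_url = "http://cs.stanford.edu/people/alecmgo/trainingandtestdata.zip"
# URL quoted from the DownloadError message in the traceback.
error_url = "https://www.cs.stanford.edu/people/alecmgo/trainingandtestdata.zip"

diff = url_differences(builder_url, error_url)
for name, (exp, act) in diff.items():
    print(f"{name}: {exp!r} -> {act!r}")
```

The comparison shows that the two URLs differ in both scheme and host while the path is identical, which would be consistent with a server-side redirect rather than a locally modified builder, though the thread does not confirm either.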