NonMatchingChecksumError for sun397
Short description When
Environment information
- Operating System:
- Python version: 3.8
- tensorflow version: 2.11.0
- tensorflow-datasets version: 4.8.3
- tfds-nightly version: 4.8.3.dev202303190046
- The issue still exists with the latest tfds-nightly package
Reproduction instructions
import tensorflow_datasets as tfds

dataset_name = "sun397/tfds:4.*.*"
name = dataset_name.split(':')[0]
dataset_builder = tfds.builder(dataset_name, data_dir=f'data/vtab/{name}/tfrecord')
dataset_builder.download_and_prepare(
    download_dir=f'data/vtab/{name}/raw',
    download_config=tfds.download.DownloadConfig(
        extract_dir=f'data/vtab/{name}/extracted',
        # max_examples_per_split=0,
    )
)
Link to logs
Traceback (most recent call last):
  File "/data1/NOAH/data/vtab-source/task_adaptation/data/sun397.py", line 40, in __init__
    dataset_builder.download_and_prepare(
  ……
  File "/data1/anaconda3/envs/NOAH/lib/python3.8/site-packages/tensorflow_datasets/core/download/download_manager.py", line 807, in _validate_checksums
    raise NonMatchingChecksumError(msg)
tensorflow_datasets.core.download.download_manager.NonMatchingChecksumError: Artifact https://vision.princeton.edu/projects/2010/SUN/SUN397.tar.gz, downloaded to data/vtab/sun397/tfds/raw/visio.princ.edu_proje_2010_SUN_SUN39YI7MRZf0CsLgBE9BMqMt4EoWzt0_oF_tQ9O6vET_wAc.tar.gz.tmp.aa94e417322b4f408a5f328d52c7e55c/SUN397.tar.gz, has wrong checksum:
* Expected: UrlInfo(size=36.39 GiB, checksum='f404130965a7ad77bed5ececc71e720ab50f3cc1e1bb257257610d38f3b928ec', filename='SUN397.tar.gz')
* Got: UrlInfo(size=294.59 MiB, checksum='de22a1d93ed230dd799d54281fad5356d36453cb0f6330b08139f035cb9e780c', filename='SUN397.tar.gz')
To debug, see: https://www.tensorflow.org/datasets/overview#fixing_nonmatchingchecksumerror
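For anyone debugging this: the log reports a 294.59 MiB file where 36.39 GiB was expected, which suggests the download was truncated or the server returned something other than the full archive. Below is a minimal sketch for checking a locally downloaded file against the expected checksum, using only the standard library. The helper names `sha256_and_size` and `matches_expected` are hypothetical and not part of tensorflow-datasets; the expected digest is copied from the error message above.

```python
import hashlib

# Expected SHA-256 copied from the "Expected: UrlInfo(...)" line of the error.
EXPECTED_SHA256 = "f404130965a7ad77bed5ececc71e720ab50f3cc1e1bb257257610d38f3b928ec"


def sha256_and_size(path, chunk_size=1 << 20):
    """Return (hex SHA-256 digest, size in bytes) of the file at `path`.

    The file is read in chunks so a multi-GiB archive does not need to
    fit in memory.
    """
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest(), size


def matches_expected(path, expected_sha256=EXPECTED_SHA256):
    """Print the file's size and digest, and return True if the digest matches."""
    actual, size = sha256_and_size(path)
    print(f"size: {size / 2**20:.2f} MiB, sha256: {actual}")
    return actual == expected_sha256
```

Calling `matches_expected("SUN397.tar.gz")` on the downloaded archive (the path here is an assumption, adjust it to wherever the file was saved) shows whether the local copy is complete before re-running `download_and_prepare`.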
@M3Dade Hi, have you resolved the "wrong checksum" issue?