Dataset with streaming doesn't work with proxy
Describe the bug
I'm trying to stream a dataset with the datasets library because the dataset is too big, but it hangs indefinitely without loading the first batch. I'm running on AIMOS, a supercomputer that connects to the internet through a proxy, so I assume the problem is related to the network configuration. I've already set both HTTP_PROXY and HTTPS_PROXY. With streaming=False, loading works fine.
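As a first sanity check, it can help to confirm that the proxy variables are actually visible inside the Python process that runs the training job. The sketch below is only an illustration: the helper names visible_proxies and missing_schemes are made up for this example, and it relies solely on urllib.request.getproxies from the standard library.

```python
import urllib.request


def visible_proxies():
    """Return the scheme -> proxy URL mapping that Python's standard library detects."""
    return urllib.request.getproxies()


def missing_schemes(proxies, required=("http", "https")):
    """List the required schemes that have no proxy configured (hypothetical helper)."""
    return [scheme for scheme in required if not proxies.get(scheme)]


if __name__ == "__main__":
    proxies = visible_proxies()
    for scheme, url in sorted(proxies.items()):
        print(f"{scheme}: {url}")
    missing = missing_schemes(proxies)
    if missing:
        print("No proxy detected for:", ", ".join(missing))
```

Seeing both schemes here only shows that the environment is set up. It does not guarantee that every HTTP client honors it: aiohttp, for example, reads these variables only when its session is created with trust_env=True.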
Steps to reproduce the bug
Call load_dataset with streaming=True on AIMOS.
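A minimal reproduction might look like the sketch below. The report does not name the dataset or split, so dataset_name and split are placeholders to fill in. The helper first_item_with_timeout is hypothetical (not part of datasets); it runs the load in a daemon thread so the script reports a timeout instead of hanging forever.

```python
import queue
import threading


def first_item_with_timeout(make_iterable, timeout_s=60.0):
    """Return the first item of make_iterable(), or raise TimeoutError after timeout_s seconds."""
    result = queue.Queue(maxsize=1)

    def worker():
        try:
            # Both building the iterable and fetching the first item happen here,
            # so a hang in either step is covered by the timeout.
            result.put(("ok", next(iter(make_iterable()))))
        except Exception as exc:  # hand failures back to the caller
            result.put(("error", exc))

    # Daemon thread: a stuck network call will not keep the process alive.
    threading.Thread(target=worker, daemon=True).start()
    try:
        status, value = result.get(timeout=timeout_s)
    except queue.Empty:
        raise TimeoutError(f"no item received within {timeout_s} s") from None
    if status == "error":
        raise value
    return value


def stream_first_example(dataset_name, split="train", timeout_s=60.0):
    """Try to stream one example; dataset_name and split are placeholders."""
    from datasets import load_dataset  # imported lazily so the helper above has no dependencies

    return first_item_with_timeout(
        lambda: load_dataset(dataset_name, split=split, streaming=True),
        timeout_s,
    )
```

Calling stream_first_example with the dataset in question should either return one example or raise TimeoutError, which makes the hang easier to report and to compare before and after any change.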
Expected behavior
Streaming does not hang indefinitely, and batches are loaded so the training run can start.
Environment info
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
_pytorch_select 2.0 cuda_2 https://ftp.osuosl.org/pub/open-ce/1.10.0
abseil-cpp 20220623.0 h9888cd1_6 conda-forge
absl-py 1.0.0 py311h399429b_0 https://ftp.osuosl.org/pub/open-ce/1.10.0
aiofiles 23.2.1 pyhd8ed1ab_0 conda-forge
aiohttp 3.8.6 py311hf118e41_0
aiosignal 1.2.0 pyhd3eb1b0_0
archspec 0.2.3 pyhd8ed1ab_0 conda-forge
arrow-cpp 11.0.0 ha3edaa6_5_cpu conda-forge
async-timeout 4.0.2 py311h6ffa863_0
attrs 23.1.0 py311h6ffa863_0
av 10.0.0 py311he6153ed_2 https://ftp.osuosl.org/pub/open-ce/1.10.0
aws-c-auth 0.6.24 hb81f6d7_5 conda-forge
aws-c-cal 0.5.20 h3c2b4d9_6 conda-forge
aws-c-common 0.8.11 h4194056_0 conda-forge
aws-c-compression 0.2.16 ha19333d_3 conda-forge
aws-c-event-stream 0.2.18 h12a9399_6 conda-forge
aws-c-http 0.7.4 ha2cde00_2 conda-forge
aws-c-io 0.13.17 h9189062_2 conda-forge
aws-c-mqtt 0.8.6 h40d1a04_6 conda-forge
aws-c-s3 0.2.4 hbdbe4f0_3 conda-forge
aws-c-sdkutils 0.1.7 ha19333d_3 conda-forge
aws-checksums 0.1.14 ha19333d_3 conda-forge
aws-crt-cpp 0.19.7 hd018011_7 conda-forge
aws-sdk-cpp 1.10.57 hb9575ba_4 conda-forge
blas 1.0 openblas
blinker 1.8.2 pyhd8ed1ab_0 conda-forge
boltons 23.0.0 py311h6ffa863_0
boost-cpp 1.82.0 h25e6d66_2
bottleneck 1.3.5 py311h34f6284_0
brotli 1.0.9 hf118e41_7
brotli-bin 1.0.9 hf118e41_7
brotli-python 1.0.9 py311h4a02239_7
bzip2 1.0.8 h7b6447c_0
c-ares 1.19.1 hf118e41_0
ca-certificates 2024.6.2 h0f6029e_0 conda-forge
cachetools 5.3.3 pyhd8ed1ab_0 conda-forge
certifi 2024.6.2 pyhd8ed1ab_0 conda-forge
cffi 1.15.1 py311hf118e41_3
charset-normalizer 2.0.4 pyhd3eb1b0_0
click 8.1.7 unix_pyh707e725_0 conda-forge
conda 24.5.0 py311h1af927a_0 conda-forge
conda-content-trust 0.2.0 py311h6ffa863_0
conda-libmamba-solver 23.11.1 py311h6ffa863_0
conda-package-handling 2.2.0 py311h6ffa863_0
conda-package-streaming 0.9.0 py311h6ffa863_0
contourpy 1.0.5 py311h25e6d66_0
cryptography 41.0.3 py311hb0e80e7_0
cudatoolkit 11.8.0 hedcfb66_13 conda-forge
cudnn 8.9.2_11.8 h9ceb136_1 https://ftp.osuosl.org/pub/open-ce/1.10.0
cycler 0.11.0 pyhd3eb1b0_0
datasets 2.12.0 py311h6ffa863_0
dill 0.3.6 py311h6ffa863_0
distro 1.9.0 pyhd8ed1ab_0 conda-forge
ffmpeg 4.2.2 opence_0 https://ftp.osuosl.org/pub/open-ce/1.10.0
filelock 3.9.0 py311h6ffa863_0
fmt 9.1.0 h25e6d66_0
fonttools 4.25.0 pyhd3eb1b0_0
freetype 2.12.1 hd23a775_0
frozendict 2.4.4 py311hb02d432_0 conda-forge
frozenlist 1.4.0 py311hf118e41_0
fsspec 2023.9.2 py311h6ffa863_0
gflags 2.2.2 he6710b0_0
giflib 5.2.1 hf118e41_3
glog 0.6.0 hbe088e0_0 conda-forge
gmp 6.3.0 h46f38da_0 conda-forge
gmpy2 2.1.5 py311h2758da7_1 conda-forge
google-auth 2.30.0 pyhff2d567_0 conda-forge
google-auth-oauthlib 0.5.3 pyhd8ed1ab_0 conda-forge
grpc-cpp 1.51.1 h8ba971d_1 conda-forge
grpcio 1.54.3 py311h414e0d3_0 https://ftp.osuosl.org/pub/open-ce/1.10.0
huggingface_hub 0.17.3 py311h6ffa863_0
icu 73.1 h4a02239_0
idna 3.4 py311h6ffa863_0
importlib-metadata 6.0.0 py311h6ffa863_0
jinja2 3.1.4 pyhd8ed1ab_0 conda-forge
jpeg 9e hf118e41_1
jsonpatch 1.32 pyhd3eb1b0_0
jsonpointer 2.1 pyhd3eb1b0_0
kiwisolver 1.4.4 py311h4a02239_0
krb5 1.20.1 hc019ccd_1
lame 3.100 hb283c62_1003 conda-forge
lcms2 2.12 h2045e0b_0
ld_impl_linux-ppc64le 2.38 hec883e6_1
lerc 3.0 h29c3540_0
leveldb 1.23 h24532b4_1 conda-forge
libabseil 20220623.0 cxx17_h9235812_6 conda-forge
libarchive 3.6.2 hd8ab008_2
libarrow 11.0.0 h837770b_5_cpu conda-forge
libboost 1.82.0 haf51a6a_2
libbrotlicommon 1.0.9 hf118e41_7
libbrotlidec 1.0.9 hf118e41_7
libbrotlienc 1.0.9 hf118e41_7
libcrc32c 1.1.2 h3b9df90_0 conda-forge
libcurl 8.4.0 h4d62439_0
libdeflate 1.17 hf118e41_1
libedit 3.1.20221030 hf118e41_0
libev 4.33 h140841e_1
libevent 2.1.10 h19c23f1_4 conda-forge
libexpat 2.6.2 h46f38da_0 conda-forge
libffi 3.4.4 h4a02239_0
libgcc-ng 13.2.0 h31e42bb_10 conda-forge
libgfortran-ng 11.2.0 hb3889a9_1
libgfortran5 11.2.0 h1234567_1
libgomp 13.2.0 h31e42bb_10 conda-forge
libgoogle-cloud 2.7.0 h11140b6_1 conda-forge
libgrpc 1.51.1 h4d29a31_1 conda-forge
libmamba 1.5.3 h7c6fafd_0
libmambapy 1.5.3 py311h828bf7b_0
libnghttp2 1.57.0 h44e5816_0
libnsl 2.0.1 ha17a0cc_0 conda-forge
libopenblas 0.3.23 hc5a31fb_2 https://ftp.osuosl.org/pub/open-ce/1.10.0
libopus 1.3.1 h4e0d66e_1 conda-forge
libpng 1.6.39 hf118e41_0
libprotobuf 3.21.12 h1776448_0 https://ftp.osuosl.org/pub/open-ce/1.10.0
libsolv 0.7.24 h0f529ac_0
libsqlite 3.45.3 hd4bbf49_0 conda-forge
libssh2 1.10.0 h50fa78f_2
libstdcxx-ng 13.2.0 h262982c_10 conda-forge
libthrift 0.18.0 h82f1162_0 conda-forge
libtiff 4.5.1 h4a02239_0
libutf8proc 2.8.0 hb283c62_0 conda-forge
libuuid 2.38.1 h4194056_0 conda-forge
libvpx 1.13.1 h46f38da_0 conda-forge
libwebp 1.3.2 h0f96ee2_0
libwebp-base 1.3.2 hf118e41_0
libxcrypt 4.4.36 ha17a0cc_1 conda-forge
libxml2 2.10.4 h18e3229_1
libzlib 1.2.13 h1f2b957_6 conda-forge
llvm-openmp 14.0.6 hc028133_0 https://ftp.osuosl.org/pub/open-ce/1.10.0
lmdb 0.9.31 ha17a0cc_1 conda-forge
lz4-c 1.9.4 h4a02239_0
markdown 3.4.4 pyhd8ed1ab_0 conda-forge
markupsafe 2.1.5 py311h32d8acf_0 conda-forge
matplotlib 3.8.0 py311h6ffa863_0
matplotlib-base 3.8.0 py311h52e1fcc_0
menuinst 2.1.1 py311h1af927a_0 conda-forge
mpc 1.3.1 heaf1863_0 conda-forge
mpfr 4.2.1 haad2271_1 conda-forge
mpmath 1.3.0 pyhd8ed1ab_0 conda-forge
multidict 6.0.2 py311hf118e41_0
multiprocess 0.70.14 py311h6ffa863_0
munkres 1.1.4 py_0
mypy_extensions 1.0.0 pyha770c72_0 conda-forge
nccl 2.18.3 cuda11.8_1 https://ftp.osuosl.org/pub/open-ce/1.10.0
ncurses 6.4 h4a02239_0
nest-asyncio 1.6.0 pyhd8ed1ab_0 conda-forge
networkx 2.8.8 pyhd8ed1ab_0 conda-forge
nomkl 3.0 0 https://ftp.osuosl.org/pub/open-ce/1.10.0
numactl 2.0.16 hba61f60_1 https://ftp.osuosl.org/pub/open-ce/1.10.0
numexpr 2.8.7 py311hc46fc55_0
numpy 1.24.3 py311h148a09e_0
numpy-base 1.24.3 py311h06b82f6_0
oauthlib 3.2.2 pyhd8ed1ab_0 conda-forge
openjpeg 2.4.0 hfe35807_0
openssl 3.3.1 h1f2b957_0 conda-forge
orc 1.8.2 h341c9a4_2 conda-forge
packaging 23.1 py311h6ffa863_0
pandas 2.1.1 py311h52e1fcc_0
pcre2 10.42 h280155c_0
pillow 10.0.1 py311he33076b_0
pip 23.3 py311h6ffa863_0
platformdirs 4.2.2 pyhd8ed1ab_0 conda-forge
pluggy 1.0.0 py311h6ffa863_1
pooch 1.8.2 pyhd8ed1ab_0 conda-forge
protobuf 4.21.12 py311ha7baec7_1 https://ftp.osuosl.org/pub/open-ce/1.10.0
psutil 5.9.8 py311hd26027c_0 conda-forge
pyarrow 11.0.0 py311h04a18d5_1
pyasn1 0.6.0 pyhd8ed1ab_0 conda-forge
pyasn1-modules 0.4.0 pyhd8ed1ab_0 conda-forge
pybind11-abi 4 hd3eb1b0_1
pycosat 0.6.6 py311hf118e41_0
pycparser 2.21 pyhd3eb1b0_0
pyjwt 2.8.0 pyhd8ed1ab_1 conda-forge
pyopenssl 23.2.0 py311h6ffa863_0
pyparsing 3.0.9 py311h6ffa863_0
pyre-extensions 0.0.30 pyhd8ed1ab_0 conda-forge
pysocks 1.7.1 py311h6ffa863_0
python 3.11.8 h3332dee_0_cpython conda-forge
python-dateutil 2.8.2 pyhd3eb1b0_0
python-tzdata 2023.3 pyhd3eb1b0_0
python-xxhash 2.0.2 py311hf118e41_1
python_abi 3.11 4_cp311 conda-forge
pytorch 2.0.1 cuda11.8_py311_1 https://ftp.osuosl.org/pub/open-ce/1.10.0
pytorch-base 2.0.1 cuda11.8_py311_pb4.21.12_4 https://ftp.osuosl.org/pub/open-ce/1.10.0
pytz 2023.3.post1 py311h6ffa863_0
pyu2f 0.1.5 pyhd8ed1ab_0 conda-forge
pyyaml 6.0.1 py311hf118e41_0
re2 2023.02.01 h883269e_0 conda-forge
readline 8.2 hf118e41_0
regex 2023.10.3 py311hf118e41_0
reproc 14.2.4 h29c3540_1
reproc-cpp 14.2.4 h29c3540_1
requests 2.31.0 py311h6ffa863_0
requests-oauthlib 2.0.0 pyhd8ed1ab_0 conda-forge
responses 0.13.3 pyhd3eb1b0_0
rsa 4.9 pyhd8ed1ab_0 conda-forge
ruamel.yaml 0.17.21 py311hf118e41_0
s2n 1.3.37 h5e47323_0 conda-forge
safetensors 0.4.0 py311hda16d9e_0
scipy 1.11.1 py311hd69e9bb_0 https://ftp.osuosl.org/pub/open-ce/1.10.0
sentencepiece 0.1.97 h1e74c73_py311_pb4.21.12_2 https://ftp.osuosl.org/pub/open-ce/1.10.0
setuptools 68.0.0 py311h6ffa863_0
six 1.16.0 pyhd3eb1b0_1
snappy 1.1.9 h29c3540_0
sqlite 3.41.2 hf118e41_0
sympy 1.12.1 pypyh2585a3b_103 conda-forge
tabulate 0.8.10 pyhd8ed1ab_0 conda-forge
tensorboard 2.13.0 pyhab0730d_pb4.21.12_1 https://ftp.osuosl.org/pub/open-ce/1.10.0
tensorboard-data-server 0.7.0 pyh6f84499_1 https://ftp.osuosl.org/pub/open-ce/1.10.0
tensorboard-plugin-wit 1.6.0 pyh9f0ad1d_0 conda-forge
tk 8.6.13 hd4bbf49_0 conda-forge
tokenizers 0.13.3 py311h3d4f45a_0
torchdata 0.6.0 py311_2 https://ftp.osuosl.org/pub/open-ce/1.10.0
torchsnapshot 0.1.0 pyhd8ed1ab_0 conda-forge
torchtext-base 0.15.2 cuda11.8_py311_1 https://ftp.osuosl.org/pub/open-ce/1.10.0
torchtnt 0.2.4 pyhd8ed1ab_0 conda-forge
torchvision-base 0.15.2 cuda11.8_py311_1 https://ftp.osuosl.org/pub/open-ce/1.10.0
tornado 6.3.3 py311hf118e41_0
tqdm 4.65.0 py311h7837921_0
transformers 4.32.1 py311h6ffa863_0
truststore 0.8.0 py311h6ffa863_0
typing-extensions 4.7.1 py311h6ffa863_0
typing_extensions 4.7.1 py311h6ffa863_0
typing_inspect 0.9.0 pyhd8ed1ab_0 conda-forge
tzdata 2023c h04d1e81_0
urllib3 1.26.18 py311h6ffa863_0
utf8proc 2.6.1 h140841e_0
werkzeug 2.3.8 pyhd8ed1ab_0 conda-forge
wheel 0.41.2 py311h6ffa863_0
xxhash 0.8.0 h140841e_3
xz 5.4.2 hf118e41_0
yaml 0.2.5 h7b6447c_0
yaml-cpp 0.8.0 h4a02239_0
yarl 1.8.1 py311hf118e41_0
zipp 3.11.0 py311h6ffa863_0
zlib 1.2.13 h1f2b957_6 conda-forge
zstandard 0.19.0 py311hf118e41_0
zstd 1.5.5 h57e4825_0
Hi! Can you try updating datasets and huggingface_hub?
pip install -U datasets huggingface_hub
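After upgrading, it may be worth confirming which versions the training environment actually imports, since a conda environment can still resolve to the older builds listed above (datasets 2.12.0, huggingface_hub 0.17.3). The following is a small sketch using only importlib.metadata from the standard library; the function name installed_versions and the default package list are choices made for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("datasets", "huggingface_hub", "fsspec", "aiohttp")):
    """Map each package name to its installed version, or None if it is not installed."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Running this with the same interpreter that launches the training job shows whether the upgrade took effect there.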