evaluate Unable to use BLEURT in offline mode

Describe the bug

Trying to use BLEURT in offline mode fails. The script and model weights are cached to disk fine (when in online mode). In offline mode, it loads the script from the cache fine, but when trying to load the cached model weights, it throws an error.

It looks like the bug is somewhere in the get_from_cache function, as the error is thrown from here:

https://github.com/huggingface/datasets/blob/f96547708a889c09ca8a02ed7aadd8c5690503c5/src/datasets/utils/file_utils.py#L530

I know the metrics within datasets are deprecated. However, this exact error is thrown by evaluate as well.

Steps to reproduce the bug

Steps to reproduce the behaviour:

import os

# Set the flag before importing datasets, since it is read at import time.
os.environ["HF_DATASETS_OFFLINE"] = "1"

from datasets import load_metric

bleurt = load_metric("bleurt")

Gives the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/johnmg/mediqa/lib/python3.10/site-packages/datasets/utils/deprecation_utils.py", line 46, in wrapper
    return deprecated_function(*args, **kwargs)
  File "/home/johnmg/mediqa/lib/python3.10/site-packages/datasets/load.py", line 1397, in load_metric
    metric.download_and_prepare(download_config=download_config)
  File "/home/johnmg/mediqa/lib/python3.10/site-packages/datasets/metric.py", line 625, in download_and_prepare
    self._download_and_prepare(dl_manager)
  File "/home/johnmg/.cache/huggingface/modules/datasets_modules/metrics/bleurt/89f7c298fa543e9cee6749e6ed198069d7c10fc8e99c0ff37a843dbc0eea88d7/bleurt.py", line 117, in _download_and_prepare
    model_path = dl_manager.download_and_extract(CHECKPOINT_URLS[checkpoint_name])
  File "/home/johnmg/mediqa/lib/python3.10/site-packages/datasets/download/download_manager.py", line 564, in download_and_extract
    return self.extract(self.download(url_or_urls))
  File "/home/johnmg/mediqa/lib/python3.10/site-packages/datasets/download/download_manager.py", line 427, in download
    downloaded_path_or_paths = map_nested(
  File "/home/johnmg/mediqa/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 436, in map_nested
    return function(data_struct)
  File "/home/johnmg/mediqa/lib/python3.10/site-packages/datasets/download/download_manager.py", line 453, in _download
    return cached_path(url_or_filename, download_config=download_config)
  File "/home/johnmg/mediqa/lib/python3.10/site-packages/datasets/utils/file_utils.py", line 182, in cached_path
    output_path = get_from_cache(
  File "/home/johnmg/mediqa/lib/python3.10/site-packages/datasets/utils/file_utils.py", line 530, in get_from_cache
    _raise_if_offline_mode_is_enabled(f"Tried to reach {url}")
  File "/home/johnmg/mediqa/lib/python3.10/site-packages/datasets/utils/file_utils.py", line 260, in _raise_if_offline_mode_is_enabled
    raise OfflineModeIsEnabled(
datasets.utils.file_utils.OfflineModeIsEnabled: Offline mode is enabled. Tried to reach https://storage.googleapis.com/bleurt-oss/bleurt-base-128.zip

Expected behavior

I would expect that after loading the metric once with an internet connection, via bleurt = load_metric("bleurt"), it is cached locally and can then be loaded from that cache without an internet connection. I also tried manually specifying the cached model filepath like so:

bleurt = load_metric("bleurt", "/home/johnmg/.cache/huggingface/metrics/bleurt/default/downloads/extracted/4686726448df12b97ad0880ca1f80735f419854eb56f1878cf550dcbd717fb20/bleurt-base-128")

but this doesn't work either:

KeyError: "/home/johnmg/.cache/huggingface/metrics/bleurt/default/downloads/extracted/4686726448df12b97ad0880ca1f80735f419854eb56f1878cf550dcbd717fb20/bleurt-base-128 model not found. You should supply the name of a model checkpoint for bleurt in dict_keys(['bleurt-tiny-128', 'bleurt-tiny-512', 'bleurt-base-128', 'bleurt-base-512', 'bleurt-large-128', 'bleurt-large-512', 'BLEURT-20-D3', 'BLEURT-20-D6', 'BLEURT-20-D12', 'BLEURT-20'])"

as the metric loading scripts expect the model checkpoint to be one of:

https://github.com/huggingface/datasets/blob/f96547708a889c09ca8a02ed7aadd8c5690503c5/metrics/bleurt/bleurt.py#L64-L75
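To make the KeyError above concrete, here is a minimal sketch of the kind of name-to-URL lookup the loading script appears to do. It is a simplified stand-in and not the actual bleurt.py code: resolve_checkpoint is a hypothetical helper, the mapping holds only the one URL that appears in the traceback, and the "default" alias is an assumption based on the cache path shown earlier.

```python
# Simplified stand-in for the checkpoint table in the metric script.
# Only the URL seen in the traceback is listed; the real table has more entries.
CHECKPOINT_URLS = {
    "bleurt-base-128": "https://storage.googleapis.com/bleurt-oss/bleurt-base-128.zip",
}


def resolve_checkpoint(config_name: str) -> str:
    """Hypothetical helper: map a config name to a download URL."""
    # Assumption: the "default" config resolves to bleurt-base-128.
    if config_name == "default":
        config_name = "bleurt-base-128"
    if config_name not in CHECKPOINT_URLS:
        # A local filesystem path is not a key in the table, so it lands here.
        raise KeyError(
            f"{config_name} model not found. You should supply the name of a "
            f"model checkpoint for bleurt in {CHECKPOINT_URLS.keys()}"
        )
    return CHECKPOINT_URLS[config_name]


if __name__ == "__main__":
    print(resolve_checkpoint("default"))
    try:
        resolve_checkpoint("/path/to/extracted/bleurt-base-128")
    except KeyError as err:
        print("KeyError:", err)
```

Under this model the second argument is only ever treated as a checkpoint name, which would explain why a local path is rejected before any cache lookup happens.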

Environment info

I installed datasets from main with pip install git+https://github.com/huggingface/datasets.git

Mar 01 '23 15:03 JohnGiorgi

Hi! Metric-related issues should be posted in the evaluate repository - happy to help from there ;)

Mar 01 '23 17:03 lhoestq

Could you try passing download_config=DownloadConfig(use_etag=False) to datasets.load_metric() or evaluate.load()?

You might have this issue because it tried to reach the URL to get the file ETag used by the cache.
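As a rough illustration of why an ETag lookup can force a network call: if a URL-keyed cache folds the ETag into the cached filename, it needs the remote ETag before it can locate the file on disk. The sketch below assumes such a naming scheme. cache_filename is a hypothetical helper, not the datasets API, and the ETag value is made up.

```python
import hashlib
from typing import Optional


def cache_filename(url: str, etag: Optional[str] = None) -> str:
    """Hypothetical helper: derive a cache filename from a URL and optional ETag."""
    name = hashlib.sha256(url.encode("utf-8")).hexdigest()
    if etag:
        # With an ETag, the filename also depends on a value that only
        # the remote server can provide.
        name += "." + hashlib.sha256(etag.encode("utf-8")).hexdigest()
    return name


if __name__ == "__main__":
    url = "https://storage.googleapis.com/bleurt-oss/bleurt-base-128.zip"
    print(cache_filename(url))                   # computable offline
    print(cache_filename(url, etag='"abc123"'))  # needs the (made-up) remote ETag
```

With the ETag left out, the filename depends on the URL alone, so in this model a cached copy could be found without contacting the server.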

Mar 02 '23 13:03 lhoestq

No dice, it seems. I tried the following, but it hung and eventually failed in offline mode.

While online:

import evaluate
from datasets import DownloadConfig
from transformers.utils import is_offline_mode

assert not is_offline_mode()
bleurt = evaluate.load("bleurt", "BLEURT-20")

While offline:

import evaluate
from datasets import DownloadConfig
from transformers.utils import is_offline_mode

assert is_offline_mode()
bleurt = evaluate.load("bleurt", "BLEURT-20", download_config=DownloadConfig(use_etag=False))

Or, setting the offline flags from within the script (before the imports, since they are read at import time):

import os
os.environ["HF_DATASETS_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

import evaluate
from datasets import DownloadConfig
from transformers.utils import is_offline_mode

assert is_offline_mode()
bleurt = evaluate.load("bleurt", "BLEURT-20", download_config=DownloadConfig(use_etag=False))

Any help would be appreciated @lhoestq 😅

Mar 02 '23 14:03 JohnGiorgi

Same here.

May 15 '23 11:05 guzy0324

Same here.

Dec 24 '23 12:12 zwhe99