Unable to load model in offline mode
Hello
I am unable to load a model in offline mode (i.e., from a local directory). Surprisingly, this works for the model urchade/gliner_multi but not for the model urchade/gliner_multi-v2.1. Other models have not been tested.
Error
The following error occurs:
Traceback (most recent call last):
  File "/home/users/apeuvot/GliNER/evaluate.py", line 105, in <module>
    model = load_model(options.model_path)
  File "/home/users/apeuvot/GliNER/evaluate.py", line 13, in load_model
    model = GLiNER.from_pretrained(path, local_files_only=True)
  File "/home/users/apeuvot/miniconda3/envs/gliner/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/users/apeuvot/miniconda3/envs/gliner/lib/python3.9/site-packages/huggingface_hub/hub_mixin.py", line 420, in from_pretrained
    instance = cls._from_pretrained(
  File "/home/users/apeuvot/miniconda3/envs/gliner/lib/python3.9/site-packages/gliner/model.py", line 409, in _from_pretrained
    gliner = cls(config, tokenizer=tokenizer, encoder_from_pretrained=False,
  File "/home/users/apeuvot/miniconda3/envs/gliner/lib/python3.9/site-packages/gliner/model.py", line 38, in __init__
    tokenizer = AutoTokenizer.from_pretrained(config.model_name,
  File "/home/users/apeuvot/miniconda3/envs/gliner/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 794, in from_pretrained
    config = AutoConfig.from_pretrained(
  File "/home/users/apeuvot/miniconda3/envs/gliner/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 1138, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/users/apeuvot/miniconda3/envs/gliner/lib/python3.9/site-packages/transformers/configuration_utils.py", line 631, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/users/apeuvot/miniconda3/envs/gliner/lib/python3.9/site-packages/transformers/configuration_utils.py", line 686, in _get_config_dict
    resolved_config_file = cached_file(
  File "/home/users/apeuvot/miniconda3/envs/gliner/lib/python3.9/site-packages/transformers/utils/hub.py", line 441, in cached_file
    raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like microsoft/mdeberta-v3-base is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
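The traceback shows the failure happening when the tokenizer is loaded from config.model_name, which here is the hub id microsoft/mdeberta-v3-base rather than a local path. A minimal diagnostic sketch for checking this on a downloaded model follows. The helper name encoder_reference is hypothetical, and the config file name gliner_config.json is an assumption about how the model directory is laid out; adjust it if your copy differs.

```python
import json
import os


def encoder_reference(model_dir: str, config_name: str = "gliner_config.json") -> dict:
    """Report which encoder the GLiNER config in `model_dir` points to.

    `config_name` is an assumption about the on-disk layout, not a
    guaranteed part of the GLiNER API.
    """
    config_path = os.path.join(model_dir, config_name)
    with open(config_path, encoding="utf-8") as fh:
        model_name = json.load(fh)["model_name"]
    return {
        "model_name": model_name,
        # True only if model_name is an existing directory on this machine.
        "is_local_dir": os.path.isdir(model_name),
    }


# Usage (path is a placeholder):
# print(encoder_reference("/path/to/local/gliner_multi-v2.1"))
```

If is_local_dir is False, offline loading can still succeed when the encoder is already present in the Hugging Face cache, so treat the result as a hint rather than a verdict.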
Temporary Fix
Below is a modified version of gliner/model.py that makes offline loading work. The local_files_only argument is not propagated to every function involved in loading the model. This fix is not ideal, because the behaviour differs between models (config.model_name has to be rewritten for urchade/gliner_multi-v2.1 but not for urchade/gliner_multi), but it works around the issue for now.
Modified Code
The following code shows the modifications:
  class GLiNER(nn.Module, PyTorchModelHubMixin):
      def __init__(self, config: GLiNERConfig,
                   model: Optional[Union[BaseModel, BaseORTModel]] = None,
                   tokenizer: Optional[Union[str, AutoTokenizer]] = None,
                   words_splitter: Optional[Union[str, WordsSplitter]] = None,
                   data_processor: Optional[Union[SpanProcessor, TokenProcessor]] = None,
                   encoder_from_pretrained: bool = True,
+                  local_files_only: bool = False
                   ):
          super().__init__()
          self.config = config
          if tokenizer is None and data_processor is None:
-             tokenizer = AutoTokenizer.from_pretrained(config.model_name)
+             tokenizer = AutoTokenizer.from_pretrained(config.model_name, local_files_only=local_files_only)
          # Existing code...
      @classmethod
      def _from_pretrained(
              cls,
              *,
              model_id: str,
              revision: Optional[str],
              cache_dir: Optional[Union[str, Path]],
              force_download: bool,
              proxies: Optional[Dict],
              resume_download: bool,
              local_files_only: bool,
              token: Union[str, bool, None],
              map_location: str = "cpu",
              strict: bool = False,
              load_tokenizer: Optional[bool]=False,
              resize_token_embeddings: Optional[bool]=True,
              load_onnx_model: Optional[bool]=False,
              onnx_model_file: Optional[str] = 'model.onnx',
              compile_torch_model: Optional[bool] = False,
              **model_kwargs,
      ):
          # Existing code...
          if load_tokenizer:
-             tokenizer = AutoTokenizer.from_pretrained(model_dir)
+             tokenizer = AutoTokenizer.from_pretrained(model_dir, local_files_only=local_files_only)
          else:
              tokenizer = None
          config_ = json.load(open(config_file))
          config = GLiNERConfig(**config_)
+         if local_files_only and config.model_name in ["microsoft/mdeberta-v3-base", "microsoft/deberta-v3-large"]: # for urchade/gliner_multi, it is already the local path
+             config.model_name = os.path.dirname(os.path.dirname(model_dir)) + "/" + config.model_name
          add_tokens = ['[FLERT]', config.ent_token, config.sep_token]
          if not load_onnx_model:
-             gliner = cls(config, tokenizer=tokenizer, encoder_from_pretrained=False,
+             gliner = cls(config, tokenizer=tokenizer, encoder_from_pretrained=False, local_files_only=local_files_only)
          # Existing code...
          return gliner
Please consider addressing this issue in the next release to ensure better support for offline model loading.
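To make the directory layout that the patch relies on explicit, here is a small standalone sketch of the config.model_name rewrite. The function name remap_encoder and the /data/models/... paths are hypothetical and only serve as an illustration; the logic mirrors the two added lines in _from_pretrained.

```python
import os

# Hub ids that the patch treats as "not yet a local path".
HUB_ENCODERS = ["microsoft/mdeberta-v3-base", "microsoft/deberta-v3-large"]


def remap_encoder(model_dir: str, model_name: str, local_files_only: bool) -> str:
    """Rewrite a hub encoder id into a path two levels above `model_dir`."""
    if local_files_only and model_name in HUB_ENCODERS:
        return os.path.dirname(os.path.dirname(model_dir)) + "/" + model_name
    # Anything else (for example an already-local path) is left untouched.
    return model_name


print(remap_encoder("/data/models/urchade/gliner_multi-v2.1",
                    "microsoft/mdeberta-v3-base", True))
# → /data/models/microsoft/mdeberta-v3-base
```

In other words, the patch expects the encoder to sit next to the GLiNER model under a shared root, in <root>/microsoft/mdeberta-v3-base. Note that a trailing slash on model_dir changes the result, since os.path.dirname then only strips the empty last component.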
Hi @apeuvotepf, we will definitely consider this for the next releases. Thank you for pointing out the importance of offline mode and for your proposed temporary fix. To make it more general, we need to take more aspects of the current implementation into account, but it should land in the next release.
xref #108
Any update on this?
I love GLiNER but if I can't use it offline I can't use it at all :(
@baughmann The workaround in #108 should still work. Download both gliner and deberta and edit the gliner config to point to your local deberta model.
You're 100% right it does, and I've whipped up a script to automate it. Thank you for posting it, it's been quite helpful.
That said, it's still not the right answer, as we all know. I was hoping to see where this is on the radar because it was reported quite some time ago.
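For readers looking for the workaround itself (download both GLiNER and DeBERTa, then edit the GLiNER config to point at the local DeBERTa copy), a rough sketch could look like the following. This is not the script mentioned above. The function names are hypothetical, and the config file name gliner_config.json and its model_name key are assumptions based on the config.model_name attribute seen in the traceback.

```python
import json
from pathlib import Path


def download(repo_id: str, target_dir: str) -> str:
    """Download a full repo snapshot into `target_dir` (needs network once)."""
    # Imported lazily so the config helper below works without huggingface_hub.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, local_dir=target_dir)


def point_to_local_encoder(gliner_dir: str, encoder_dir: str,
                           config_name: str = "gliner_config.json") -> str:
    """Rewrite `model_name` in the GLiNER config to a local encoder path."""
    config_path = Path(gliner_dir) / config_name
    config = json.loads(config_path.read_text(encoding="utf-8"))
    config["model_name"] = str(Path(encoder_dir).resolve())
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config["model_name"]


# Usage while online (directories are placeholders):
# download("urchade/gliner_multi-v2.1", "models/gliner_multi-v2.1")
# download("microsoft/mdeberta-v3-base", "models/mdeberta-v3-base")
# point_to_local_encoder("models/gliner_multi-v2.1", "models/mdeberta-v3-base")
```

Storing an absolute path keeps the config valid regardless of the working directory, at the cost of having to rerun the helper if the models are moved.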
I do not think you need this workaround anymore with more recent gliner versions.
You just need to load the model online once, then use save_pretrained to write the weights to a local directory and load from there.
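A minimal sketch of that suggestion, assuming a recent gliner release in which a directory written by save_pretrained can be loaded back without network access (not verified here for every version). The function names and the target directory are placeholders.

```python
from pathlib import Path


def export_for_offline(model_id: str, target_dir: str) -> Path:
    """Run once on a machine with network access."""
    from gliner import GLiNER  # imported lazily; requires the gliner package

    model = GLiNER.from_pretrained(model_id)
    target = Path(target_dir)
    model.save_pretrained(target)
    return target


def load_offline(target_dir):
    """Run on the offline machine, pointing at the exported directory."""
    from gliner import GLiNER

    return GLiNER.from_pretrained(target_dir, local_files_only=True)


# Usage (directory name is a placeholder):
# export_for_offline("urchade/gliner_multi-v2.1", "gliner_multi-v2.1-local")
# model = load_offline("gliner_multi-v2.1-local")
```

If the offline load still tries to reach the hub, the older workaround of pointing the config at a local encoder remains available.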