Segmentation fault when converting bert-base-uncased with the README command
System Info
- transformers.js 7f5081da29c3f77ee830269ab801344776e61bcb
- Operating System macOS Sonoma 14.5
- MacBook M1
- Python 3.11.5
- transformers==4.33.2
- onnxruntime==1.15.1
- optimum==1.13.2
- onnx==1.13.1
Environment/Platform
- [ ] Website/web-app
- [X] Browser extension
- [ ] Server-side (e.g., Node.js, Deno, Bun)
- [ ] Desktop app (e.g., Electron)
- [ ] Other (e.g., VSCode extension)
Description
When I run the conversion script with the command `python -m scripts.convert --quantize --model_id bert-base-uncased`, I get a segmentation fault.
```
(default) ➜ transformers.js-main python -m scripts.convert --quantize --model_id bert-base-uncased
Intel MKL WARNING: Support of Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library 2025.0 will require Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
Intel MKL WARNING: Support of Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library 2025.0 will require Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
/Users/maxfrax/opt/anaconda3/envs/default/lib/python3.11/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
Framework not specified. Using pt to export to ONNX.
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Automatic task detection to fill-mask (possible synonyms are: masked-lm).
[1]    6178 segmentation fault  python -m scripts.convert --quantize --model_id bert-base-uncased
```
Do any of you have pointers to help pinpoint the root cause?
Reproduction
- Download the repository as a zip
- Extract the folder and move into it from the terminal
- Run the README command: `python -m scripts.convert --quantize --model_id bert-base-uncased`
Hi there 👋 Can you try in an environment like Google Colab? This might be an issue with your system/environment. Also, if you only want to use that model, you can use my pre-converted one here: https://huggingface.co/Xenova/bert-base-uncased
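To narrow down where the crash happens before switching machines, one option is to re-run the conversion with Python's built-in `faulthandler` enabled, which dumps the Python-level stack when a segfault occurs instead of exiting silently. A sketch (same command as in the issue, just with the flag added):

```shell
# Re-run the conversion with faulthandler enabled so a native crash
# prints the Python traceback of the thread that segfaulted.
python -X faulthandler -m scripts.convert --quantize --model_id bert-base-uncased

# Equivalent via environment variable:
PYTHONFAULTHANDLER=1 python -m scripts.convert --quantize --model_id bert-base-uncased
```

The dumped traceback usually shows whether the crash is inside PyTorch, ONNX export, or quantization, which helps decide where to file the bug.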
@xenova Thank you for sharing your pre-converted model. I was attempting the conversion myself to verify my environment.
I tested the conversion script on Azure using a Standard_NC4as_T4_v3 instance (4 cores, 28 GB RAM, 176 GB disk, T4 GPU), and it worked perfectly.
I should probably set up a fresh environment on my MacBook M1 to provide easy-to-reproduce steps, but given the machine's age, it might not be worth the effort.
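For reference, a fresh environment could look like the sketch below. This is only an outline: the environment name is arbitrary, the pinned versions are copied from the System Info section above, and the conversion script may require additional dependencies listed in the repository's requirements file.

```shell
# Hypothetical clean conda environment (name "tjs-convert" is made up).
conda create -n tjs-convert python=3.11 -y
conda activate tjs-convert

# Pin the same package versions reported in the System Info section.
pip install transformers==4.33.2 onnxruntime==1.15.1 optimum==1.13.2 onnx==1.13.1

# Re-run the README command from the repository root.
python -m scripts.convert --quantize --model_id bert-base-uncased
```

If the segfault disappears in the clean environment, the culprit is likely a native package (e.g. an MKL or PyTorch build) in the original Anaconda setup.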