mlx-examples Enable more BERT models

I replaced several previously hard-coded values with ones read from config.json, which enables more models with the BertForMaskedLM architecture. I tested it with some other models, e.g.

python convert.py --bert-model "unitary/toxic-bert" --mlx-model weights/toxic-bert.npz
python test.py --bert-model "unitary/toxic-bert" --mlx-model "weights/toxic-bert.npz" --text "Hello World!"

Tests pass :)
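For readers who want a picture of what "retrieved from config.json" can look like, here is a minimal sketch. It is not the code from this PR: the `ModelArgs` class and its `from_config` helper are hypothetical names used only for illustration, and the inline JSON stands in for a downloaded config.json. The key names are the ones Hugging Face BERT configs use.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class ModelArgs:
    # Hypothetical container; field names mirror keys in a Hugging Face BERT config.json.
    hidden_size: int
    num_hidden_layers: int
    num_attention_heads: int
    intermediate_size: int
    vocab_size: int
    max_position_embeddings: int = 512
    type_vocab_size: int = 2
    layer_norm_eps: float = 1e-12

    @classmethod
    def from_config(cls, config: dict) -> "ModelArgs":
        # Keep only the keys this dataclass declares and ignore everything else.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in config.items() if k in known})


# Stand-in for the text of a config.json file (BERT-base-sized values).
config_text = """
{
  "architectures": ["BertForMaskedLM"],
  "model_type": "bert",
  "hidden_size": 768,
  "num_hidden_layers": 12,
  "num_attention_heads": 12,
  "intermediate_size": 3072,
  "vocab_size": 30522,
  "max_position_embeddings": 512,
  "type_vocab_size": 2,
  "layer_norm_eps": 1e-12
}
"""

args = ModelArgs.from_config(json.loads(config_text))
print(args)
```

Reading the sizes this way means a checkpoint with a different vocabulary or hidden size needs no code change, only a different config.json.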

It should in theory also work with RoBERTa models, but I haven't quite gotten it to run yet.

Mar 14 '24 13:03 yzimmermann

Really nice! When you say RoBERTa is not working, is it giving bad output or just crashing?

Mar 14 '24 13:03 awni

Bad output. I suspect it has something to do with token-type IDs (see my attempted fix).

Mar 14 '24 14:03 yzimmermann

Upon reviewing the BERT and RoBERTa architectures, it seems we do need to tweak the model file a bit.
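As a rough illustration of the kind of tweak involved, here is a sketch of two differences that exist in the Hugging Face implementations: roberta-base has a single token-type embedding (type_vocab_size of 1, versus 2 for bert-base-uncased), and RoBERTa numbers positions starting at padding_idx + 1 instead of 0. Whether either one explains the bad output in this PR is not established here. The function names are hypothetical and the token IDs are illustrative.

```python
def roberta_position_ids(input_ids, padding_idx=1):
    """Position IDs in the style of Hugging Face's RoBERTa embeddings.

    Non-padding tokens are numbered from padding_idx + 1 onward;
    padding tokens keep padding_idx as their position.
    """
    position_ids = []
    count = 0
    for token in input_ids:
        if token == padding_idx:
            position_ids.append(padding_idx)
        else:
            count += 1
            position_ids.append(count + padding_idx)
    return position_ids


def bert_position_ids(input_ids):
    # BERT simply counts positions from 0.
    return list(range(len(input_ids)))


def valid_token_type_ids(token_type_ids, type_vocab_size):
    # With a single segment embedding the only valid token-type ID is 0,
    # so any other value would index past the embedding table.
    if type_vocab_size == 1:
        return [0] * len(token_type_ids)
    return list(token_type_ids)


# Illustrative RoBERTa-style input: <s>=0, </s>=2, <pad>=1.
ids = [0, 31414, 232, 2, 1, 1]
print(roberta_position_ids(ids))
print(bert_position_ids(ids))
print(valid_token_type_ids([0, 0, 1, 1, 0, 0], type_vocab_size=1))
```

The position offset is also why roberta-base reports max_position_embeddings of 514 rather than 512 in its config.json.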

BERT (google-bert/bert-base-uncased)

RoBERTa (FacebookAI/roberta-base)

Mar 14 '24 22:03 yzimmermann