Enable more BERT models
I changed some values that were previously hard-coded but can be retrieved from config.json, which enables more models with the BertForMaskedLM architecture. I tested it with a few other models, e.g.
```shell
python convert.py --bert-model "unitary/toxic-bert" --mlx-model weights/toxic-bert.npz
python test.py --bert-model "unitary/toxic-bert" --mlx-model "weights/toxic-bert.npz" --text "Hello World!"
```
Tests pass :)
It should in theory also work with RoBERTa models, but I haven't quite gotten it to run yet.
Really nice! When you say RoBERTa is not working, is it giving bad output or just crashing?
Bad output. I suspect it has something to do with token-type IDs (see my attempted fix).
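A sketch of the suspected token-type issue (hypothetical helper, not the PR's code): RoBERTa configs ship `type_vocab_size == 1`, so the token-type embedding table has a single row and BERT-style segment IDs (0/1) index past it.

```python
import numpy as np

def token_type_embeddings(token_type_ids, table):
    # If the table has only one row (RoBERTa-style), clamp every
    # segment ID to 0; otherwise use the IDs as given (BERT-style).
    if table.shape[0] == 1:
        token_type_ids = np.zeros_like(token_type_ids)
    return table[token_type_ids]
```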
After reviewing the BERT and RoBERTa architectures, it seems we do need to tweak the model file a bit.
- BERT (google-bert/bert-base-uncased)
- RoBERTa (FacebookAI/roberta-base)
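One concrete difference between the two, sketched below under the assumption of standard Hugging Face checkpoints: RoBERTa offsets its position IDs by `padding_idx + 1 == 2`, while BERT numbers positions from 0. Reusing BERT-style positions with RoBERTa weights shifts every position embedding.

```python
import numpy as np

def position_ids(seq_len, offset):
    # BERT: offset == 0, positions 0..seq_len-1.
    # RoBERTa: offset == 2 (padding_idx + 1), so the position
    # embedding table is indexed starting at 2.
    return np.arange(offset, seq_len + offset)

bert_pos = position_ids(4, 0)     # [0, 1, 2, 3]
roberta_pos = position_ids(4, 2)  # [2, 3, 4, 5]
```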