mlx-examples

Enable more BERT models

Open yzimmermann opened this issue 1 year ago • 3 comments

I replaced several values that were previously hard-coded with ones read from config.json, which enables more models with the BertForMaskedLM architecture. I tested it with some other models, e.g.

python convert.py --bert-model "unitary/toxic-bert" --mlx-model weights/toxic-bert.npz
python test.py --bert-model "unitary/toxic-bert" --mlx-model "weights/toxic-bert.npz" --text "Hello World!"

Tests pass :)
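As a rough illustration of the idea (a sketch, not the code from this change): the model hyperparameters can be read from the checkpoint's config.json instead of being fixed in the model file. The `BertConfig` dataclass and the `config_from_dict` and `load_config` helpers below are hypothetical names; the field names are the standard Hugging Face config.json keys, and the defaults are the bert-base values.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class BertConfig:
    # Hypothetical container; defaults match bert-base-uncased.
    vocab_size: int = 30522
    hidden_size: int = 768
    num_hidden_layers: int = 12
    num_attention_heads: int = 12
    intermediate_size: int = 3072
    max_position_embeddings: int = 512
    type_vocab_size: int = 2


def config_from_dict(raw: dict) -> BertConfig:
    # Keep only the keys the dataclass knows about; config.json
    # carries many other entries (model_type, architectures, ...).
    known = {f.name for f in fields(BertConfig)}
    return BertConfig(**{k: v for k, v in raw.items() if k in known})


def load_config(path: str) -> BertConfig:
    # Read a Hugging Face style config.json from disk.
    with open(path) as f:
        return config_from_dict(json.load(f))
```

With something like this, the embedding and encoder sizes can be taken from `config.vocab_size`, `config.hidden_size`, and so on, rather than from constants that only fit one checkpoint.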

It should in theory also work with RoBERTa models, but I haven't quite gotten it to run yet.

yzimmermann avatar Mar 14 '24 13:03 yzimmermann

Really nice! When you say RoBERTa is not working, is it giving bad output or just crashing?

awni avatar Mar 14 '24 13:03 awni

> Really nice! When you say RoBERTa is not working, is it giving bad output or just crashing?

Bad output. I suspect it has something to do with token-type IDs (see my attempted fix).

yzimmermann avatar Mar 14 '24 14:03 yzimmermann

After reviewing the BERT and RoBERTa architectures, it looks like we do need to tweak the model file a bit.

[image: BERT (google-bert/bert-base-uncased)]

[image: RoBERTa (FacebookAI/roberta-base)]
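One embedding-level difference that may be relevant, going by the Hugging Face reference implementation rather than anything confirmed in this thread: RoBERTa uses a single token type (`type_vocab_size` is 1 in roberta-base, so all token-type IDs are 0), and its position IDs are offset by the padding index instead of starting at 0. The sketch below only illustrates that position-ID convention in plain Python; the function names are hypothetical and `padding_idx=1` is assumed from the roberta-base config.

```python
def bert_position_ids(input_ids):
    # BERT: positions simply count from 0.
    return list(range(len(input_ids)))


def roberta_position_ids(input_ids, padding_idx=1):
    # RoBERTa (Hugging Face convention): non-padding tokens are
    # numbered from padding_idx + 1; padding tokens keep padding_idx.
    position_ids = []
    count = 0
    for tok in input_ids:
        if tok == padding_idx:
            position_ids.append(padding_idx)
        else:
            count += 1
            position_ids.append(count + padding_idx)
    return position_ids


# Token IDs here are placeholders; only the padding ID (1) matters.
ids = [0, 31414, 2, 1, 1]
print(bert_position_ids(ids))     # [0, 1, 2, 3, 4]
print(roberta_position_ids(ids))  # [2, 3, 4, 1, 1]
```

If the model file indexes the position embedding table from 0 for a RoBERTa checkpoint, every token would pick up the wrong position vector, which would fit the "bad output, no crash" symptom. That is a guess, not a diagnosis.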

yzimmermann avatar Mar 14 '24 22:03 yzimmermann