
Is it possible to convert other bert model in pytorch form or tensorflow form to joblib for cdQA using?

Open txz233 opened this issue 6 years ago • 10 comments

Is it possible to convert other bert model in pytorch form or tensorflow form to joblib for cdQA using?

txz233 avatar Dec 13 '19 07:12 txz233

Do you have a pytorch version of a BertForQuestionAnswering model (HuggingFace's module version)?

If you have one, yes, there is a way to do it. You would just need to do the following:

# Let's call your PyTorch BERT-for-QA model custom_qa_model
import torch
from cdqa.reader import BertQA

reader = BertQA()
reader.model = custom_qa_model

# The model must be sent to CPU in order to save it in the joblib format
reader.model.to('cpu')
reader.device = torch.device('cpu')

# Save it in the joblib format
import joblib
import os

joblib.dump(reader, os.path.join("path_to_directory", 'custom_qa_bert.joblib'))

andrelmfarias avatar Dec 18 '19 14:12 andrelmfarias

Thank you for your comment, @andrelmfarias! But I'm not really sure whether my model is in HuggingFace's module version. Can you provide me more details about 'HuggingFace's module version'?

txz233 avatar Dec 19 '19 05:12 txz233

Your model should be an instance of the class BertForQuestionAnswering, a class provided in HuggingFace's transformers library. It is a subclass of torch.nn.Module, i.e. it's a PyTorch model class.
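A quick way to verify this is to inspect the class hierarchy (a minimal sketch, assuming transformers and torch are installed):

```python
import torch
from transformers import BertForQuestionAnswering

# BertForQuestionAnswering ultimately inherits from torch.nn.Module
print(issubclass(BertForQuestionAnswering, torch.nn.Module))  # -> True

# For your own model object you would check, e.g.:
# isinstance(custom_qa_model, BertForQuestionAnswering)
```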

andrelmfarias avatar Dec 19 '19 09:12 andrelmfarias

Dear Andre,

Is there a way to directly convert a customized model posted in the Huggingface community? Similar to one of these: https://huggingface.co/mrm8488/t5-base-finetuned-squadv2

ZhiliWang avatar May 26 '20 16:05 ZhiliWang

Hi, I've tried to follow the steps, but it gives me an error saying that my custom_qa_model is not defined.

What exactly do I have to assign to retriever.model?

kriskott avatar Jun 23 '20 18:06 kriskott

@bs317

Instantiate a custom model as follows:

1) from transformers import AutoTokenizer, AutoModelForQuestionAnswering
2) tokenizer = AutoTokenizer.from_pretrained("clagator/biobert_squad2_cased")
3) model = AutoModelForQuestionAnswering.from_pretrained("clagator/biobert_squad2_cased")

Now this model is your custom_qa_model; follow the rest as described by @andrelmfarias.

VenkatGudala avatar Jun 30 '20 13:06 VenkatGudala

@andrelmfarias Dear Andre,

Shall I modify the cdQA code to my needs?

VenkatGudala avatar Jun 30 '20 13:06 VenkatGudala

But the model is supposed to be my own pytorch_model.bin file; I don't want to use the pre-trained clagator/biobert_squad2_cased model. So I need to find a way to assign my pytorch_model.bin file to model.

kriskott avatar Jun 30 '20 13:06 kriskott

@bs317 Did you find a solution for what you were looking for?

Ashwith-mmtxt avatar Oct 30 '20 06:10 Ashwith-mmtxt

@andrelmfarias Can T5 models (like the one mentioned by @ZhiliWang) be used with cdQA, or do we need to write a complete custom pipeline for those?

muhammadhaseebashraf avatar Apr 07 '22 11:04 muhammadhaseebashraf