Add example model package

Open Nic-Ma opened this issue 4 years ago • 2 comments

Is your feature request related to a problem? Please describe. Sub-task of ticket https://github.com/Project-MONAI/MONAI/issues/3482: create an example of a model package.

Nic-Ma avatar Dec 16 '21 01:12 Nic-Ma

Hi, in a MONAI Deploy WG meeting last week @ericspod mentioned that it might be useful to comment here on experiences with other app packaging frameworks. I just wanted to share my experience with the MLflow python_function "flavor". Unfortunately, I'm not able to share all the source code at this time, but I'll try to describe it as best I can with a short excerpt. The model itself is an NLP application for classifying endoscopy reports, based on a pre-trained BERT model from the transformers library.

The basis of the package is this wrapper class, which just requires an __init__ and a predict method. At the end of my training script, an instance of this class is created using an input model (in this case a LightningModule) and the tokenizer. This wrapped model object is then logged to MLflow, where it can be deployed through MLflow's model serving framework.

"""
mlflow python_model wrapper class for Barretts model
"""
import mlflow
import pandas as pd
from torch import topk
from project.BarrettsDataset import BarrettsDataset
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


class BarrettsWrapper(mlflow.pyfunc.PythonModel):

    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer
        self.dataset = BarrettsDataset(tokenizer=tokenizer)

    def predict(self, context, model_input):
        logger.info('Running prediction service ')
        encoding = self.dataset.encoder(model_input.diag_final.tolist())

        _, test_prediction = self.model(encoding["input_ids"], encoding["attention_mask"])
        res = topk(test_prediction, 1).indices.tolist()

        confidences = pd.DataFrame(test_prediction.tolist(), columns=self.dataset.label_columns)
        prediction = pd.DataFrame({'Prediction': [self.dataset.label_columns[x[0]] for x in res]})

        results = pd.concat([prediction, confidences], axis=1)
        return results

# *** training code goes here, "model" is trained LightningModule, test_df is example input dataframe ***

wrappedModel = BarrettsWrapper(model, tokenizer)
signature = mlflow.models.signature.infer_signature(test_df, wrappedModel.predict(None, test_df))
mlflow.pyfunc.log_model('barretts_nlp', python_model=wrappedModel, signature=signature, code_path=['.'])

Some things I think might be useful to mention:

  • From the attached PR it looks like TorchScript forms the basis of the packaged applications. I wasn't able to package this particular application with TorchScript, using either tracing or scripting, due to compatibility issues with the transformers library. I know this is an NLP model and so not necessarily within MONAI's remit, but it might be worth mentioning anyway, as there may be other package incompatibilities out there and the TorchScript logs were difficult to debug.
  • One helpful feature was that when the model is logged to MLflow, it is logged with a signature that describes the input and output schema. This is collected automatically by running the [infer_signature function](https://www.mlflow.org/docs/latest/python_api/mlflow.models.html#mlflow.models.infer_signature) on example inputs and outputs. I think something like this, to partially automate or check the config .json files in #487, would be useful.
  • Having a customisable wrapper class was useful, since it abstracted away a lot of complexity but also left me free to customise the predict method. This meant I could add extra functionality on top of just calling the model, e.g. in this case I could add a few extra lines to put the output of the model into a dataframe for returning to the user.
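To make the signature idea in the second point a bit more concrete, here is a rough, standard-library-only sketch of inferring a schema from example inputs and outputs and comparing it against a JSON config. Everything in it is an assumption for illustration: the helper names (describe, infer_schema, check_config), the config layout with "inputs" and "outputs" keys, the output column names, and the example record are all made up, and it is neither MLflow's infer_signature nor the config format used in #487.

```python
import json


def describe(value):
    """Return a small type name for one example value (hypothetical helper)."""
    if isinstance(value, bool):  # bool is checked first because it subclasses int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported example type: {type(value).__name__}")


def infer_schema(example_rows):
    """Infer {column: type name} from a list of example records (dicts)."""
    schema = {}
    for row in example_rows:
        for name, value in row.items():
            kind = describe(value)
            # Reject columns whose type changes between example rows.
            if schema.setdefault(name, kind) != kind:
                raise ValueError(f"column {name!r} mixes {schema[name]} and {kind}")
    return schema


def check_config(config_text, example_inputs, example_outputs):
    """Return the config sections that disagree with the inferred schemas."""
    config = json.loads(config_text)
    inferred = {
        "inputs": infer_schema(example_inputs),
        "outputs": infer_schema(example_outputs),
    }
    return [key for key in ("inputs", "outputs") if config.get(key) != inferred[key]]


# Made-up example data; the config deliberately declares "positive" with the wrong type.
example_inputs = [{"diag_final": "example report text"}]
example_outputs = [{"label": "negative", "negative": 0.9, "positive": 0.1}]
config_text = json.dumps({
    "inputs": {"diag_final": "string"},
    "outputs": {"label": "string", "negative": "float", "positive": "string"},
})

mismatches = check_config(config_text, example_inputs, example_outputs)
print(mismatches)
```

A check like this could run at the end of training, so that a config file that has drifted away from what the model actually consumes and returns is caught before packaging.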

I hope some of this rambling is useful. These are just a few things that come to mind, and I'm very happy to discuss if you have any questions!

laurencejackson avatar Jan 21 '22 17:01 laurencejackson

Hi @laurencejackson ,

Thanks so much for the detailed write-up and feedback! About the things you mentioned:

  1. Yes, TorchScript does have some compatibility issues with the latest PyTorch APIs. We also have transformer-based networks, such as UNETR, and @ahatamiz is working on TorchScript support for it. We aim to support TorchScript for all MONAI networks (though perhaps not for every combination of network parameters).
  2. Exactly, it's the same idea as the verification schema in our proposal. I put it into task 5 and task 6 of https://github.com/Project-MONAI/MONAI/issues/3482 and have already implemented a network verification example in: https://github.com/Project-MONAI/tutorials/pull/487.
  3. We plan to support hybrid programming in the model package, so you can easily write your wrapper logic in the Python script layer. Or, if you want to use a MONAI trainer or evaluator directly, you can refer to: https://github.com/Project-MONAI/MONAI/blob/dev/monai/engines/evaluator.py#L143.
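On the first point, a quick way to find out whether a given network is TorchScript-compatible is to attempt the export and record which method worked. The following is only a minimal sketch under stated assumptions: TinyNet is a hypothetical stand-in rather than a MONAI network, and try_export is an illustrative helper name, not part of MONAI or PyTorch. It uses only torch.jit.script and torch.jit.trace.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in network; not a MONAI model."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.fc(x))


def try_export(model: nn.Module, example: torch.Tensor):
    """Try scripting first, then tracing.

    Returns (exported_module, method_name) on success,
    or (None, error_message) if both attempts fail.
    """
    model.eval()
    try:
        return torch.jit.script(model), "script"
    except Exception as script_err:
        try:
            return torch.jit.trace(model, example), "trace"
        except Exception as trace_err:
            return None, f"script: {script_err}; trace: {trace_err}"


example = torch.zeros(1, 4)
net = TinyNet()
exported, method = try_export(net, example)

# Confirm the exported module reproduces the eager output on the example input.
same_output = exported is not None and torch.allclose(exported(example), net(example))
print(method, same_output)
```

Running a check like this over each network and parameter combination would give a simple compatibility report, and the captured error messages are often easier to read than the full TorchScript logs.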

Thanks.

Nic-Ma avatar Jan 25 '22 09:01 Nic-Ma