API breaks when defining custom elements in model
Describe the bug
When defining a custom element to use in a model, the API for serving the model breaks. For example, using this simple no-op transformer makes the API fail with an error message like AttributeError: Can't get attribute 'MyTransformer' on <module '__main__' from '...'>, even when loaded into an API like https://github.com/brooklynbagel/vetiver-reprex-custom-elements/blob/3ec27d180be0c6ec115af1554fbd2b8f830fa73b/attempt-2/api/app.py.
from sklearn.base import BaseEstimator, TransformerMixin

class MyTransformer(TransformerMixin, BaseEstimator):
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X
To Reproduce
Steps to reproduce the behavior:
- Define a custom element to use in a model
- Deploy said model
- Either deploy the API for the model or run it locally with uvicorn app:api
- See error
AttributeError: Can't get attribute 'MyTransformer' on <module '__main__' from '...'>
Expected behavior
The API should start up normally with no error.
Additional context
See reprex
It does work when forcing the transformer into module __main__:
import sys

from sklearn.base import BaseEstimator, TransformerMixin

class MyTransformer(TransformerMixin, BaseEstimator):
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X

# this fixes the `AttributeError`
setattr(sys.modules["__main__"], "MyTransformer", MyTransformer)
API deployed on dogfood: https://connect.posit.it/content/aaac8d80-fb22-48d0-98df-bf1683f91170
Thank you so much for this report! I believe the error you are running into comes from how you are pinning the model in combination with how you are deploying it.
pickle is very flexible (perhaps to a fault 😅) and rather than storing the source code of a class, it only remembers the import path where the class can be found. In your scenario, when you create and pin the model in the same file (say model.py) and run that file as a script, the path stored in the pickle is __main__.MyTransformer. When you import it into app.py to run the API itself, the path for MyTransformer would be model.MyTransformer, which does not match what was recorded, so pickle doesn't know what to do with it. A fix would be something like having one file, model.py, that builds the model and a second file, deploy.py, that loads the model, pins it, and deploys it. That way, it is always known where to find the model. Let me know if that helps!
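To make the "remembers the location" point concrete, here is a minimal sketch that uses only the standard library. The module name training_script is a hypothetical stand-in for the script that built the model, and the plain MyTransformer class stands in for the scikit-learn one, so treat this as an illustration of the mechanism rather than of what vetiver does internally.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the script in which the model was built.
training = types.ModuleType("training_script")
sys.modules["training_script"] = training

# Define the class inside that module, as if the script had been run.
exec(
    """
class MyTransformer:
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X
""",
    training.__dict__,
)

payload = pickle.dumps(training.MyTransformer())

# The pickle holds the module and class names, not the class body.
stores_names = b"training_script" in payload and b"MyTransformer" in payload
stores_source = b"return X" in payload

# Simulate the serving process: the module exists but lacks the class.
cls = training.__dict__.pop("MyTransformer")
try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = str(exc)

# Once the name resolves again, the same bytes load fine.
training.MyTransformer = cls
restored = pickle.loads(payload)
```

The failed load raises the same kind of AttributeError as in the report, and it goes away as soon as the recorded module exposes the class again, which is also why the setattr workaround on __main__ helps.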
There's probably room for a better error message here, or maybe some docs on why this is important. I'm open to hearing what you think is important to help clarify this to others!
That makes sense w.r.t. pickle just remembering the location of the module. I think some better documentation and error reporting would be helpful at a minimum.
It would be nice to have a smoother developer experience than having to define your model in a model.py that is separate from the .py, .ipynb or .qmd file you're working in. I'm wondering if it would be possible to 'trick' pickle (when deploying the model) into thinking custom modules are where the FastAPI app.py would expect them, or if this would create even more problems.
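For what it's worth, here is a rough sketch of what such a 'trick' could look like with plain pickle. The helper record_under and the module name custom_transformers are hypothetical names made up for this example; nothing here is part of vetiver, and the class is a plain stand-in for the scikit-learn transformer.

```python
import pickle
import sys
import types


class MyTransformer:
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


def record_under(cls, module_name):
    """Make pickle record `cls` as living in `module_name` (hypothetical helper)."""
    # pickle checks that module_name.<class name> really is this class,
    # so register a module of that name exposing it in this process.
    module = sys.modules.setdefault(module_name, types.ModuleType(module_name))
    setattr(module, cls.__name__, cls)
    cls.__module__ = module_name
    return cls


record_under(MyTransformer, "custom_transformers")

payload = pickle.dumps(MyTransformer())
recorded = b"custom_transformers" in payload

# Loads here because the stub module is registered in this process.
restored = pickle.loads(payload)
```

This only changes the path written into the pickle. The process running app.py would still need an importable custom_transformers module that defines MyTransformer, so the requirement moves rather than disappears.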