MLflow support for the Silverkite algorithm
Hello, I am trying to log the sklearn pipeline stored in the model attribute to MLflow on Databricks, using the Silverkite template. See the code snippet below. When attempting to log the sklearn pipeline held in the model attribute of the "greykite.framework.pipeline.pipeline.ForecastResult" object, I receive this error message: NotImplementedError: Sorry, pickling not yet supported. See https://github.com/pydata/patsy/issues/26 if you want to help.
Any idea how I can log this model to MLflow when using the Silverkite template?
from collections import defaultdict
import warnings
warnings.filterwarnings("ignore")
import pandas as pd
import plotly
from greykite.common.data_loader import DataLoader
from greykite.framework.templates.autogen.forecast_config import ForecastConfig
from greykite.framework.templates.autogen.forecast_config import MetadataParam
from greykite.framework.templates.forecaster import Forecaster
from greykite.framework.templates.model_templates import ModelTemplateEnum
from greykite.framework.utils.result_summary import summarize_grid_search_results
import mlflow
# Loads dataset into pandas DataFrame
dl = DataLoader()
df = dl.load_peyton_manning()
# specify dataset information
metadata = MetadataParam(
    time_col="ts",  # name of the time column ("date" in example above)
    value_col="y",  # name of the value column ("sessions" in example above)
    freq="D"  # "H" for hourly, "D" for daily, "W" for weekly, etc.
              # Any format accepted by `pandas.date_range`
)
forecaster = Forecaster() # Creates forecasts and stores the result
result = forecaster.run_forecast_config(  # result is also stored as `forecaster.forecast_result`.
    df=df,
    config=ForecastConfig(
        model_template=ModelTemplateEnum.SILVERKITE.name,
        forecast_horizon=365,  # forecasts 365 steps ahead
        coverage=0.95,  # 95% prediction intervals
        metadata_param=metadata
    )
)
mlflow.sklearn.log_model(result.model, "model")
I'd like to add that if I change the model_template from ModelTemplateEnum.SILVERKITE.name to ModelTemplateEnum.PROPHET.name, the code works and I am able to log the model and read it back without problems.
Any advice on how to use MLflow with the Silverkite template?
This is related to #73.
Long story short, the Silverkite template is tricky to serialize because it uses patsy internally, and patsy objects do not support pickling. Prophet models, on the other hand, are picklable, which is why the same code works with the Prophet template.
To log model artifacts to MLflow when using the Silverkite template, you can dump them to a local directory (using forecaster.dump_forecast_result), and then log that whole directory to MLflow via mlflow.log_artifact.
Beware that forecaster.dump_forecast_result, as far as I know, does not work on Windows.
More info on model storing and loading is available here.
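The workaround described above could look roughly like the sketch below. It is not an official Greykite or MLflow recipe. The helper name log_forecast_result, its log_fn parameter, and the artifact path "silverkite_forecast" are made up for illustration. It assumes forecaster.dump_forecast_result accepts the destination directory as its first argument and that this directory must not exist yet, and it uses mlflow.log_artifacts (the directory-oriented call) in place of mlflow.log_artifact.

```python
import os
import tempfile


def log_forecast_result(forecaster, artifact_path="silverkite_forecast", log_fn=None):
    """Dump a Greykite forecast result to disk and log the directory to MLflow.

    `forecaster` is expected to be a greykite Forecaster on which
    `run_forecast_config` has already been called.
    `log_fn` is a hypothetical hook: any callable with the signature
    (local_dir, artifact_path=...). It defaults to `mlflow.log_artifacts`.
    """
    if log_fn is None:
        # Imported lazily so the helper itself does not require MLflow at import time.
        import mlflow
        log_fn = mlflow.log_artifacts

    with tempfile.TemporaryDirectory() as tmp_dir:
        # Use a sub-directory that does not exist yet (assumption: the dump
        # call creates it and may refuse to overwrite an existing directory).
        dump_dir = os.path.join(tmp_dir, "forecast_result")
        forecaster.dump_forecast_result(dump_dir)
        # Upload everything the dump produced while the directory still exists.
        log_fn(dump_dir, artifact_path=artifact_path)
    return artifact_path
```

In the script from the question, this would replace the mlflow.sklearn.log_model(result.model, "model") line with something like log_forecast_result(forecaster), called inside an active MLflow run. To restore the result later, the logged directory would have to be downloaded again and passed to the forecaster's matching load method (forecaster.load_forecast_result, as far as I know). Since the dump step reportedly does not work on Windows, the same caveat applies here.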