Feature: Allow adding additional information to metadata on model upload
To my knowledge, there is no straightforward way of retrieving additional data sent on model upload other than downloading the entire artifact and knowing the exact name of the file it was stored in. It would be nice to be able to add extra information to the model metadata when uploading a new model, so that anything important for further processing of models is directly accessible.
This could be an optional parameter to the upload method that provides an easy way to add something to the metadata. It could accept a Python dictionary, which would then be placed in the metadata under a specific key such as "extra".
Use case
# Custom information that a user wants to have available as metadata when calling `get_model_info`
important_info = {
    'required_columns': ["yay", "nay"],
    'data_transforms': ["std", "mean"],
    'training_data_marker': {
        'index_column': 'some_id',
        'index_value': 'some_value',
    },
    'replication_storage_information': {
        "actual_creation_date": "2021-11-23T10:10:23",
        "archived_date": "2022-1-14T12:14:23",
    }
}
metadata = model_store.upload(
    domain="my-domain",
    state_name="archived",
    model=lr_model,
    extra_metadata=important_info
)
print(metadata)
>>
{
    'model': {
        'domain': {...},
        'data': {...},
        'storage': {...},
        'code': {...},
        'git': {...},
        'extra': {
            'required_columns': ["yay", "nay"],
            'data_transforms': ["std", "mean"],
            'training_data_marker': {
                'index_column': 'some_id',
                'index_value': 'some_value',
            },
            'replication_storage_information': {
                "actual_creation_date": "2021-11-23T10:10:23",
                "archived_date": "2022-1-14T12:14:23",
            }
        }
    }
}
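As a rough sketch of how downstream code might consume this, assuming the metadata keeps the layout proposed above (an "extra" dict nested under "model"): `get_extra` is a hypothetical helper name, not part of modelstore, and the dictionary below is a hand-written stand-in for what an upload or `get_model_info` call would return.

```python
def get_extra(metadata: dict) -> dict:
    """Hypothetical helper: return the user-supplied 'extra' block.

    Falls back to an empty dict so that models uploaded without
    any extra metadata do not raise a KeyError.
    """
    return metadata.get("model", {}).get("extra", {})


# Stand-in for returned metadata, following the proposed layout.
metadata = {
    "model": {
        "domain": "my-domain",
        "extra": {"required_columns": ["yay", "nay"]},
    }
}

required_columns = get_extra(metadata).get("required_columns", [])
print(required_columns)
```

The point is that the information is available straight from the metadata, without downloading the artifact.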
The extra parameter would have to be validated, which could be done by checking whether the object is JSON-serializable in the upload method:
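if extra_metadata:
    try:
        # json.dumps raises TypeError for unsupported types and
        # ValueError for circular references
        json.dumps(extra_metadata)
    except (TypeError, ValueError) as exc:
        raise ValueError("extra_metadata field must be json serializable") from exc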
The value of the field could default to an empty dict, i.e. 'extra': {}, so it should not break any existing functionality.
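To make the idea concrete, here is a minimal sketch of how the validation and the empty-dict default could fit together. `build_metadata` and its `base` argument are hypothetical names for illustration only; this is not how modelstore builds its metadata internally.

```python
import json
from typing import Optional


def build_metadata(base: dict, extra_metadata: Optional[dict] = None) -> dict:
    """Hypothetical helper: return a copy of `base` with an 'extra'
    block stored under the 'model' key."""
    # Default to an empty dict so existing callers are unaffected.
    extra = {} if extra_metadata is None else extra_metadata
    if not isinstance(extra, dict):
        raise ValueError("extra_metadata must be a dict")
    try:
        # Round-trip through JSON: this validates serializability and
        # yields a detached copy made of plain JSON types.
        extra = json.loads(json.dumps(extra))
    except (TypeError, ValueError) as exc:
        raise ValueError("extra_metadata field must be json serializable") from exc
    model = dict(base.get("model", {}))
    model["extra"] = extra
    return {**base, "model": model}
```

Called without `extra_metadata`, this produces 'extra': {}, and something like a set or an arbitrary object in the dictionary is rejected with a ValueError before anything is stored.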
Any opinions on this?
Great idea! I've wanted to do this for some time, and you suggesting it might just be the motivation I needed 😄
I'm currently in the middle of moving the metadata implementation to use dataclasses:
- https://github.com/operatorai/modelstore/pull/178
- https://github.com/operatorai/modelstore/pull/182
Once that is done, I can definitely add this in and bundle it all together for the next release 🙌
👋 @hauks96 I've now added this in, and so it will go out with the next release. Thank you for the suggestion, and if you have any more ideas feel free to open more issues or reach out to me directly!
- https://github.com/operatorai/modelstore/pull/185
@nlathia Brilliant! Thank you so much, looking forward to using it 😄
✅ This was released as part of modelstore==0.0.75
- https://github.com/operatorai/modelstore/pull/201
Let me know if you see any other issues!