file_system_poll_wait_seconds default (1s) is very expensive for remote (gs://) model paths
We just realized that we were spending quite a bit of money merely on polling for version updates to our models, which are hosted in GCS.
The model server was issuing a Class A operation (list objects) once per second because of the default set at https://github.com/tensorflow/serving/blob/6b1a02b5fc63def9b4cfd75bd9dbce9bed4c10bb/tensorflow_serving/model_servers/server.h#L68.
We've solved our cost issue just by raising file_system_poll_wait_seconds to 1 minute, but perhaps there could be a warning message, or a different default, for remote storage model paths that incur a cost when the setting is left at its default.
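To put rough numbers on the difference between a 1 second and a 1 minute poll interval, here is a small back-of-the-envelope sketch. It assumes exactly one list operation per poll per model (the server may issue more), and the price per 10,000 Class A operations is a placeholder assumption, not a quoted GCS rate; check current pricing for your storage class. The function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def list_ops_per_month(poll_wait_seconds, models=1, servers=1, days=30):
    """Estimate list operations per month, assuming one list call per poll per model."""
    polls_per_day = SECONDS_PER_DAY / poll_wait_seconds
    return polls_per_day * days * models * servers


def monthly_cost_usd(poll_wait_seconds, price_per_10k_ops, **kwargs):
    """Convert the estimated operation count into a monthly cost."""
    return list_ops_per_month(poll_wait_seconds, **kwargs) / 10_000 * price_per_10k_ops


# ASSUMPTION: placeholder price per 10,000 Class A operations, in USD.
ASSUMED_PRICE_PER_10K = 0.05

for wait in (1, 60):
    ops = list_ops_per_month(wait)
    cost = monthly_cost_usd(wait, ASSUMED_PRICE_PER_10K)
    print(f"{wait:>2}s poll: {ops:,.0f} list ops/month, about ${cost:.2f} per model per server")
```

Under these assumptions a 1 second interval produces 2,592,000 list operations per 30-day month for each model on each server, versus 43,200 at 60 seconds. The total scales linearly with the number of models and server replicas, which is how the bill grows quickly.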
This can indeed be quite a problem. Would it be possible to have an API to refresh a particular served model? That way, one would not need to poll for newer model versions, and the refresh could be part of a deployment pipeline.
@pselden,
You can issue HandleReloadConfigRequest RPC calls to the server, supplying a new Model Server config programmatically, to load the updated model.
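As a rough illustration of the suggestion above, the sketch below sends a HandleReloadConfigRequest from Python using the `tensorflow-serving-api` and `grpcio` packages. The helper names, model name, bucket path, and server address are all hypothetical, and the example assumes an unauthenticated gRPC port; treat it as a starting point rather than a definitive client.

```python
def build_model_config_text(name, base_path, versions):
    """Render a model server config (text-format protobuf) pinning specific versions."""
    version_lines = "\n".join(f"        versions: {int(v)}" for v in versions)
    return (
        "model_config_list {\n"
        "  config {\n"
        f'    name: "{name}"\n'
        f'    base_path: "{base_path}"\n'
        '    model_platform: "tensorflow"\n'
        "    model_version_policy {\n"
        "      specific {\n"
        f"{version_lines}\n"
        "      }\n"
        "    }\n"
        "  }\n"
        "}\n"
    )


def reload_config(server_address, config_text, timeout_s=30.0):
    """Send the config to the model server and return (error_code, error_message)."""
    # Imported here so the text helper above works without these packages installed.
    import grpc
    from google.protobuf import text_format
    from tensorflow_serving.apis import model_management_pb2, model_service_pb2_grpc
    from tensorflow_serving.config import model_server_config_pb2

    config = text_format.Parse(
        config_text, model_server_config_pb2.ModelServerConfig()
    )
    request = model_management_pb2.ReloadConfigRequest()
    request.config.CopyFrom(config)

    with grpc.insecure_channel(server_address) as channel:
        stub = model_service_pb2_grpc.ModelServiceStub(channel)
        response = stub.HandleReloadConfigRequest(request, timeout=timeout_s)
    return response.status.error_code, response.status.error_message


# Hypothetical usage from a deployment pipeline step:
# text = build_model_config_text("my_model", "gs://my-bucket/models/my_model", [3])
# code, message = reload_config("localhost:8500", text)  # code 0 means OK
```

Note that the request carries the full server config, not a delta, so every model that should stay loaded needs to appear in it. Pinning versions with a `specific` policy is one way to make a deployment step, rather than the poll loop, decide which version is served.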
Thank you!
Closing this due to inactivity. Please take a look at the answers provided above, and feel free to reopen and post your comments if you still have questions. Thank you!