Prediction is not thread-safe
Describe the bug
I have a Flask app with an endpoint that looks something like this, where `model` is a `FullBayesianForecaster`:
```python
@app.route("/predict", methods=["POST"])
def predict():
    df = get_df()
    result = model.predict(df, seed=1234)
    return result.to_dict(orient="records")
```
I tested it locally with curl and it worked great: the Flask app returned the same predictions for the same input. I then deployed it to a test environment, and another service that talks to it immediately started getting non-deterministic results for requests with the same parameters. When I checked the predictions myself with another curl request against the test environment, I got the same result back every time, as expected.
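The difference, as far as I can tell, is concurrency: my curl requests were strictly one at a time, while the other service issues overlapping requests. Here is a hypothetical sketch of that access pattern (the real service, URL, and payload are of course different):

```python
# Hypothetical sketch of the other service's access pattern; the real service,
# URL, and payload are different. My curl tests never overlapped, which is why
# they always looked deterministic.
from concurrent.futures import ThreadPoolExecutor

import requests

PAYLOAD = {"start": "2022-01-01", "horizon": 52}  # made-up request body

def call_predict(_):
    resp = requests.post("http://test-env/predict", json=PAYLOAD, timeout=30)
    return resp.json()

with ThreadPoolExecutor(max_workers=8) as pool:
    responses = list(pool.map(call_predict, range(8)))

# With the seed fixed on the server side, these should all be identical;
# under overlapping requests they are not.
print(all(r == responses[0] for r in responses))
```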
After scratching my head a bit, I tried using a `threading.Lock` like this:

```python
lock = threading.Lock()  # created once, shared by all request threads

with lock:
    result = model.predict(df, seed=1234)
```
And then the non-determinism went away. This leads me to believe that model prediction is not thread-safe. I haven't yet dug down far enough to know for sure what's causing the issue, but this seems to be at least one likely culprit:
https://github.com/uber/orbit/blob/27371ec/orbit/forecaster/full_bayes.py#L96-L97
If multiple threads are calling `model.predict` and fighting over the value of `self._prediction_meta` here, that seems likely to cause the non-deterministic predictions, since the value read here may come from another thread's request:
https://github.com/uber/orbit/blob/c232980/orbit/forecaster/forecaster.py#L389
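To illustrate the kind of race I have in mind (a simplified, hypothetical stand-in, not orbit's actual code): if `predict` stashes per-call state on `self` and reads it back later, a second thread can overwrite that state in between:

```python
import threading
import time

class ToyForecaster:
    """Hypothetical stand-in, not orbit code: predict() parks per-call state on self."""

    def predict(self, df_label):
        # Analogous to predict() assigning self._prediction_meta before using it.
        self._prediction_meta = df_label
        time.sleep(0.01)  # widen the window in which another thread can overwrite it
        return f"prediction for {self._prediction_meta}"

model = ToyForecaster()
outputs = {}

def worker(label):
    outputs[label] = model.predict(label)

threads = [threading.Thread(target=worker, args=(f"df_{i}",)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Without a lock, most entries end up reflecting another thread's input,
# e.g. {'df_0': 'prediction for df_7', ...}.
print(outputs)
```

Holding a lock around the whole `predict` call, as in the workaround above, prevents those two steps from interleaving.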
To Reproduce
Run `model.predict` from multiple threads with different input DataFrames, passing the same fixed seed on every call (e.g. `seed=1234`), and compare the results against serial calls; a rough sketch follows below.
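Something along these lines (a sketch only; it assumes `model` is an already-fitted `FullBayesianForecaster` and `df_a` / `df_b` are two different prediction DataFrames, names I've made up for illustration):

```python
# Sketch only: assumes `model` is an already-fitted FullBayesianForecaster and
# `df_a` / `df_b` are two different prediction DataFrames (hypothetical names).
from concurrent.futures import ThreadPoolExecutor

def predict_once(df):
    return model.predict(df, seed=1234)

# Serial calls: deterministic, as expected.
baseline_a = predict_once(df_a)
baseline_b = predict_once(df_b)

# Overlapping calls with the same fixed seed.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(predict_once, df) for df in (df_a, df_b, df_a, df_b)]
    results = [f.result() for f in futures]

# Expected: each concurrent result matches its serial baseline.
# Observed: some comparisons fail when the calls overlap.
print(results[0].equals(baseline_a), results[1].equals(baseline_b))
```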
Expected behavior
Calling `model.predict` with `seed=a_fixed_number` from multiple threads returns deterministic predictions, identical to what a single-threaded call returns.
Screenshots
N/A
Environment (please complete the following information):
- OS: Ubuntu
- Python Version: 3.9
- Versions of Major Dependencies (`pandas`, `scikit-learn`, `cython`): `pandas==1.3.5`, `scikit-learn==<not installed>`, `cython==0.29.30`
Additional context
N/A
@knkski Are you saying you get the same result across threads even with `seed=None`?
@edwinnglabs I haven't tested that; I'd assume I'd get different results each time in that case. This issue arose after I set the random seed, assuming that would fix the non-deterministic predictions I was seeing. Since I started getting deterministic predictions once I added a thread lock, I didn't try a thread lock without a random seed.
So you want to see the same behavior on each thread that calls `model.predict`? If so, the approach you described, using `seed=a_fixed_number`, should work. Please let me know if I understand the problem correctly...
Yeah, I'm looking to get deterministic predictions on every thread, given that I pass in `seed=a_fixed_number`. Right now I'm not seeing that: with multiple threads the predictions are non-deterministic even when I set `seed=a_fixed_number`, and that's what I think the bug is.