Issues with multiple scaled service deployments
Report
We observe scaling and performance issues in a cluster with multiple deployments of a scaled service:
- a large and inconsistent delay on the first request
- an inconsistent time to disable a service (scaling to 0)
This issue is happening in AWS, but we were also able to reproduce it locally: https://github.com/leska-j/scaling-issues-reproduction
Expected Behavior
Services are scaled to 0 in a consistent amount of time. First-request time depends only on service startup time, not on the number of deployed resources.
Actual Behavior
Even though services are eventually scaled to 0, it can take more than 40 minutes to do so. An even bigger problem is that this time is not consistent (it does not appear to correspond to any configured duration).
The first request to a scaled object takes too long, especially when multiple scaled services are deployed to the cluster.
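One way to put numbers on the first-request delay is sketched below. This is not part of the reproduction repository; it is a minimal illustration, and the helper names `time_call`, `summarize`, and `http_get` are hypothetical. It times a single blocking request and summarizes a set of collected samples so that the spread between runs is visible.

```python
import statistics
import time
import urllib.request
from typing import Callable, Dict, List


def time_call(send: Callable[[], None],
              clock: Callable[[], float] = time.monotonic) -> float:
    """Return the seconds elapsed while `send` runs once."""
    start = clock()
    send()
    return clock() - start


def summarize(samples: List[float]) -> Dict[str, float]:
    """Summarize latency samples; a large spread means inconsistent timing."""
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": statistics.mean(samples),
        "spread": max(samples) - min(samples),
    }


def http_get(url: str, timeout: float = 300.0) -> Callable[[], None]:
    """Build a callable that performs one blocking GET against `url`.

    The URL is an assumption supplied by the caller: it should point at
    the route of one scaled service that is currently at 0 replicas.
    """
    def send() -> None:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
    return send
```

For example, `time_call(http_get("http://localhost:8080/"))` (a placeholder URL) returns the latency of one request. Collecting one sample per cold start, with a different number of scaled services deployed each time, and passing the list to `summarize` would show whether the delay grows with the number of deployments.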
Steps to Reproduce the Problem
- Please refer to https://github.com/leska-j/scaling-issues-reproduction
Logs from KEDA HTTP operator
No response
What version of the KEDA HTTP Add-on are you running?
master & 0.3.0 (with concurrent access fix)
Kubernetes Version
No response
Platform
Amazon Web Services
Anything else?
No response