Create proportional lag autoscaler
Fixes #XXXX.
Description
Fixed the bug ...
Renamed the class ...
Added a forbidden-apis entry ...
Release note
Key changed/added classes in this PR
- MyFoo
- OurBar
- TheirBaz
This PR has:
- [ ] been self-reviewed.
- [ ] using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] a release note entry in the PR description.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in licenses.yaml
- [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.
@kfaraz thoughts here? The idea behind this is to scale proportionally to the lag, since scaling is an expensive operation (from an accrued lag perspective).
@jtuglu-netflix , I haven't taken a look at the patch yet.
Could you please give me an idea of the following?
- Do you feel that the current scale up logic adds too many tasks or too few?
- Similarly, what do you feel about the current scale down logic?
- In this patch, what is the formula to compute the "proportional" task count for a given value of lag? Does it also incorporate the current processing rate of each task?
To add to @cryptoe 's point, in production clusters, we have seen that scaling to a task count that does not evenly divide the number of topic partitions always causes grief.
So, say for a Kafka supervisor reading from a topic with 100 partitions,
the valid task count options should be: 1, 2, 4, 5, 10, 20, 25, 50, 100
So it would be nice if we could take a combined approach in this PR, i.e. proportional scaling to a valid task count.
So, if proportional logic dictates that the required task count is 23, we go with 25. If we need 60, we just go with 50 (since that is the closest valid count).
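To make the rounding concrete, here is a minimal sketch (illustrative only, not code from this PR; class and method names are hypothetical) that snaps a proposed task count to the nearest divisor of the topic's partition count:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: snap a proposed task count to the nearest
// divisor of the partition count so that partitions divide evenly across tasks.
class ValidTaskCounts
{
  /** All divisors of partitionCount, e.g. 100 -> [1, 2, 4, 5, 10, 20, 25, 50, 100]. */
  static List<Integer> divisorsOf(int partitionCount)
  {
    List<Integer> divisors = new ArrayList<>();
    for (int i = 1; i <= partitionCount; i++) {
      if (partitionCount % i == 0) {
        divisors.add(i);
      }
    }
    return divisors;
  }

  /** The divisor of partitionCount closest to proposedTaskCount (ties favor the smaller count). */
  static int nearestValidCount(int proposedTaskCount, int partitionCount)
  {
    int best = 1;
    for (int divisor : divisorsOf(partitionCount)) {
      if (Math.abs(divisor - proposedTaskCount) < Math.abs(best - proposedTaskCount)) {
        best = divisor;
      }
    }
    return best;
  }
}
```

With 100 partitions, `nearestValidCount(23, 100)` returns 25 and `nearestValidCount(60, 100)` returns 50, matching the examples above.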
> Do you feel that the current scale up logic adds too many tasks or too few?
Depending on the scenario, it could be either. What I was aiming for was to minimize the "saw-toothing" or late reactions to spikes (if, say, the minimum time between scaling actions is long, e.g. ~10+ minutes).
> In this patch, what is the formula to compute the "proportional" task count for a given value of lag?
I take the average (call it A) of the samples of the desired aggregate function across partitions and compare it to the scaling threshold T (up or down). I then calculate `newTaskCount = ceil(currentTaskCount * A / T)`, apply a max scaling step, and use the "ratio" config params to decide whether to skip the scaling activity when newTaskCount falls within some tolerance % of currentTaskCount.
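As a rough illustration of that computation (my own sketch of the description above, not the PR's actual implementation; parameter names are made up):

```java
// Illustrative sketch of the proportional task-count computation described above.
// Names and structure are hypothetical, not the PR's actual code.
class ProportionalScalingSketch
{
  static int computeTaskCount(
      double avgLag,            // A: average of the sampled aggregate lag metric across partitions
      double scaleThreshold,    // T: the scale-up (or scale-down) lag threshold
      int currentTaskCount,
      int maxScalingStep,       // cap on how many tasks are added or removed in one scaling action
      double tolerancePercent   // skip scaling if the change stays within this % of the current count
  )
  {
    int proposed = (int) Math.ceil(currentTaskCount * avgLag / scaleThreshold);

    // Limit the size of a single scaling step.
    int delta = Math.max(-maxScalingStep, Math.min(maxScalingStep, proposed - currentTaskCount));
    proposed = currentTaskCount + delta;

    // Skip the scaling activity if the proposed change falls within the tolerance band.
    if (Math.abs(proposed - currentTaskCount) <= currentTaskCount * tolerancePercent / 100.0) {
      return currentTaskCount;
    }
    return proposed;
  }
}
```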
> So it would be nice if we could take a combined approach in this PR, i.e. proportional scaling to a valid task count.
Yes, this is something I've been meaning to fix as well.
Sounds good, @jtuglu-netflix , please update this PR accordingly with both the changes. I will be happy to do a review.
For the "valid" task count logic, you could just update your applyMinMaxCheck method to perform the required checks.
> Sounds good, @jtuglu-netflix , please update this PR accordingly with both the changes. I will be happy to do a review.
> For the "valid" task count logic, you could just update your `applyMinMaxCheck` method to perform the required checks.
In this case, the proportional scaling config parameters' meanings differ from their documented use with the threshold-based scaler. I'm wondering if this is something we'd want to incorporate into the current scaler, or perhaps create a new one altogether.
> In this case, the proportional scaling config parameters' meanings differ from their documented use with the threshold-based scaler. I'm wondering if this is something we'd want to incorporate into the current scaler, or perhaps create a new one altogether.
Thanks for pointing this out, @jtuglu-netflix ! In general, I feel that it is always better to improve the logic and update the documentation accordingly rather than stick to a poorer and buggy implementation. However, I will try to take another look at the docs + code and double check if this can have any adverse or undesirable impact on existing supervisor specs.
@kfaraz, see here.