Upgrade Helix dependency to 1.4.0
Codecov Report
Attention: Patch coverage is 8.33333% with 11 lines in your changes missing coverage. Please review.
Project coverage is 35.13%. Comparing base (
59551e4) to head (5bb4bf3). Report is 805 commits behind head on master.
:exclamation: There is a different number of reports uploaded between BASE (59551e4) and HEAD (5bb4bf3). Click for more details.
HEAD has 35 uploads less than BASE
Flag BASE (59551e4) HEAD (5bb4bf3) integration 7 2 integration2 3 0 temurin 12 5 java-21 7 2 skip-bytebuffers-true 3 1 skip-bytebuffers-false 7 3 unittests 5 3 java-11 5 3 unittests2 3 0 integration1 2 1 custom-integration1 2 1
Additional details and impacted files
@@ Coverage Diff @@
## master #13486 +/- ##
=============================================
- Coverage 61.75% 35.13% -26.63%
+ Complexity 207 6 -201
=============================================
Files 2436 2479 +43
Lines 133233 136927 +3694
Branches 20636 21313 +677
=============================================
- Hits 82274 48103 -34171
- Misses 44911 85176 +40265
+ Partials 6048 3648 -2400
| Flag | Coverage Δ | |
|---|---|---|
| custom-integration1 | <0.01% <0.00%> (-0.01%) |
:arrow_down: |
| integration | <0.01% <0.00%> (-0.01%) |
:arrow_down: |
| integration1 | <0.01% <0.00%> (-0.01%) |
:arrow_down: |
| integration2 | ? |
|
| java-11 | 35.11% <8.33%> (-26.60%) |
:arrow_down: |
| java-21 | 46.26% <25.00%> (-15.36%) |
:arrow_down: |
| skip-bytebuffers-false | 35.12% <8.33%> (-26.63%) |
:arrow_down: |
| skip-bytebuffers-true | 46.24% <25.00%> (+18.52%) |
:arrow_up: |
| temurin | 35.13% <8.33%> (-26.63%) |
:arrow_down: |
| unittests | 46.42% <25.00%> (-15.33%) |
:arrow_down: |
| unittests1 | 46.42% <25.00%> (-0.47%) |
:arrow_down: |
| unittests2 | ? |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@jackjlli - We should discuss next week if / how we plan to validate this internally first ?
@abhioncbr - Do we already know of any significant changes since the last upgrade ?
@abhioncbr - Do we already know of any significant changes since the last upgrade ?
I have to look more into details, in the first look seems like nothing much signifacnt changed in Helix. We are getting failures in the restart kinds of use cases. I will dog more. Thanks
Thanks @abhioncbr for working on it! Please keep us posted about the testing results against the new Helix release.
We are seeing a lot of test failures in Pinot because of the changes in this Helix PR
I am raising the issue of Helix removing the condition or working around this. cc: @jackjlli / @Jackie-Jiang
Raised the bug on Apache Helix
@siddharthteotia @jackjlli Any update on the internally testing? Is there new introduced behavior change in Helix? A lot of tests are failing
Hi @abhioncbr, after looking at the exception message in the test a bit, I think in our tests when a table gets newly created, we should add some validation checks to make sure if the table is fully ready before testing out anything else.
For example, in testListConfigs() in the TableConfigsRestletResourceTest class, we should add the following validation logic right after the table creation request is sent (in your PR, it should be added after Line 362):
String getExternalViewUrl = DEFAULT_INSTANCE.getControllerRequestURLBuilder().forTableExternalView(tableName1);
TestUtils.waitForCondition(aVoid -> {
// should wait for both realtime and offline table external view to be live.
try {
String externalViewJsonResponse = sendGetRequest(getExternalViewUrl);
TableViews.TableView externalView = JsonUtils.stringToObject(externalViewJsonResponse, TableViews.TableView.class);
return externalView._offline != null;
} catch (IOException e) {
return false;
}
}, 30_000L, "Failed to find the table creation from the ExternalView");
Basically, we should never expect that all the znodes are created once the table creation request was made, as any updates on external view should be done by Helix controller. Same for other places where the table creation request was made in the same test class. @abhioncbr can you address that in your PR?
@jackjlli But this doesn't explain why the tests are not failing before the upgrade. For testListConfigs(), why do we need to wait for EV to appear? Table list API should only check the IS instead of EV, unless there are behavior changes in Helix
@Jackie-Jiang If you check the failed messages from the Github actions, it shows that the exception was thrown when trying to update the table config (not trying to list the table config), which means the API call to list table config has passed before attempting to update the table config:
pinot.common.exception.HttpErrorStatusException: Got error status code: 500 (Internal Server Error) with reason: "Failed to update TableConfigs for: testList1, Specified EXTERNALVIEW brokerResource is not found!"
03:41:16.876 ERROR [TableConfigsRestletResource] [grizzly-http-server-7] Failed to update TableConfigs for: testList1, Specified EXTERNALVIEW brokerResource is not found!
org.apache.helix.HelixException: Specified EXTERNALVIEW brokerResource is not found!
at org.apache.helix.messaging.CriteriaEvaluator.getProperty(CriteriaEvaluator.java:196) ~[helix-core-1.4.0.jar:1.4.0]
at org.apache.helix.messaging.CriteriaEvaluator.evaluateCriteria(CriteriaEvaluator.java:75) ~[helix-core-1.4.0.jar:1.4.0]
at org.apache.helix.messaging.DefaultMessagingService.generateMessagesForParticipant(DefaultMessagingService.java:192) ~[helix-core-1.4.0.jar:1.4.0]
at org.apache.helix.messaging.DefaultMessagingService.generateMessage(DefaultMessagingService.java:182) ~[helix-core-1.4.0.jar:1.4.0]
at org.apache.helix.messaging.DefaultMessagingService.send(DefaultMessagingService.java:102) ~[helix-core-1.4.0.jar:1.4.0]
at org.apache.helix.messaging.DefaultMessagingService.send(DefaultMessagingService.java:96) ~[helix-core-1.4.0.jar:1.4.0]
at org.apache.pinot.controller.helix.core.PinotHelixResourceManager.sendTableConfigRefreshMessage(PinotHelixResourceManager.java:2750) ~[classes/:?]
at org.apache.pinot.controller.helix.core.PinotHelixResourceManager.setExistingTableConfig(PinotHelixResourceManager.java:1962) ~[classes/:?]
at org.apache.pinot.controller.helix.core.PinotHelixResourceManager.setExistingTableConfig(PinotHelixResourceManager.java:1931) ~[classes/:?]
at org.apache.pinot.controller.helix.core.PinotHelixResourceManager.updateTableConfig(PinotHelixResourceManager.java:1923) ~[classes/:?]
at org.apache.pinot.controller.api.resources.TableConfigsRestletResource.updateConfig(TableConfigsRestletResource.java:355) ~[classes/:?]
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) ~[?:?]
at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[?:?]
Here is the sequence of the API calls:
TableConfigs tableConfigs = new TableConfigs(tableName1, schema, offlineTableConfig, null);
sendPostRequest(_createTableConfigsUrl, tableConfigs.toPrettyJsonString());
// list
String getResponse = sendGetRequest(DEFAULT_INSTANCE.getControllerRequestURLBuilder().forTableConfigsList());
List<String> configs = JsonUtils.stringToObject(getResponse, new TypeReference<List<String>>() {
});
Assert.assertEquals(configs.size(), 1);
Assert.assertTrue(configs.containsAll(Sets.newHashSet(tableName1)));
// update to 2
tableConfigs = new TableConfigs(tableName1, schema, offlineTableConfig, realtimeTableConfig);
sendPutRequest(DEFAULT_INSTANCE.getControllerRequestURLBuilder().forTableConfigsUpdate(tableName1),
tableConfigs.toPrettyJsonString());
The evaluation logic has been there for quite a while. https://github.com/apache/helix/blame/master/helix-core/src/main/java/org/apache/helix/messaging/CriteriaEvaluator.java#L75 and the default data source has always been EV: https://github.com/apache/helix/blame/master/helix-core/src/main/java/org/apache/helix/Criteria.java#L68
So I guess in the later release, something has changed in a way that makes EV to be populated much later compared to the previous release. Adding @zpinto @MarkGaox here for more context on this?
@abhioncbr , shall we consider upgrading Pinot to https://github.com/apache/helix/releases/tag/helix-1.4.3 version?
@abhioncbr , shall we consider upgrading Pinot to https://github.com/apache/helix/releases/tag/helix-1.4.3 version?
Sure, we can definitely try that.
closing in favour of https://github.com/apache/pinot/pull/15063