[TASK][EASY] Kyuubi Server HA&ZK get server from serverHosts support more strategy

Open davidyuan1223 opened this issue 2 years ago • 7 comments

Code of Conduct

Search before asking

  • [X] I have searched in the issues and found no similar issues.

What would you like to be improved?

The current Kyuubi HA mode, which retrieves servers from ZooKeeper, only supports the random strategy. This may lead to an overload on certain nodes. Therefore, in order to address the overload issue, it is necessary to support more strategies.

How should we improve?

Update Kyuubi Hive JDBC so that ZooKeeperClientHelper supports more strategies. Currently, there are two strategies:

  1. Random
  2. Polling
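
A minimal sketch of how the two strategies differ, assuming the driver already holds the list of server hosts read from ZooKeeper. `ServerPicker` and its method names are hypothetical and not part of Kyuubi. The polling counter here lives inside one JVM, so separate client processes would need a shared counter (for example, one kept in ZooKeeper) to rotate across each other.

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper (not Kyuubi code): picks one host from the discovered list.
final class ServerPicker {
  // In-process counter; an assumption made for this sketch only.
  private static final AtomicInteger COUNTER = new AtomicInteger();

  // Random: every host is equally likely on each call.
  static String random(List<String> serverHosts) {
    requireNonEmpty(serverHosts);
    return serverHosts.get(ThreadLocalRandom.current().nextInt(serverHosts.size()));
  }

  // Polling (round-robin): walk the list in order and wrap around at the end.
  static String polling(List<String> serverHosts) {
    requireNonEmpty(serverHosts);
    // floorMod keeps the index non-negative even after the counter overflows.
    int index = Math.floorMod(COUNTER.getAndIncrement(), serverHosts.size());
    return serverHosts.get(index);
  }

  private static void requireNonEmpty(List<String> serverHosts) {
    if (serverHosts == null || serverHosts.isEmpty()) {
      throw new IllegalArgumentException("no Kyuubi server hosts available");
    }
  }
}
```

With random selection, several consecutive connections can land on the same host by chance. With polling, consecutive calls in the same JVM visit each host once before any host is reused.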

Are you willing to submit PR?

  • [X] Yes. I would be willing to submit a PR with guidance from the Kyuubi community to improve.
  • [ ] No. I cannot submit a PR at this time.

davidyuan1223, Jan 31 '24 06:01

@wForget @pan3793

davidyuan1223, Jan 31 '24 11:01

SGTM, and it would be better to extract an interface so that users can implement their own custom strategy.

pan3793, Jan 31 '24 11:01

Hello, I have a question. kyuubi-hive-jdbc is only a JDBC driver and cannot read configuration from KyuubiConf, so even if we add a strategy config entry for HA, the driver cannot read it. The only way I can think of is to pass the strategy as a connection parameter, but then users cannot plug in a custom strategy. What do you think?

davidyuan1223, Feb 06 '24 07:02

@davidyuan1223 Hello, may I ask which strategies you will implement?

sunnyzhuzhu, Mar 08 '24 09:03

Sorry, I forgot to respond. Currently I have implemented poll and random. Because hive-jdbc is a standalone module, it cannot use the kyuubi-ha module, so if we want to support more strategies, the strategy can only be selected through a connection parameter, like '&zkStrategy=poll/random'. If you know of other useful strategies, please give me some advice.

davidyuan1223, Mar 22 '24 07:03

Hello, this is the demo command:

bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;zooKeeperStrategy=poll?spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n hadoop --verbose=true --showNestedErrs=true

Currently it can use the poll strategy to choose the right server, but there are still some bugs, so I have not submitted a PR yet.
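
To illustrate where the strategy name would come from, here is a rough sketch that pulls the `;key=value` session variables out of a URL shaped like the one in the demo command. `SessionVarParser` is a hypothetical stand-in written for this example; the driver has its own URL parsing, which this does not reproduce.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the driver's URL parsing (illustration only).
final class SessionVarParser {
  // Returns the ';key=value' pairs that sit between the path and the first '?' or '#'.
  static Map<String, String> parse(String jdbcUrl) {
    int end = jdbcUrl.length();
    for (char marker : new char[] {'?', '#'}) {
      int at = jdbcUrl.indexOf(marker);
      if (at >= 0 && at < end) {
        end = at;
      }
    }
    String[] parts = jdbcUrl.substring(0, end).split(";");
    Map<String, String> vars = new LinkedHashMap<>();
    // parts[0] is the 'jdbc:hive2://host:port/db' prefix, so start at 1.
    for (int i = 1; i < parts.length; i++) {
      int eq = parts[i].indexOf('=');
      if (eq > 0) {
        vars.put(parts[i].substring(0, eq), parts[i].substring(eq + 1));
      }
    }
    return vars;
  }
}
```

Applied to the URL above, the map would hold `serviceDiscoveryMode`, `zooKeeperNamespace`, and `zooKeeperStrategy`, while the Spark settings after the `?` are left out because they belong to a different section of the URL.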

davidyuan1223, Mar 26 '24 07:03

My plan is to let users implement an interface named org.apache.kyuubi.jdbc.hive.strategy.ChooseServerStrategy and then set zooKeeperStrategy=xxx.xxx.xxx, so that users can plug in their own implementation. Of course, if you have a more effective approach, please share it and I will try to implement it.
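
A sketch of what such a plug-in point might look like, assuming the value of zooKeeperStrategy is a fully qualified class name with a no-argument constructor. Only the interface name comes from this thread; the `chooseServer` signature, `FirstHostStrategy`, and `StrategyLoader` are assumptions made up for illustration, and the package declaration is omitted to keep the example self-contained.

```java
import java.util.List;

// Interface name taken from the thread; the method signature is an assumption.
interface ChooseServerStrategy {
  String chooseServer(List<String> serverHosts);
}

// Hypothetical user-supplied strategy: always take the first listed host.
final class FirstHostStrategy implements ChooseServerStrategy {
  @Override
  public String chooseServer(List<String> serverHosts) {
    return serverHosts.get(0);
  }
}

// Hypothetical loader: turns the connection parameter value into a strategy instance.
final class StrategyLoader {
  static ChooseServerStrategy load(String className) throws ReflectiveOperationException {
    Class<?> clazz = Class.forName(className);
    // asSubclass fails fast with ClassCastException if the class does not implement the interface.
    return clazz.asSubclass(ChooseServerStrategy.class)
        .getDeclaredConstructor()
        .newInstance();
  }
}
```

Built-in names such as poll and random could be mapped to bundled implementations before falling back to reflection, so short names and custom classes share the same parameter.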

davidyuan1223, Mar 26 '24 07:03