[Bug] [spark-hive-connector] failed to set `hive.metastore.uris` if `spark.sql.hive.metastore.jars` is not set
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Search before asking
- [X] I have searched in the issues and found no similar issues.
Describe the bug
- start two Hive metastores, `127.0.0.1:9083` as hive1 and `127.0.0.1:19083` as hive2
- start the Spark SQL client, setting the default Hive metastore address to hive2 and the Hive metastore address of `hive_catalog` to hive1:
```shell
./bin/spark-sql -v \
  --conf spark.sql.catalog.hive_catalog="org.apache.kyuubi.spark.connector.hive.HiveTableCatalog" \
  --conf spark.sql.catalog.hive_catalog.hive.metastore.uris=thrift://127.0.0.1:9083 \
  --conf spark.sql.catalog.hive_catalog.hive.metastore.port=9083 \
  --conf spark.hadoop.hive.metastore.uris=thrift://127.0.0.1:19083
```
- run Spark SQL statements; after `USE hive_catalog`, `SHOW DATABASES` retrieves the databases from hive2, not hive1.
Affects Version(s)
1.8.1
Kyuubi Server Log Output
No response
Kyuubi Engine Log Output
No response
Kyuubi Server Configurations
No response
Kyuubi Engine Configurations
No response
Additional context
No response
Are you willing to submit PR?
- [ ] Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
- [X] No. I cannot submit a PR at this time.
DON'T use `spark-sql` to test the hive-related stuff, there is a lot of trickiness inside. Is it reproducible with `spark-shell`?
The main reason is that, if `spark.sql.hive.metastore.jars` is not specified, `HiveClientImpl` will use a shared `SessionState` to create the Hive client, and that shared `SessionState` is initialized first by the `spark_catalog` catalog, whose Hive client points to hive2 in this case.
```scala
// Simplified excerpts from Spark's HiveUtils and HiveClientImpl.

// When `spark.sql.hive.metastore.jars` is `builtin`, Spark creates the client
// loader with `isolationOn = !isCliSessionState()`, so isolation is OFF when
// the current SessionState is a CliSessionState (i.e. the spark-sql CLI).
def isCliSessionState(): Boolean = {
  val state = SessionState.get
  var temp: Class[_] = if (state != null) state.getClass else null
  var found = false
  while (temp != null && !found) {
    found = temp.getName == "org.apache.hadoop.hive.cli.CliSessionState"
    temp = temp.getSuperclass
  }
  found
}

// create a new SessionState or reuse the current one, according to
// `clientLoader.isolationOn`
val state: SessionState = {
  if (clientLoader.isolationOn) {
    newState()
  } else {
    SessionState.get
  }
}

// get the conf from the (possibly shared) session state
def conf: HiveConf = {
  val hiveConf = state.getConf
  hiveConf
}

// create the Hive client from that conf
private def client: Hive = {
  if (clientLoader.cachedHive != null) {
    clientLoader.cachedHive.asInstanceOf[Hive]
  } else {
    val c = getHive(conf)
    clientLoader.cachedHive = c
    c
  }
}
```
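Consistent with the title, a possible workaround (an untested sketch; the Hive version and jars path are placeholders for your environment) is to set `spark.sql.hive.metastore.jars` explicitly, so the client loader no longer takes the `builtin` + `CliSessionState` path that disables isolation:

```shell
./bin/spark-sql -v \
  --conf spark.sql.catalog.hive_catalog="org.apache.kyuubi.spark.connector.hive.HiveTableCatalog" \
  --conf spark.sql.catalog.hive_catalog.hive.metastore.uris=thrift://127.0.0.1:9083 \
  --conf spark.hadoop.hive.metastore.uris=thrift://127.0.0.1:19083 \
  --conf spark.sql.hive.metastore.version=2.3.9 \
  --conf spark.sql.hive.metastore.jars=path \
  --conf spark.sql.hive.metastore.jars.path=file:///path/to/hive-2.3.9/lib/*.jar
```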
> DON'T use `spark-sql` to test the hive-related stuff, there is a lot of trickiness inside. Is it reproducible with `spark-shell`?
It can't be reproduced with `spark-shell`.
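For reference, a minimal `spark-shell` check (a sketch, assuming the shell is launched with the same `--conf` options as the repro above):

```scala
// In spark-shell, SessionState.get is not a CliSessionState, so isolationOn
// is true and hive_catalog builds its own Hive client from its own conf.
spark.sql("USE hive_catalog")
spark.sql("SHOW DATABASES").show() // lists hive1's databases, as expected
```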
KSHC not working well with `spark-sql` is a known issue; we don't have a plan to fix it on the Kyuubi side, because we treat it as a Spark-side issue.
Kyuubi is a full drop-in replacement for `spark-sql`:

spark-sql => beeline => kyuubi => spark driver (client or cluster mode)
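For example, a local `spark-sql` session maps to a `beeline` connection against the Kyuubi server (a sketch: `10009` is Kyuubi's default frontend port; the host and user are placeholders):

```shell
bin/beeline -u 'jdbc:hive2://localhost:10009/' -n anonymous -e 'SHOW DATABASES;'
```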
Close as not planned