[Bug] DataGrip executes refresh command slowly
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Search before asking
- [X] I have searched in the issues and found no similar issues.
Describe the bug
When DataGrip runs a refresh, it performs the operations listed below. The table list and table columns operations take most of the time. In my tests, a full refresh takes four minutes to complete.
Operations:
- Routine list in database
- Table list in database
- Table columns in database
- Keys in database.table
Affects Version(s)
1.10.0
Kyuubi Server Log Output
No response
Kyuubi Engine Log Output
No response
Kyuubi Server Configurations
No response
Kyuubi Engine Configurations
spark.master yarn
spark.yarn.queue default
spark.executor.cores 1
spark.driver.memory 3g
spark.executor.memory 3g
spark.dynamicAllocation.enabled true
spark.dynamicAllocation.shuffleTracking.enabled true
spark.dynamicAllocation.minExecutors 1
spark.dynamicAllocation.maxExecutors 10
spark.dynamicAllocation.initialExecutors 1
spark.cleaner.periodicGC.interval 5min
Additional context
No response
Are you willing to submit PR?
- [ ] Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
- [ ] No. I cannot submit a PR at this time.
Looks like DBeaver is smarter in this case: it loads the table list lazily, only when the user expands a database. DataGrip instead fetches all databases and tables eagerly, which produces a large number of HMS (Hive Metastore) calls when there are many databases and tables. This would be slow even with the optimization introduced in https://github.com/apache/kyuubi/pull/6018 enabled. In that case, we can optimize further by parallelizing the table listing under each database.
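To make the parallelization idea concrete, here is a rough sketch in Python (chosen for brevity; it is not Kyuubi code and does not reflect how a fix would actually be implemented). All names are hypothetical: `list_tables_parallel` fans out one listing call per database on a thread pool, and `fake_list_tables` stands in for a metastore round trip by sleeping briefly before returning entries from the made-up `FAKE_CATALOG`.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def list_tables_parallel(
    databases: List[str],
    list_tables: Callable[[str], List[str]],
    max_workers: int = 8,
) -> Dict[str, List[str]]:
    """List tables for every database concurrently instead of one by one.

    `list_tables` is any callable that returns the table names of a single
    database (hypothetical; in practice this would be a metastore call).
    """
    if not databases:
        return {}
    # Never start more threads than there are databases to list.
    workers = min(max_workers, len(databases))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so they pair up with
        # `databases` when zipped.
        return dict(zip(databases, pool.map(list_tables, databases)))


# Made-up catalog used only to exercise the sketch.
FAKE_CATALOG: Dict[str, List[str]] = {
    "sales": ["orders", "customers"],
    "hr": ["employees"],
    "empty_db": [],
}


def fake_list_tables(database: str) -> List[str]:
    """Stand-in for a slow metastore call (assumed I/O-bound)."""
    time.sleep(0.05)  # simulated network latency
    return FAKE_CATALOG[database]


if __name__ == "__main__":
    tables = list_tables_parallel(list(FAKE_CATALOG), fake_list_tables)
    for database, names in tables.items():
        print(database, names)
```

Because each listing call mostly waits on the metastore, running them concurrently should bring the total time closer to that of the slowest single call than to the sum of all calls, assuming the metastore can handle the extra concurrent load. The `max_workers` cap is there to bound that load.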