Riza Suminto
[485e2ab](https://github.com/apache/iceberg/pull/4518/commits/485e2abf4bb3c67f7ab0db2e5cf2e78397bf25db) Switch ManifestCache class to use Guava Cache instead. This is easier to do in iceberg-api since we can simply include the Guava cache classes into the shadow jar of...
Hi @rdblue, thank you for your feedback. We found slow query compilation against Iceberg tables in our recent Apache Impala build. Impala uses Iceberg's HiveCatalog and HadoopFileIO...
The relevant Impala JIRA is here: https://issues.apache.org/jira/browse/IMPALA-11171
[c032b7a](https://github.com/apache/iceberg/pull/4518/commits/c032b7ab2f2db941fe5433c13a38dfcb8bf538ef) implements caching as a new FileIO class, CachingHadoopFileIO. A new Tables class, CachingHadoopTables, is also added to assist with testing. We tried to avoid the `lazyStats()` call as much as...
Hello @rdblue. This PR is ready for review. Please let me know if there is any new feedback or request. I'm happy to follow up.
Got it, thank you. Will rebase and update this PR once they are merged.
Hi @danielcweeks @rdblue, can I get a follow-up review on this PR please? I think we have not fully agreed on how to decide which file to cache...
Hi @danielcweeks, given the immutability of snapshots and metadata, are the "potentially bad behaviors around reading metadata tables" still a concern? Immutability should make consistency easy to manage (can...
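To illustrate the immutability argument, here is a minimal plain-JDK sketch, assuming files at a given location never change once written. `ImmutableFileCache`, its `loader` function, and `loadCount()` are hypothetical names for illustration, not classes from this PR or from Iceberg. Because content never changes, the cache needs no invalidation path: the first read loads the bytes and every later read returns the same entry.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: caches file content keyed by location.
// Assumes the content at a location is immutable, so entries never go stale.
class ImmutableFileCache {
  private final ConcurrentHashMap<String, byte[]> contents = new ConcurrentHashMap<>();
  private final Function<String, byte[]> loader;
  private final AtomicInteger loads = new AtomicInteger();

  ImmutableFileCache(Function<String, byte[]> loader) {
    this.loader = loader;
  }

  // Loads the content on first access only; later calls reuse the cached bytes.
  byte[] read(String location) {
    return contents.computeIfAbsent(location, path -> {
      loads.incrementAndGet();
      return loader.apply(path);
    });
  }

  // Number of times the underlying loader was actually invoked.
  int loadCount() {
    return loads.get();
  }
}
```

This sketch is unbounded and ignores memory pressure, so it only shows the consistency side of the question, not the sizing side.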
One way to avoid churn / pressure in the cache is to detect the reason for the most recent cache evictions. Say, if CachingFileIO tends to evict most entries in...
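The eviction-reason idea above could be sketched roughly as follows. This is a plain-JDK LRU built on `LinkedHashMap`, not the Guava-based cache from this PR; `EvictionTrackingCache`, `Cause`, and `sizePressureRatio()` are hypothetical names chosen for illustration. It counts evictions by cause so a caller can tell whether entries are mostly being pushed out by the size limit.

```java
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: an LRU cache that records why each entry was removed.
class EvictionTrackingCache<K, V> {
  enum Cause { SIZE, EXPLICIT }

  private final int maxEntries;
  private final Map<Cause, Integer> evictions = new EnumMap<>(Cause.class);
  private final LinkedHashMap<K, V> entries;

  EvictionTrackingCache(int maxEntries) {
    this.maxEntries = maxEntries;
    // accessOrder=true keeps the least recently used entry first.
    this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        boolean overflow = size() > EvictionTrackingCache.this.maxEntries;
        if (overflow) {
          noteEviction(Cause.SIZE);
        }
        return overflow;
      }
    };
  }

  synchronized V get(K key) {
    return entries.get(key);
  }

  synchronized void put(K key, V value) {
    entries.put(key, value);
  }

  // Removes an entry on request and records it as an explicit eviction.
  synchronized void invalidate(K key) {
    if (entries.remove(key) != null) {
      noteEviction(Cause.EXPLICIT);
    }
  }

  synchronized int evictionCount(Cause cause) {
    return evictions.getOrDefault(cause, 0);
  }

  // Fraction of all evictions caused by the size limit; 0.0 if none happened.
  synchronized double sizePressureRatio() {
    int size = evictionCount(Cause.SIZE);
    int total = size + evictionCount(Cause.EXPLICIT);
    return total == 0 ? 0.0 : (double) size / total;
  }

  private void noteEviction(Cause cause) {
    evictions.merge(cause, 1, Integer::sum);
  }
}
```

A caller could stop admitting new entries, or only admit small files, once the ratio stays high. Guava's `RemovalListener` reports a `RemovalCause` per removal, so a similar counter could likely be hung off the existing cache instead of a hand-rolled map.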
Hi @danielcweeks, thank you for the update. I will check PR #5207 and study how we can use it. So with that PR, am I correct that we should...