Maytas Monsereenusorn

Results: 10 comments by Maytas Monsereenusorn

Hi @JulianJaffePinterest, I saw that https://github.com/apache/druid/pull/11474 and https://github.com/apache/druid/pull/11823 were already merged into `apache:spark_druid_connector` and that this PR (write support) is the only piece left. My aim is to review this...

@jon-wei @capistrant Is this PR good to merge? @jon-wei, I tried clicking on https://github.com/apache/druid/pull/12599/files#r940734962 in your previous comment, but I'm not seeing any comment when I open the link.

Should the `used_flag_last_updated` column be part of the index in the segment table?

I think it's worth reopening this issue. Even if you increase the maximum number of open file descriptors at the system level, opening many tmp files can cause your historical to OOM....

On a cluster with 600k active segments, this patch reduces the time to build the timeline from 160,000 ms to 2,000 ms. On a cluster with ~7 million active segments, this patch reduces...

One more thing to add to (2) above: the PR https://github.com/apache/druid/pull/14533 may also help with (2). This may help if the supervisor is configured with a lot of...

+1 on this feature. @abhishekagarwal87 and I had this discussion a while back at https://github.com/apache/druid/pull/14424#issuecomment-1933738231. Looking forward to the PR!

@abhishekagarwal87 This idea came from my discussion with @gianm, who suggested this change (see: https://apachedruidworkspace.slack.com/archives/C030CMF6B70/p1746578390954639?thread_ts=1745436989.786489&cid=C030CMF6B70). We have also observed some queries that take on average 1-2 minutes to process...

It seems like we used to have something like https://github.com/apache/iceberg-python/commit/4f0a5c6203888ff105c1f09f41c17245f477d2ab, but it's gone? @Fokko @TGooch44

@Fokko Thanks for getting back to me. I can look into contributing. I am not too familiar with the new pyiceberg rewrite (the current state of this library) but was wondering...