Denis Krivenko

Results 49 comments of Denis Krivenko

@zsxwing It takes ~2h or more for the highest `memoryOverheadFactor`.

Heap dump ~20 minutes after start:

```
Class Name                           | Objects | Shallow Heap | Retained Heap
-----------------------------------------------------------------------------------
org.apache.hadoop.conf.Configuration |     130 |        6,240 | >= 46,086,424
-----------------------------------------------------------------------------------
```

Heap dump...
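For context, a histogram like the one above can be produced by capturing a dump with the stock JDK tooling and opening it in a heap analyzer such as Eclipse MAT. A minimal sketch; the pid (`12345`) and output path are placeholders, not values from the original report:

```shell
# List running JVMs to find the Spark driver/executor pid (placeholder: 12345)
jps -lm

# Dump only live objects (forces a full GC first) in binary hprof format
jmap -dump:live,format=b,file=/tmp/executor.hprof 12345
```

The resulting `.hprof` file can then be loaded into Eclipse MAT, whose dominator-tree view reports the "Shallow Heap" and "Retained Heap" columns shown above.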

I don't see anything else for `DeserializedMemoryEntry` objects

Inside `DeserializedMemoryEntry` I found:

1. Four `java.lang.Object[1]`

```
Type | Name | Value
---------------------------------------------------------------------------------
ref  | [0]  | org.apache.spark.sql.execution.joins.UnsafeHashedRelation @ 0x6ef5c50c8
ref  | [0]  | org.apache.spark.sql.execution.joins.LongHashedRelation @ 0x6ef672460
ref  | [0]  | org.apache.spark.sql.execution.joins.UnsafeHashedRelation @ 0x6f015ebc0
ref  | [0]  | org.apache.spark.sql.execution.joins.LongHashedRelation @ 0x6f630fe38...
```

Unfortunately yes, there is sensitive data there. Actually, I had the same idea of sharing the heap dump with you. I'm going to reproduce the issue with public data (NY taxi...

@zsxwing Did you have a chance to look at the heap dump I provided? Is there anything else I can do to help find the cause of the issue?

> Sorry for the delay. I didn't find any clue. The heap memory seems normal. It's pretty small. Did you turn on `spark.memory.offHeap.enabled`?

No, I didn't set `spark.memory.offHeap.enabled` explicitly, so...

I've just tested it on `Spark 3.3.1` + `Delta Lake 2.1.0` and also set `spark.memory.offHeap.enabled` to `false` and `spark.databricks.delta.snapshotCache.storageLevel` to `DISK_ONLY`, but the issue still exists. Proofs (mostly for myself):...
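The configuration described in this test can be passed as plain `spark-submit` flags. A sketch only: the Delta property name comes from the comment itself, the Delta artifact coordinate matches `Delta Lake 2.1.0` on Scala 2.12, and `my_job.py` is a placeholder for the actual job:

```shell
spark-submit \
  --packages io.delta:delta-core_2.12:2.1.0 \
  --conf spark.memory.offHeap.enabled=false \
  --conf spark.databricks.delta.snapshotCache.storageLevel=DISK_ONLY \
  my_job.py
```

`DISK_ONLY` here forces the Delta snapshot cache off the JVM heap onto local disk, which is why it is a useful variable to isolate when chasing a suspected heap/off-heap leak.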

> @politician Please consider that adding another release type requires official support from our side and would also require a lot of tests, and also updated release scripts. It is...

@rongrong Hi! Could you please comment on the PR?