Use classloader of current context for kryo
I tried using Externalizer in Spark and it does not work: I get "Unable to find class" errors when deserializing with Kryo. It seems to be a classloader issue. Someone else reported a similar issue here: http://mail-archives.apache.org/mod_mbox/spark-user/201406.mbox/%3CCAO1tvKTN27i_DaWKeyCcXehe-+z-KnRH0Uif1CVLsVXJoyqzXg@mail.gmail.com%3E
Now, Spark itself uses chill. See its KryoSerializer: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala
What they seem to do differently is:

```scala
def newKryo(): Kryo = {
  val instantiator = new EmptyScalaKryoInstantiator
  val kryo = instantiator.newKryo()
  val classLoader = Thread.currentThread.getContextClassLoader
  // ...
  kryo.setClassLoader(classLoader)
  kryo
}
```
So I suspect it is the `kryo.setClassLoader(Thread.currentThread.getContextClassLoader)` call that makes things work in Spark.
Any reason why we would not want to use the current thread's classloader by default for a new Kryo? Or, if that is too big a change, then just for Externalizer? What are the downsides? I don't understand much about these d%^*m classloaders, so I might be completely off...
I think using the current thread's classloader should probably be the default. Happy to accept a PR on that one.
I ran into the same issue. I am trying to use Externalizer for wrapping an object in my Spark 1.3.0 job so as to make the closure serializable. If the KryoPool could be configured to use Thread.currentThread.getContextClassLoader, that would solve this problem. Is it possible to do it in chill version 0.5.0?
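For what it's worth, something along these lines might work as a workaround without waiting for a new chill release, since `KryoInstantiator` in chill-java already has a `setClassLoader` builder method. A minimal sketch (assumes chill is on the classpath; the pool size of 10 is arbitrary):

```scala
import com.twitter.chill.{KryoPool, ScalaKryoInstantiator}

// Sketch: build a KryoPool whose Kryo instances resolve classes through
// the current thread's context classloader instead of Kryo's own loader.
val instantiator = (new ScalaKryoInstantiator)
  .setClassLoader(Thread.currentThread.getContextClassLoader)

// withByteArrayOutputStream(poolSize, instantiator) is the usual factory
val pool = KryoPool.withByteArrayOutputStream(10, instantiator)
```

Note that `setClassLoader` is called at pool-construction time, so this captures whichever context classloader is current on the constructing thread; whether that is the right loader on a Spark executor depends on where the pool is built.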
We would happily take a PR to make this the default setting in chill going forward, though I don't think Externalizer exposes an interface to override it now.
@ianoc it does:
https://github.com/twitter/chill/blob/develop/chill-scala/src/main/scala/com/twitter/chill/Externalizer.scala#L91
But we should also default to doing the right thing, I guess, which is using the current thread classloader.
Hi, not sure if this comment is relevant, but on Kryo 4.0.1, chill 0.8.0, and Spark 2.1.0 the executor still can't handle meatlocking with Externalizer when deploying in local mode with spark-submit: I get a class lookup failure on the wrapped domain class. Dropping the meatlock makes it work. Thanks,
@rvvincell Could you clarify? You're using the Externalizer manually in Spark and it fails? Do you have a stack trace or anything else? Though I think you probably want a new issue here: chill has been using the current thread's classloader (this issue) since before 0.8.0.