Dispose JvmBridge correctly

Open imback82 opened this issue 7 years ago • 1 comments

Currently, a JvmBridge instance is a static member of the SparkEnvironment class. Short of forcing users to call something like SparkEnvironment.JvmBridge.Dispose() in their applications, there is no clean way to dispose of the JvmBridge, so the Scala side has to handle the disconnect gracefully (#121).

One approach to address this issue is to have a ref-counted SparkSession where JvmBridge.Dispose() is called when the last SparkSession object is disposed.

  • Note that the following should be handled:
using (var spark = SparkSession.Builder().GetOrCreate())
{
    // do something
}

// New JvmBridge should be instantiated with the following.
using (var spark = SparkSession.Builder().GetOrCreate())
{
    // do something
}
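
To make the ref-counting idea concrete, here is a minimal sketch. It does not use the real Microsoft.Spark types: `FakeBridge` and `RefCountedSession` are hypothetical stand-ins for JvmBridge and SparkSession, and the counters exist only to make the behavior observable. It assumes the bridge is created lazily by the first session and torn down by the last one, so a second `using` block gets a fresh bridge.

```csharp
using System;

// Hypothetical stand-in for JvmBridge; not the real Microsoft.Spark type.
sealed class FakeBridge : IDisposable
{
    public static int Created;   // how many bridges were instantiated
    public static int Disposed;  // how many bridges were disposed

    public FakeBridge() => Created++;
    public void Dispose() => Disposed++;
}

// Hypothetical ref-counted session (assumption: the real SparkSession
// is not implemented this way today).
sealed class RefCountedSession : IDisposable
{
    private static readonly object s_lock = new object();
    private static FakeBridge s_bridge;
    private static int s_refCount;
    private bool _disposed;

    private RefCountedSession() { }

    public static RefCountedSession GetOrCreate()
    {
        lock (s_lock)
        {
            // The first live session creates the shared bridge.
            if (s_refCount == 0)
            {
                s_bridge = new FakeBridge();
            }
            s_refCount++;
            return new RefCountedSession();
        }
    }

    public void Dispose()
    {
        lock (s_lock)
        {
            if (_disposed) return;   // make repeated Dispose() calls harmless
            _disposed = true;

            // The last live session disposes the shared bridge.
            if (--s_refCount == 0)
            {
                s_bridge.Dispose();
                s_bridge = null;
            }
        }
    }
}

static class Program
{
    static void Main()
    {
        // Two overlapping sessions share one bridge.
        using (var outer = RefCountedSession.GetOrCreate())
        using (var inner = RefCountedSession.GetOrCreate())
        {
        }

        // A later session gets a new bridge.
        using (var again = RefCountedSession.GetOrCreate())
        {
        }

        Console.WriteLine($"{FakeBridge.Created} {FakeBridge.Disposed}"); // prints "2 2"
    }
}
```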

One issue with relying on SparkSession is that a few classes, such as SparkConf and Builder, access the JvmBridge directly from SparkEnvironment. These classes do not implement IDisposable (to stay consistent with the Scala Spark API), so it is harder to enforce cleanup of the JvmBridge if a user does the following:

public static void Main(string[] args) {
    var conf = new SparkConf();
    // exits Main without creating SparkSession.
}

cc: @rapoth @stephentoub

imback82 avatar Jan 16 '19 00:01 imback82

One possible approach to address this issue is to create a new class that wraps the JvmBridge instance and implements IDisposable. This wrapper class can then be used to manage the lifecycle of the JvmBridge instance and ensure that it is properly disposed of when it is no longer needed.
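
A rough sketch of what such a wrapper might look like, under stated assumptions: `StubBridge` is a hypothetical stand-in for JvmBridge and `BridgeOwner` is an illustrative name, neither of which exists in the library.

```csharp
using System;

// Hypothetical stand-in for JvmBridge; not the real Microsoft.Spark type.
sealed class StubBridge : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

// Hypothetical wrapper that owns the bridge and ties its lifetime to Dispose().
sealed class BridgeOwner : IDisposable
{
    private StubBridge _bridge = new StubBridge();

    // Accessing the bridge after disposal fails loudly instead of
    // handing out a dead connection.
    public StubBridge Bridge =>
        _bridge ?? throw new ObjectDisposedException(nameof(BridgeOwner));

    public void Dispose()
    {
        // Safe to call more than once: the second call sees null and does nothing.
        _bridge?.Dispose();
        _bridge = null;
    }
}

static class Program
{
    static void Main()
    {
        StubBridge bridge;
        using (var owner = new BridgeOwner())
        {
            bridge = owner.Bridge;
        }
        Console.WriteLine(bridge.IsDisposed); // prints "True"
    }
}
```

A wrapper like this still depends on the caller disposing it, so by itself it would not cover the case where Main exits after only creating a SparkConf.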

@rapoth can I work on it?

Pheewww avatar Feb 25 '23 17:02 Pheewww