Snowpark support
It would be nice to have a native Snowflake version of Zingg that can run without Spark. After a first look at Snowpark, here are my initial thoughts:
- Need to figure out the Java/Scala integration, as Snowpark is in Scala. It should be doable, but some calls may need type conversions.
- Snowpark doesn't have GraphFrames and MLlib equivalents
- The Dataset API looks similar but there may be gaps
- The pipe abstraction will no longer be needed, as we will assume the data is already in Snowflake
- The interactive learner needs to be thought through. Where will its interface lie?
- SparkContext and related classes need to change
Our approach could be:
- adapter pattern
- parameterization of classes for both types
- new code base (NO!!!)
- some other approach
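To make the adapter option a bit more concrete, here is a minimal sketch. All names (`ZTable`, `InMemoryEngineFrame`, `InMemoryTableAdapter`, `TrainingDataFinder`) are hypothetical and not existing Zingg classes, and an in-memory list stands in for the Spark or Snowpark dataset so the example runs without either library. The idea is only that flow code talks to a small engine-neutral interface, and each engine gets its own adapter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical engine-neutral view of a table; not an existing Zingg interface.
interface ZTable {
    long count();
    List<String> columns();
    ZTable select(String... columns);
}

// Stand-in for an engine-specific dataset. A real adapter would wrap the
// Spark or Snowpark object here instead of an in-memory list of rows.
final class InMemoryEngineFrame {
    final List<String> columns;
    final List<Map<String, Object>> rows;

    InMemoryEngineFrame(List<String> columns, List<Map<String, Object>> rows) {
        this.columns = columns;
        this.rows = rows;
    }
}

// Adapter: exposes the engine-specific frame through the neutral interface.
final class InMemoryTableAdapter implements ZTable {
    private final InMemoryEngineFrame frame;

    InMemoryTableAdapter(InMemoryEngineFrame frame) {
        this.frame = frame;
    }

    @Override
    public long count() {
        return frame.rows.size();
    }

    @Override
    public List<String> columns() {
        return frame.columns;
    }

    @Override
    public ZTable select(String... cols) {
        // Project every row down to the requested columns.
        List<Map<String, Object>> projected = new ArrayList<>();
        for (Map<String, Object> row : frame.rows) {
            Map<String, Object> out = new LinkedHashMap<>();
            for (String c : cols) {
                out.put(c, row.get(c));
            }
            projected.add(out);
        }
        return new InMemoryTableAdapter(
            new InMemoryEngineFrame(Arrays.asList(cols), projected));
    }
}

// Flow code written once against ZTable, with no engine imports.
final class TrainingDataFinder {
    static long countCandidates(ZTable input, String... fields) {
        return input.select(fields).count();
    }
}
```

With this shape, a Spark adapter and a Snowpark adapter would each implement `ZTable`, and something like `TrainingDataFinder` would not need to change. The gaps in the Dataset API mentioned above would show up as methods that one of the adapters has to emulate.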
I will try to form an opinion on this after checking one flow (findTrainingData?) to see what needs to be done to make it Snowpark compatible. Will jot down findings here.
pom - should we do shimming?
Client.java - JavaSparkContext.jarOfClass(IZinggFactory.class); we should be able to move this elsewhere.
FieldDefinition.java - the datatype is Spark based. Can it be parameterized? Pipe also refers to DataType and StructType; the Snowpark client has corresponding classes.
Util.java - can be cleaned up.
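A rough sketch of what parameterizing the field definition on the engine's data type could look like. Everything here is an assumption for illustration: `ParamFieldDefinition`, `TypeResolver`, and the two enums are made-up names, and the enums only stand in for the Spark and Snowpark data type classes so that the example compiles on its own.

```java
import java.util.Locale;

// Stand-ins for the engines' own data type classes (hypothetical).
enum SparkLikeType { STRING, DOUBLE }
enum SnowparkLikeType { STRING, DOUBLE }

// Maps a type name from the config to an engine-specific type (hypothetical).
interface TypeResolver<T> {
    T fromName(String typeName);
}

final class SparkLikeResolver implements TypeResolver<SparkLikeType> {
    @Override
    public SparkLikeType fromName(String typeName) {
        return SparkLikeType.valueOf(typeName.toUpperCase(Locale.ROOT));
    }
}

final class SnowparkLikeResolver implements TypeResolver<SnowparkLikeType> {
    @Override
    public SnowparkLikeType fromName(String typeName) {
        return SnowparkLikeType.valueOf(typeName.toUpperCase(Locale.ROOT));
    }
}

// Field definition parameterized on the engine's data type T, so the class
// itself carries no Spark or Snowpark imports.
final class ParamFieldDefinition<T> {
    private final String fieldName;
    private final T dataType;

    ParamFieldDefinition(String fieldName, String typeName, TypeResolver<T> resolver) {
        this.fieldName = fieldName;
        this.dataType = resolver.fromName(typeName);
    }

    String getFieldName() {
        return fieldName;
    }

    T getDataType() {
        return dataType;
    }
}
```

The same config entry, say a field `fname` of type `string`, would then resolve to a different type object depending on which resolver is plugged in.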
Initial Java and Scala files to test the basics of Snowpark: MainTest.java.txt, Main.scala.txt
Hey @sonalgoyal @navinrathore, looking forward to using Zingg natively on top of Snowflake. Any timelines you have in mind for this feature?
That is great to hear, @nipunj15. Sorry, we do not have a timeline yet.
Snowflake is now natively supported in Zingg Enterprise.