Thomas Aynaud
Hello, is sparkit-learn available on all the workers? It seems that you have it on your master, but Spark does not find splearn on the workers. Best
I do not know; the issue appears randomly and I have not reproduced it on my cluster. I have added Spark 2.0 to CI in #71 but as it is...
According to the Apache JIRA, it is still an issue in PySpark 2.0.2.
Hello, it is not yet released; you need to install the latest master: `pip install git+https://github.com/lensacom/sparkit-learn.git`
It depends on your data, but be careful: n_estimators is misleading if you come from scikit-learn. It will learn n_estimators times the number of partitions. This is because this implementation in fact trains...
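To make the arithmetic in the comment above concrete, here is a minimal sketch that takes the statement at face value (total estimators = n_estimators times the number of partitions). The helper names `total_estimators` and `per_partition_estimators` are hypothetical and are not part of sparkit-learn.

```python
def total_estimators(n_estimators, num_partitions):
    """Estimators learned overall, assuming each partition
    contributes n_estimators of them (as stated in the comment)."""
    return n_estimators * num_partitions


def per_partition_estimators(target_total, num_partitions):
    """Smallest n_estimators that reaches at least target_total
    estimators overall (ceiling division, never below 1)."""
    return max(1, -(-target_total // num_partitions))


if __name__ == "__main__":
    # With 8 partitions, n_estimators=10 gives 10 * 8 estimators overall.
    print(total_estimators(10, 8))
    # To get roughly 100 estimators overall on 8 partitions:
    print(per_partition_estimators(100, 8))
```

In other words, if you are used to scikit-learn and want about the same forest size, you would probably divide your usual n_estimators by the number of partitions of the RDD.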
SparkRandomForestClassifier expects a DictRDD with an "X" ArrayRDD and a "y" ArrayRDD. An ArrayRDD is an RDD of NumPy arrays, so you have to build a small job to transform...
Hello, my code was an example; your model really needs to fit into my workflow, with a big SparkPipeline wanting a dictionary as input. You need to write the code to convert...
Does your model expect to fit on an array of dicts? If not, you have to build an RDD containing acceptable input for your model.
Hello, for the moment I have no plan to adapt to directed graphs. The meaning of modularity is not very clear for a directed graph, and I do not know a "standard"...
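For context on the modularity remark above, here is a minimal sketch of the usual undirected, unweighted modularity, Q = sum over communities of (L_c / m) - (d_c / 2m)^2, where L_c is the number of edges inside community c, d_c the sum of degrees of its nodes and m the total number of edges. It is a plain-Python illustration, not the python-louvain implementation, and the function name `undirected_modularity` is hypothetical. Note that it relies on a single degree per node; a directed graph has separate in- and out-degrees, which is one reason the definition is less settled there.

```python
from collections import defaultdict


def undirected_modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph without edges")

    internal = defaultdict(int)  # L_c: edges with both ends in community c
    degree = defaultdict(int)    # d_c: sum of node degrees in community c
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1

    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (2, 3).
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    print(undirected_modularity(edges, community))
```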
Related issue: https://github.com/taynaud/python-louvain/issues/58