Luca Canali comments

Results: 37 comments by Luca Canali

[SPARK-38098][PYTHON] Add support for ArrayType of nested StructType to arrow-based conversion

I must say I am a bit puzzled by the error found in test_pandas_array_struct, as I cannot reproduce it on my test system. When I run `python/run-tests --modules pyspark-sql --testnames pyspark.sql.tests.test_pandas_udf_scalar`...

[SPARK-38098][PYTHON] Add support for ArrayType of nested StructType to arrow-based conversion

This should be good to go now, @HyukjinKwon ?

Memory usage

This is fixed in sparkMeasure v0.21, which introduced executor metrics collection and the reports:

```
(scala)> stageMetrics.printMemoryReport
(python)> stagemetrics.print_memory_report()
```

[SPARK-34265][PYTHON][SQL] Instrument Python UDFs using SQL metrics

@HyukjinKwon thanks for the review. Indeed I agree this needs to be checked on the "SQL side" too. I have just pushed a small extension to address the case of...

[SPARK-34265][PYTHON][SQL] Instrument Python UDFs using SQL metrics

@HyukjinKwon I see that a particular query in SQLQueryTestSuite.udf/postgreSQL/udf-aggregates_part3.sql seems to have a problem with this PR. I am struggling to understand why. It looks to be related to the...

[SPARK-34265][PYTHON][SQL] Instrument Python UDFs using SQL metrics

I understand from @HyukjinKwon's comment on January 18 that more people with expertise in Spark's use of Python and SQL should review this. @cloud-fan, @maryannxue, @viirya @ueshin @BryanCutler...

[SPARK-34265][PYTHON][SQL] Instrument Python UDFs using SQL metrics

The issue with SQLQueryTestSuite.udf/postgreSQL/udf-aggregates_part3.sql should be fixed now. I have also extended the instrumentation to applyInPandasWithState, recently introduced in SPARK-40434.

[SPARK-34265][PYTHON][SQL] Instrument Python UDFs using SQL metrics

Thank you @cloud-fan !

can't find spark-measure 0.21

I confirm that this is an annoying issue: somehow the pom file did not get to the Maven repos for version 0.21. There does not seem to be a fundamental reason...

can't find spark-measure 0.21

This is now fixed in sparkMeasure v0.22. See: https://repo1.maven.org/maven2/ch/cern/sparkmeasure/spark-measure_2.12/0.22/spark-measure_2.12-0.22.pom
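
To check whether the pom file for a given release has reached Maven Central, it can help to build the expected URL from the Maven coordinates. The sketch below assumes the standard Maven repository layout (dots in the group id become path separators). The helper name `maven_pom_url` is hypothetical and is not part of sparkMeasure.

```python
def maven_pom_url(group_id, artifact_id, version,
                  base="https://repo1.maven.org/maven2"):
    """Build the expected POM URL for a Maven artifact (illustrative helper)."""
    # In the standard repository layout, the group id maps to a directory path.
    group_path = group_id.replace(".", "/")
    return f"{base}/{group_path}/{artifact_id}/{version}/{artifact_id}-{version}.pom"


# Coordinates taken from the URL quoted in the comment above.
url = maven_pom_url("ch.cern.sparkmeasure", "spark-measure_2.12", "0.22")
print(url)
```

Opening the printed URL in a browser (or fetching it with any HTTP client) shows whether the pom is actually published for that version.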