Results 17 issues of Edwin Chan

Overview: [Spark Development Strategy](https://github.com/pandas-profiling/pandas-profiling/wiki/Spark-Development-Plan) Branch: spark-branch Context: We need to test the Spark branch on production-grade clusters with large amounts of data to understand the performance metrics...

performance 🚀
help wanted 🙋

Correlation test fixes, as well as one whitespace fix

Overview: [Spark Development Strategy](https://github.com/pandas-profiling/pandas-profiling/wiki/Spark-Development-Plan) Branch: spark-branch Feature: Three types of correlation (Cramér's V, Kendall's correlation, and Phi-K) are implemented in pandas-profiling, but not in spark-profiling. We...

help wanted 🙋
Hacktoberfest 🎆
spark-enhancement 📈
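To make the first of those correlations concrete, here is a minimal sketch of the plain (uncorrected) Cramér's V statistic in pure Python. It is not the pandas-profiling or spark-profiling implementation, which may apply a bias correction; the function name `cramers_v` is hypothetical. The statistic only needs joint and marginal category counts, which is what would make a distributed version feasible.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length categorical sequences.

    Hypothetical helper for illustration only.
    """
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("inputs must be non-empty and of equal length")

    joint = Counter(zip(xs, ys))  # observed counts per (x, y) pair
    row = Counter(xs)             # marginal counts of x
    col = Counter(ys)             # marginal counts of y

    k = min(len(row), len(col)) - 1
    if k == 0:
        # A variable with a single category carries no association signal.
        return 0.0

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    return sqrt(chi2 / (n * k))
```

Two perfectly aligned columns give a value of 1, and two columns whose categories are evenly crossed give 0.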

This feature creates a GlobalPersistHandler that handles all persist and unpersist calls within spark-profiling, so that it is easy to clean up after profiling and we can persist objects we want...
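The idea of a single handler that tracks persisted objects can be sketched as follows. Only the class name comes from the description above; the method names `persist` and `unpersist_all` and the `FakeFrame` stand-in are assumptions for illustration, not the PR's actual code. The sketch relies only on the tracked object exposing `persist()` and `unpersist()`, as Spark DataFrames do.

```python
class GlobalPersistHandler:
    """Tracks every object it persists so all can be released in one call.

    Method names are hypothetical; the real PR may differ.
    """

    def __init__(self):
        self._persisted = []

    def persist(self, obj):
        # Delegate to the object's own persist() and remember it for cleanup.
        obj.persist()
        self._persisted.append(obj)
        return obj

    def unpersist_all(self):
        # Release in reverse order of persistence until nothing is tracked.
        while self._persisted:
            self._persisted.pop().unpersist()


class FakeFrame:
    """Stand-in for a Spark DataFrame, used only to demonstrate the handler."""

    def __init__(self):
        self.cached = False

    def persist(self):
        self.cached = True
        return self

    def unpersist(self):
        self.cached = False
        return self


handler = GlobalPersistHandler()
frame = handler.persist(FakeFrame())
handler.unpersist_all()
```

Routing every cache operation through one object means cleanup after profiling is a single call rather than a search for stray cached frames.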

Overview: [Spark Development Strategy](https://github.com/pandas-profiling/pandas-profiling/wiki/Spark-Development-Plan) Branch: spark-branch Context: Spearman correlations are a key part of pandas-profiling and help elucidate rank-based correlation statistics. Problem: Spark docs mention...

getting started ☝
help wanted 🙋
spark-enhancement 📈
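For readers unfamiliar with what "rank-based" means here: Spearman's correlation is the Pearson correlation computed on the ranks of the values rather than on the values themselves. The following pure-Python sketch illustrates that definition only; it says nothing about how Spark computes it, and all function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank-transformed inputs."""
    return pearson(average_ranks(xs), average_ranks(ys))
```

Because only ranks matter, any strictly increasing relationship (for example `y = x**2` on positive `x`) yields a coefficient of 1 even though it is not linear.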

Branch: spark-branch Context: Spark tests should run successfully on GitHub. Problem: Tests on spark-branch are currently failing; see https://github.com/pandas-profiling/pandas-profiling/runs/4126235396?check_suite_focus=true

spark-enhancement 📈

Overview: [Spark Development Strategy](https://github.com/pandas-profiling/pandas-profiling/wiki/Spark-Development-Plan) Branch: spark-branch Feature: Profiling pandas DataFrames gets you nice [histograms](https://github.com/pandas-profiling/pandas-profiling/blob/abaa9bc44545c874e7a6512f30e5894f6eb127be/src/pandas_profiling/model/summary_algorithms.py#L29) (drawn natively in pandas). We need to do the same, but instead of...

spark-enhancement 📈

Overview: [Spark Development Strategy](https://github.com/pandas-profiling/pandas-profiling/wiki/Spark-Development-Plan) Branch: spark-branch Feature: Better characterization of categorical features. pandas-profiling supports some categorical features that spark-profiling does not currently...

spark-enhancement 📈

Branch: spark-branch Context: Certain values are not computed but are expected by the formatter (see [frequency_table_utils.py](https://github.com/pandas-profiling/pandas-profiling/blob/4366508a6d197e88a90f35bb7749447cc44a2bd9/src/pandas_profiling/report/presentation/frequency_table_utils.py#L27)). Problem: When variables are expected by the formatter but not present, "Warning:...

bug 🐛
code quality 📈
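One common way to handle values that a formatter expects but that were never computed is to fill explicit defaults before formatting and report which keys were missing. The sketch below shows that general pattern only; it is not the fix adopted in the project, and the helper name `with_defaults` and the key names in the usage lines are hypothetical.

```python
def with_defaults(summary, expected_keys, default=None):
    """Return a copy of `summary` with every expected key present.

    Also returns the list of keys that had to be filled, so the caller
    can log them once instead of failing inside the formatter.
    """
    missing = [key for key in expected_keys if key not in summary]
    completed = dict(summary)  # shallow copy; the input is left untouched
    for key in missing:
        completed[key] = default
    return completed, missing


# Hypothetical summary that lacks one value the formatter would look up.
summary = {"n": 100, "n_distinct": 7}
completed, missing = with_defaults(summary, ["n", "n_distinct", "n_missing"], default=0)
```

Returning the missing keys separately keeps the decision about whether to warn, fail, or stay silent with the caller.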