martinedgefocus
```
import pandas as pd
import dask.dataframe as dd
from datetime import date

# A single-partition frame whose column holds datetime.date objects
ddf = dd.from_pandas(pd.DataFrame({'a': [date.today(), date(2022, 6, 13)]}), npartitions=1)
ddf.to_parquet("/tmp/p")
```
This works on 2022.5.2, but a regression in 2022.6.0 breaks it. pyarrow is...
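One possible workaround, not taken from the report above and untested against 2022.6.0, is to convert the `datetime.date` objects to a native pandas datetime dtype before handing the frame to dask, so the column is no longer object dtype. The helper name `dates_to_datetime64` is hypothetical. Note that this would store timestamps rather than plain dates, which may not be acceptable for every schema.

```python
from datetime import date

import pandas as pd


def dates_to_datetime64(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return a copy of df with `column` converted from date objects to a datetime dtype.

    Hypothetical helper for illustration; it is not part of dask or pandas.
    """
    out = df.copy()
    # pd.to_datetime accepts datetime.date objects and yields a datetime64 column
    out[column] = pd.to_datetime(out[column])
    return out


pdf = pd.DataFrame({"a": [date.today(), date(2022, 6, 13)]})
converted = dates_to_datetime64(pdf, "a")
print(converted.dtypes)
```

The converted frame could then be passed to `dd.from_pandas(...)` and `to_parquet(...)` exactly as in the snippet above.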
Note: If I do the below with adapt(minimum=32, maximum=32), it works repeatably with no failures. If I throw ~100 tasks at an AWS EC2Cluster with adapt(minimum=1, maximum=32) enabled. All tasks...
This works from a touch test, will keep working with it here, but interested in any feedback / collaboration. Seems critical to support, as this should be a 4-5x reduction...