
A distributed task scheduler for Dask

Results 530 distributed issues

### Use case 1
A task in flight fails to unpickle when it lands; this triggers a `GatherDepFailureEvent`.
### Use case 2
A task in flight unpickles successfully when it...

deadlock

In https://github.com/dask/distributed/issues/6110#issuecomment-1105837219, we found that workers were running themselves out of memory to the point where the machines became unresponsive. Because the memory limit in the Nanny is implemented [at...

stability
memory

It would be handy to be able to record multiple measures at once in `MemorySampler`. Particularly, recording `process` and `managed_spilled` at the same time gives you a picture of both...

enhancement
diagnostics

This log message is also issued if the worker closes properly. That is a bit confusing. If this happens in a non-closing state, I consider this an error and promoted...

Note: If I do the below with adapt(minimum=32, maximum=32), it works repeatably with no failures. If I throw ~100 tasks at an AWS EC2Cluster with adapt(minimum=1, maximum=32) enabled. All tasks...

```
Aug 11 10:00:05 ip-10-0-3-173 cloud-init[1264]: Exception in callback Worker._handle_stimulus_from_task(
```

bug

I don't fully understand what's going on, but we've seen some fatal Windows errors on CI with this. This _should_ never happen since the merge actually catches a `RecursionError`...

- Related to #5371
- Blocked by #6577

Currently crashes also in the case without AMM. I believe this is due to having 10 nannies on 2-4 CPUs.

```
Aug 11 09:49:57 ip-10-0-12-62 cloud-init[1268]: 2022-08-11 09:49:57,015 - bokeh.core.property.validation - ERROR - 'start'
Aug 11 09:49:57 ip-10-0-12-62 cloud-init[1268]: Traceback (most recent call last):
Aug 11 09:49:57 ip-10-0-12-62 cloud-init[1268]: File...
```

bug