Wenjun Si
## What do these changes do?

Submit subtasks in batches.

## Related issue number

Fixes #3089

## Check code requirements

- [ ] tests added / passed (if needed)...
Implements Kronecker product for tensors, just as `numpy.kron` does. This is required by module `tensorly`.
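As a point of reference for the semantics being matched, the following is a minimal NumPy-only sketch of the Kronecker product for 2-D inputs. The helper name `kron_2d` is hypothetical and is not part of Mars or `tensorly`; it only illustrates what `numpy.kron` computes in the 2-D case.

```python
import numpy as np


def kron_2d(a, b):
    """Kronecker product of two 2-D arrays (illustrative helper, not a Mars API)."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Build a 4-D array where out[i, k, j, l] = a[i, j] * b[k, l],
    # then collapse (i, k) into rows and (j, l) into columns.
    out = a[:, None, :, None] * b[None, :, None, :]
    return out.reshape(a.shape[0] * b.shape[0], a.shape[1] * b.shape[1])


a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])

result = kron_2d(a, b)
reference = np.kron(a, b)  # the behaviour the description says is mirrored
```

Each block `(i, j)` of the result is `a[i, j] * b`, so a chunked implementation could in principle compute blocks independently; whether the actual change does so is not stated here.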
## What do these changes do?

Fix the incorrect redirection of `groupby.nunique` into `transform` by applying the map-combine-agg paradigm.

## Related issue number

Fixes #xxxx

## Check code requirements

- [x] tests...
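To illustrate why a map-combine-agg scheme suits `nunique`, here is a minimal pandas sketch over manually split chunks. The function names `map_chunk`, `combine` and `agg` are hypothetical and do not reflect the actual Mars operand classes; the sketch only assumes that distinct values cannot be counted per chunk and summed, so duplicates must be removed before the final count.

```python
import pandas as pd


def map_chunk(chunk, by, col):
    # Map: keep only the distinct (key, value) pairs seen in this chunk.
    return chunk[[by, col]].drop_duplicates()


def combine(parts):
    # Combine: merge partial results and drop pairs repeated across chunks.
    return pd.concat(parts, ignore_index=True).drop_duplicates()


def agg(combined, by, col):
    # Agg: count distinct values per key on the deduplicated pairs.
    return combined.groupby(by)[col].nunique()


df = pd.DataFrame({"k": ["a", "a", "b", "b", "a", "b"],
                   "v": [1, 1, 2, 3, 2, 3]})
chunks = [df.iloc[:3], df.iloc[3:]]  # stand-in for distributed chunks

parts = [map_chunk(c, "k", "v") for c in chunks]
result = agg(combine(parts), "k", "v")
expected = df.groupby("k")["v"].nunique()
```

Summing per-chunk `nunique` results would overcount values that appear in more than one chunk, which is why the deduplication happens before the final aggregation.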
In the current implementation, `SubtaskExecutionActor.run_subtask` manages the whole lifecycle of a subtask. This makes it impossible to submit subtasks in batches, even though subtask submission is a major time cost for supervisors. As all...
Currently, df.groupby() in Mars implements only ``as_index``. ``sort`` is accepted but not implemented yet, and ``level``, which is also useful, is not implemented at all. What's more, when a ``GroupBy`` object is generated,...
Information to Collect
======================

* Execution details of every job
* Data read amount (with confidence)
* Execution cost (with confidence)
* Network transfer amount (with confidence)
* Peak memory...
Currently, the Mars worker allocates CPUs for tasks via ``DispatchActor``, which allocates CPUs exclusively: when a CPU is allocated to a task, it is removed from...
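The exclusive allocation described above can be pictured with a small sketch. This is not the ``DispatchActor`` code: the class ``ExclusiveSlotPool`` and its methods are hypothetical, and the sketch assumes only that an allocated CPU is unavailable to every other task until it is released.

```python
class ExclusiveSlotPool:
    """Toy model of exclusive CPU slot allocation (hypothetical, not a Mars class)."""

    def __init__(self, n_cpus):
        self._free = list(range(n_cpus))  # slots no task currently holds
        self._owner = {}                  # slot -> task id holding it

    def acquire(self, task_id):
        # Exclusive: a slot handed to one task cannot be given to another.
        if not self._free:
            return None  # caller must wait until a slot is released
        slot = self._free.pop()
        self._owner[slot] = task_id
        return slot

    def release(self, slot):
        # Return the slot so that other tasks can acquire it again.
        del self._owner[slot]
        self._free.append(slot)


pool = ExclusiveSlotPool(n_cpus=2)
s1 = pool.acquire("task-1")
s2 = pool.acquire("task-2")
s3 = pool.acquire("task-3")  # no slot left while both are held
pool.release(s1)
s4 = pool.acquire("task-3")  # succeeds once a slot is freed
```

Under this model a task that holds a slot while waiting on I/O still blocks other tasks from using that CPU, which is the kind of cost an exclusive scheme carries.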
**Is your feature request related to a problem? Please describe.** Some operands, such as summation over a long series of chunks, can be started and run partially when some data...
Support for broadcasting data shared by multiple workers can be added, using strategies such as TorrentBroadcast in Apache Spark.
As more and more libraries drop support for Python 2, it is time to replace the coroutines in Mars with the built-in asyncio library of Python 3.