ThreadTools.jl

Feature request: one thread per block

Open pevnak opened this issue 6 years ago • 1 comment

Dear BaggePinnen,

In https://discourse.julialang.org/t/regular-expression-and-threads/31415/4 I was testing the multi-threaded performance of the Regexp library. I tried tmap, but since it runs one task per item, it is rather wasteful. I therefore divided the work into blocks as

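# Index range of the block assigned to thread `tid` when `n` items are split over `nt` threads.
function getrange(n, tid = Threads.threadid(), nt = Threads.nthreads())
    d, r = divrem(n, nt)
    # The first `r` blocks get one extra item each.
    from = (tid - 1) * d + min(r, tid - 1) + 1
    to = from + d - 1 + (tid ≤ r ? 1 : 0)
    from:to
end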

reduce(vcat, tmap(r -> map(f, items[r]), [getrange(length(items), i) for i in 1:Threads.nthreads()]));

and the improvement in terms of time was massive. What do you think about adding this into your library?
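To make the block arithmetic concrete, here is a small sketch of how getrange splits the indices. It repeats the function from above but passes the thread count explicitly, so the result does not depend on how Julia was started; the values n = 10 and nt = 3 are only illustrative.

```julia
# Same arithmetic as getrange above, with the thread count passed explicitly
# (an assumption made here so the example is reproducible).
function getrange(n, tid, nt)
    d, r = divrem(n, nt)
    from = (tid - 1) * d + min(r, tid - 1) + 1
    to = from + d - 1 + (tid ≤ r ? 1 : 0)
    return from:to
end

# Split 10 items over 3 blocks: the first block takes the one leftover item.
blocks = [getrange(10, tid, 3) for tid in 1:3]
@assert blocks == [1:4, 5:7, 8:10]

# Concatenating the blocks gives every index exactly once, in order.
@assert reduce(vcat, blocks) == collect(1:10)

# With fewer items than blocks, the surplus blocks are empty ranges.
@assert isempty(getrange(2, 4, 4))
```

Because the blocks are contiguous and ordered, concatenating the per-block results with reduce(vcat, ...) keeps the original item order.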

Something along the lines of

function tmap(f, list; blockmode = false)
    if blockmode
        # One contiguous block of indices per thread.
        blocks = [getrange(length(list), i) for i in 1:Threads.nthreads()]
        return reduce(vcat, tmap(r -> map(f, list[r]), blocks; blockmode = false))
    else
        # One task per item.
        tasks = map(list) do l
            Threads.@spawn f(l)
        end
        return fetch.(tasks)
    end
end

We would need to make it more general for multiple arguments.
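One possible way to generalize to multiple arguments is sketched below. The name blockmap and the nblocks keyword are hypothetical and not part of ThreadTools.jl; the sketch assumes all collections have the same length and can be indexed with a range, and it spawns the block tasks directly instead of going through tmap.

```julia
# Block index range, as in getrange above, with the block count passed explicitly.
function getrange(n, tid, nt)
    d, r = divrem(n, nt)
    from = (tid - 1) * d + min(r, tid - 1) + 1
    to = from + d - 1 + (tid ≤ r ? 1 : 0)
    return from:to
end

# Hypothetical helper (not part of ThreadTools.jl): threaded map over one or
# more equally long collections, using one task per block instead of per item.
function blockmap(f, lists...; nblocks = Threads.nthreads())
    n = length(first(lists))
    all(l -> length(l) == n, lists) ||
        throw(DimensionMismatch("all collections must have the same length"))
    tasks = map(1:nblocks) do b
        r = getrange(n, b, nblocks)
        # Each task maps f over the same index block of every collection.
        Threads.@spawn map(f, (l[r] for l in lists)...)
    end
    # Blocks are contiguous and ordered, so vcat restores the original order.
    return reduce(vcat, fetch.(tasks))
end

# Usage: the result should match a plain multi-argument map.
@assert blockmap(+, 1:10, 11:20) == map(+, 1:10, 11:20)
```

Keeping nblocks as a keyword makes it possible to use more blocks than threads when the cost per item is uneven.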

pevnak Nov 23 '19 19:11

That is probably a good idea, feel free to submit a PR if you'd like 🙂

baggepinnen Nov 24 '19 00:11