Aditya Puranik

36 comments by Aditya Puranik

@maleadt I've been having some thoughts on this PR, mainly that this does not reflect the true definition of prefix scan. Scan is supposed to work on any binary associative...

It no longer requires a neutral element. It now computes an exclusive scan, which it didn't earlier. It requires more tests for the new features, but I don't know what kind of tests...
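To make the terms in the two comments above concrete, here is a minimal sequential sketch of an inclusive and an exclusive scan over an arbitrary associative operator. The names `inclusive_scan` and `exclusive_scan` are hypothetical and are not part of the CUDA.jl API. How the PR seeds the first element of the exclusive scan is not shown in the comment, so an explicit `init` argument is assumed here.

```julia
# Sequential reference scans. Hypothetical helpers, not the CUDA.jl implementation.

# Inclusive scan: out[i] = x[1] op x[2] op ... op x[i].
# Only associativity of `op` is relied on; no neutral element is needed.
function inclusive_scan(op, x::Vector)
    out = similar(x)
    isempty(x) && return out
    acc = x[1]
    out[1] = acc
    for i in 2:length(x)
        acc = op(acc, x[i])
        out[i] = acc
    end
    return out
end

# Exclusive scan: out[1] = init, out[i] = init op x[1] op ... op x[i-1].
# Assumption: the caller supplies `init`; it does not have to be a neutral element.
function exclusive_scan(op, x::Vector, init)
    out = similar(x)
    acc = init
    for i in 1:length(x)
        out[i] = acc
        acc = op(acc, x[i])
    end
    return out
end
```

String concatenation (`*` in Julia) is associative but not commutative, so `inclusive_scan(*, ["a", "b", "c"])` is a quick way to check that an implementation preserves operand order.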

Fixed, no races detected anymore. The error was in my usage of `@cuDynamicSharedMem`.

```julia
# new version
julia> A = CUDA.ones(Int128, 100_000_000);
julia> B = similar(A);
julia> CUDA.@time CUDA.scan!(+, B,...
```

The reason for the poor performance turned out to be a missing `@inbounds`. Now the performance is closer to Blelloch, albeit about 7% slower.

```
# current master
julia> @benchmark CUDA.@sync(CUDA.scan!(+, B,...
```

Did an oopsie and removed the `@inbounds` from the `aggregate_partial_scan` function. Now it's performing much better than the current version for both cases.

```
julia> @benchmark CUDA.@sync(CUDA.scan!(+, B, A; dims =...
```
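For readers unfamiliar with `@inbounds`, here is a small CPU-side sketch of what the annotation changes. The function names are hypothetical and this is not the kernel code from the PR. It only shows that the macro elides bounds checks on the indexing expressions it wraps, which is safe only when the indices are known to be valid.

```julia
# Hypothetical illustration of @inbounds; not the PR's GPU kernel.

# Every x[i] here carries a bounds check and a possible error path.
function running_sum_checked!(out::Vector{Float64}, x::Vector{Float64})
    acc = 0.0
    for i in 1:length(x)
        acc += x[i]
        out[i] = acc
    end
    return out
end

# Same loop with bounds checks elided. The explicit length check up front
# is what makes skipping the per-element checks safe.
function running_sum_unchecked!(out::Vector{Float64}, x::Vector{Float64})
    length(out) == length(x) || throw(DimensionMismatch("out and x must have equal length"))
    acc = 0.0
    @inbounds for i in 1:length(x)
        acc += x[i]
        out[i] = acc
    end
    return out
end
```

Both functions compute the same result; only the generated code differs.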

No, I tested on this branch again to be sure and all tests have passed successfully. My tests: [https://gist.github.com/Ellipse0934/2db29bc0602641c4cccd58aadb64fd93](https://gist.github.com/Ellipse0934/2db29bc0602641c4cccd58aadb64fd93)

For the original issue you can use `Adapt.jl` as shown in [the docs](https://cuda.juliagpu.org/stable/tutorials/custom_structs/).

```julia
Adapt.@adapt_structure UnitRange{Int}
```

But the main issue is that you are essentially doing a mapreduce inside...

Would something like this work?

```julia
# https://github.com/JuliaLang/julia/blob/dacd16f068fb27719b31effbe8929952ee2d5b32/stdlib/InteractiveUtils/src/codeview.jl
const llstyle = Dict{Symbol, Tuple{Bool, Union{Symbol, Int}}}(
    :default => (false, :normal), # e.g. comma, equal sign, unknown token
    :comment => (false,...
```

Yes, I missed the tokeniser failures. I have added some additional regex to identify the type of token. Also, some colour changes have been reverted but we will fix that...
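As a rough illustration of classifying tokens by regex, here is a minimal sketch. The pattern table, the token kinds other than `:default` and `:comment`, and the `classify_token` function are all assumptions made for this example. They are not the regexes added in the PR.

```julia
# Hypothetical regex-based token classifier for LLVM-IR-like text.
# Patterns are tried in order; the first match wins.
const TOKEN_PATTERNS = [
    :comment  => r"^;.*",                           # ; comment to end of line
    :variable => r"^[%@][\w.$-]+",                  # %local or @global names
    :number   => r"^-?\d+(\.\d+)?$",                # integer or decimal literal
    :type     => r"^(i\d+|float|double|void|ptr)$", # a few common type names
]

# Return the kind of `tok`, falling back to :default for unknown tokens
# such as commas and equal signs.
function classify_token(tok::AbstractString)
    for (kind, pattern) in TOKEN_PATTERNS
        occursin(pattern, tok) && return kind
    end
    return :default
end
```

An ordered list of patterns keeps the approach minimal. A real tokenizer would also have to split the input into tokens first, which this sketch leaves out.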

I wrote some implementations and got confused as to whether we want to be minimalist here or do something closer to an actual parser with proper tokens. The Julia replacement...