Metal.jl
Metal programming in Julia
Probably already known to the devs, but just for the record: ```julia using Metal, BenchmarkTools N = 10_000_000 a = rand(Float32, N) Ma = MtlArray(a) @btime sum($a) # 757.209...
Hello, thanks for your work. I have a question: how do I change the current device to the AMD Radeon? ``` julia> using Metal julia> devices() 2-element Vector{MtlDevice}: MtlDevice(Intel(R) UHD Graphics 630) MtlDevice(AMD...
```julia julia> Base.unsafe_convert(Ptr{Float32}, MtlArray(rand(Float32, 5))) ERROR: MethodError: unsafe_convert(::Type{Ptr{Float32}}, ::MtlArray{Float32, 1}) is ambiguous. Candidates: unsafe_convert(::Type{Ptr{T}}, a::AbstractArray{T}) where T @ Base pointer.jl:67 unsafe_convert(::Type{Ptr{S}}, a::AbstractArray{T}) where {S, T} @ Base pointer.jl:66 unsafe_convert(::Type{ st...
Metal.jl currently requires:
- Julia 1.8
- macOS 13, providing Metal 3
- a Mac with an M1 device

If people are interested in working on this, some of these...
I am running Julia Version 1.8.0-rc1 (2022-05-27) on OS X 12.4 with an AMD Radeon Pro 5700 XT GPU. ``` julia> a .+ 1 ┌ Warning: Compilation of MetalLib to...
The following code evaluates the performance of copying two 2D square MTL arrays `a` and `b`. It achieves good performance (360 GB/s) using the kernel version...
We used to have the ability to call pre-compiled kernels (e.g. by passing Metal source code to the appropriate API functions), see https://github.com/JuliaGPU/Metal.jl/tree/9afb62460f8005db00dd3ea71a278758853b24e9/examples/driver. That got lost when we started relying...
The following IR, reduced from our test suite, fails under MTL_SHADER_VALIDATOR=1 on macOS Ventura: ```llvm ; ModuleID = 'broken.ll' source_filename = "text" target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024-n8:16:32" target triple = "air64-apple-macosx13.0.0"...
MWE: ``` using Metal function kernel(out::AbstractArray{T}) where T i = thread_position_in_threadgroup_1d() temp = MtlThreadGroupArray(T, 1) @inbounds temp[i] = 42 threadgroup_barrier(Metal.MemoryFlagThreadGroup) @inbounds out[] = temp[] return end function main(T=Int16) out =...