Apparent OpenCL.read error
Consider the code:
using OpenCL
device, context, queue = OpenCL.create_compute_context()
a = rand(Float32, 125356789)
abuf = OpenCL.Buffer(Float32, context, (:r, :copy), hostbuf=a)
b = OpenCL.read(queue, abuf)
isapprox(a, b)
You can see that after reading abuf back into host memory, the last 50 or so entries of b are zeroed out. The vector a is about half a gigabyte, which is well within both my host's and my device's memory. What is going on here? Is this caused by the wrapper, or is it a bug in OpenCL itself?
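To pin down exactly which entries differ, a small helper along these lines might be useful. This is only a sketch: `first_mismatch` and `count_trailing_zero_entries` are hypothetical names, not part of OpenCL.jl, and it assumes `a` and `b` are the two vectors from the example above.

```julia
# Hypothetical helper: index of the first entry where a and b differ,
# or 0 if the two vectors agree everywhere.
function first_mismatch(a, b)
    for i in 1:length(a)
        if a[i] != b[i]
            return i
        end
    end
    return 0
end

# Hypothetical helper: number of consecutive zero entries at the end of v.
function count_trailing_zero_entries(v)
    n = 0
    i = length(v)
    while i >= 1 && v[i] == 0
        n += 1
        i -= 1
    end
    return n
end
```

Calling `first_mismatch(a, b)` and `count_trailing_zero_entries(b)` after the read would show where the corruption starts and how long the zeroed tail is.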
Works for me. What driver are you using and can you reproduce with C?
@yuyichao
By driver version, do you mean the version of OpenCL? I'm using OpenCL 1.2 on an Intel HD Graphics 4000 with 1536MB. Unfortunately, I don't know C well enough to try reproducing it in C.
No, I mean which driver you are using and on what platform.
@yuyichao
I don't really know what a driver is, TBH, or how to find it. I typed device[:driver_version] and it said 1.2. I'm on a mid-2012 MacBook Pro, if that helps. I've searched online and in my Mac's system profile, and so far I haven't found any driver info.
@yuyichao
OK, so I think I may have found it. Here is the info on the driver, I think:
Version: 10.14.73
Last Modified: 4/26/16, 12:39 AM
Bundle ID: com.apple.driver.AppleIntelHD4000Graphics
Loaded: Yes
Get Info String: AppleIntelHD4000Graphics 10.14.73
Obtained from: Apple
Kind: Intel
Architectures: x86_64
64-Bit (Intel): Yes
Location: /System/Library/Extensions/AppleIntelHD4000Graphics.kext
Kext Version: 10.1.4
Load Address: 18446743521850200000
Loadable: Yes
Dependencies: Satisfied
Signed by: Software Signing, Apple Code Signing Certification Authority, Apple Root CA
Has anyone been able to reproduce this yet? It's really frustrating.
Sorry, I don't have a mac available for testing. ~I will see if I can reproduce this with travis.~ Couldn't reproduce it on travis
@esproff Are you using Julia v0.4?
@vchuravy Yes I'm using Julia v0.4.6.
I'm wondering if maybe it has something to do with the max_mem_alloc_size, which for me is 0.4GB. However, it still happens for buffers smaller than this, although I can't reproduce it every time for buffers < 0.4GB the way I can for the above example.
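For reference, the size arithmetic for the example above can be checked in plain Julia. This is a rough sketch: `reported_limit` is an illustrative name holding the approximate 0.4GB figure quoted above, and the exact value would have to come from the device itself (assuming the `device[:max_mem_alloc_size]` query returns it in bytes).

```julia
n = 125356789                        # number of Float32 elements in the example
nbytes = n * sizeof(Float32)         # 4 bytes per element

# Assumption: approximate limit as quoted above (0.4GB), not an exact device value.
reported_limit = 0.4e9

exceeds_limit = nbytes > reported_limit
println(nbytes, " bytes requested; exceeds reported limit: ", exceeds_limit)
```

Under these assumptions the half-gigabyte buffer is larger than the reported per-allocation limit. That may be relevant to the large case, though it would not by itself explain the intermittent failures with smaller buffers.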
I'm curious if you can tell me.
When you allocate an array from host memory to a buffer:
a = rand(Float32, n)
abuf = OpenCL.Buffer(Float32, context, (:rw, :copy), hostbuf=a)
Then I see from looking at the memory in Julia with whos() that the buffer is just a pointer to the array in host memory. So does this buffer get allocated on the device right away, or does it wait until a kernel uses the buffer before transferring it to device memory?
I ask because it seems you can allocate more buffers than your total device memory without an error. But if they were all written to device memory right away, they would have to overwrite each other at some point.
Also, when you don't allocate in host memory first:
abuf = OpenCL.Buffer(Float32, context, (:rw, :copy), hostbuf=rand(Float32,n))
there is no array kept in host memory. Since I can still read it back correctly, it must be allocated in device memory right away, or there would be nowhere to store it.
Can you run the following code to see what devices are available for you?
for platform in OpenCL.platforms()
    @show platform
    @show OpenCL.available_devices(platform)
end
@vchuravy
I ran your code and here is the output:
platform = OpenCL.Platform('Apple' @0x000000007fff0000)
OpenCL.available_devices(platform) = [OpenCL.Device(Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz on Apple @0x00000000ffffffff),OpenCL.Device(HD Graphics 4000 on Apple @0x0000000001024400)]
So basically I have two devices: my CPU and my GPU.
Can you run your code on the CPU device? OpenCL implementations are all over the board, and I wouldn't be surprised if there is a problem with the combination of Apple + Intel GPU.
@vchuravy
Thanks for getting back to me so quickly, and apologies for the delay on my end; it took me a bit to figure out how to set my device to the CPU.
So I set it to the CPU and ran my code example above, and it works fine on the CPU, no problems.
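For anyone else trying this, selecting the CPU explicitly might look roughly like the following. It is a sketch, not necessarily the exact code that was run: it assumes the first platform is the 'Apple' platform and that the CPU is the first entry in its device list, as in the output printed above, and it assumes `OpenCL.Context` and `OpenCL.CmdQueue` accept a device and a context respectively.

```julia
using OpenCL

# Assumption: the first platform is the one listed above, and the CPU is the
# first device it reports (this matches the printed output, but may differ
# on other machines).
platform = first(OpenCL.platforms())
cpu = OpenCL.available_devices(platform)[1]

# Build the context and queue on the CPU instead of the default device.
context = OpenCL.Context(cpu)
queue = OpenCL.CmdQueue(context)

# Same round trip as in the original example.
a = rand(Float32, 125356789)
abuf = OpenCL.Buffer(Float32, context, (:r, :copy), hostbuf=a)
b = OpenCL.read(queue, abuf)
isapprox(a, b)
```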
So you're thinking it might be a compatibility problem between OpenCL, Apple and the Intel GPU?