Andreas Klöckner
I can see the reasoning, and I'd likely take the patch if you make one. Not sure I would want that option documented just yet, since it's obviously inherently dangerous.
Just wanted to report in to say that this hasn't fallen off my radar, but due to tenure-related crunch time at work, I'll have to push this out to mid-May...
From my perspective, the idea is that you turn any subexpression you would like to tag into a substitution rule (using, e.g. `extract_subst`). To be fair, no other type of...
I think it could be taught.

```
extract_subst(knl, "sum(i, *)")
```
> Maybe pycuda can do this automatically?

It certainly tries to: https://github.com/inducer/pycuda/blob/29466d4e93ec20a81ce2534327aed24903c3a2e2/pycuda/driver.py#L13-L59 Could you investigate what's happening with that code on your system?
I'd be happy to take a patch.
It would be possible to work around this by casting the shape input to the constructor to a tuple, but PyCUDA has historically not accepted that, and I am not...
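To make the tuple-casting workaround concrete, here is a minimal sketch of what the conversion could look like on the caller's side. `normalize_shape` is a hypothetical helper written for illustration, not part of PyCUDA, and the rule that a bare integer means a one-dimensional shape is an assumption borrowed from numpy's convention.

```python
from collections.abc import Iterable
from numbers import Integral


def normalize_shape(shape):
    """Return `shape` as a tuple of ints.

    Hypothetical helper for illustration only; not a PyCUDA function.
    """
    # Assumption: a bare integer means a one-dimensional shape, as in numpy.
    if isinstance(shape, Integral):
        return (int(shape),)
    # Strings are iterable too, but they are never a sensible shape.
    if isinstance(shape, (str, bytes)):
        raise TypeError(f"cannot interpret {shape!r} as a shape")
    # Lists, tuples and other iterables of integers become a plain tuple.
    if isinstance(shape, Iterable):
        return tuple(int(dim) for dim in shape)
    raise TypeError(f"cannot interpret {shape!r} as a shape")
```

With this, `normalize_shape([2, 3])` and `normalize_shape((2, 3))` give the same tuple, so the result can be passed to a constructor that only accepts tuples.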
It's true that numpy's behavior serves as a guideline for what PyCUDA should do. If you throw me a pull request, I'll merge it. :-)
- How did you install pycuda?
- Can you run https://github.com/NVIDIA/cuda-samples/tree/master/Samples/matrixMulDrv?
```
import pycuda.driver as cuda
a_gpu = cuda.mem_alloc(64)
```

Try adding `import pycuda.autoinit` before trying to allocate memory.