Andreas Klöckner

957 comments by Andreas Klöckner

I can see the reasoning, and I'd likely take the patch if you make one. Not sure I would want that option documented just yet, since it's obviously inherently dangerous.

Just wanted to report in to say that this hasn't fallen off my radar, but due to tenure-related crunch time at work, I'll have to push this out to mid-May...

From my perspective, the idea is that you turn any subexpression you would like to tag into a substitution rule (using, e.g. `extract_subst`). To be fair, no other type of...

I think it could be taught.

```
extract_subst(knl, "sum(i, *)")
```

> Maybe pycuda can do this automatically?

It certainly tries to: https://github.com/inducer/pycuda/blob/29466d4e93ec20a81ce2534327aed24903c3a2e2/pycuda/driver.py#L13-L59

Could you investigate what's happening with that code on your system?

I'd be happy to take a patch.

It would be possible to work around this by casting the shape input to the constructor to a tuple, but PyCUDA has historically not accepted that, and I am not...
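As a caller-side illustration of the tuple cast mentioned above, here is a minimal sketch. The helper name `normalize_shape` is hypothetical and is not part of PyCUDA; it only shows how a list or a bare integer could be turned into a tuple before being handed to an array constructor.

```python
from numbers import Integral


def normalize_shape(shape):
    """Return `shape` as a tuple of Python ints (hypothetical helper, not PyCUDA API)."""
    if isinstance(shape, Integral):
        # A bare integer is treated as a one-dimensional shape.
        return (int(shape),)
    # Lists, tuples, and other iterables of integers become a plain tuple.
    return tuple(int(dim) for dim in shape)


shape = normalize_shape([4, 3])
```

With this, the cast happens in user code, so the constructor itself only ever sees a tuple.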

It's true that numpy's behavior serves as a guideline for what PyCUDA should do. If you throw me a pull request, I'll merge it. :-)

- How did you install pycuda?
- Can you run https://github.com/NVIDIA/cuda-samples/tree/master/Samples/matrixMulDrv?

```
import pycuda.driver as cuda
a_gpu = cuda.mem_alloc(64)
```

Try adding `import pycuda.autoinit` before trying to allocate memory.
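The suggestion above can be sketched as follows. The wrapper function `alloc_with_autoinit` is a hypothetical name introduced only for this example, and the broad `except` is an assumption made so the snippet degrades gracefully on machines without PyCUDA or a usable GPU.

```python
def alloc_with_autoinit(nbytes):
    """Allocate `nbytes` of device memory, or return None if CUDA is unusable here."""
    try:
        # Importing pycuda.autoinit initializes the driver and creates a
        # context, which mem_alloc needs in order to succeed.
        import pycuda.autoinit  # noqa: F401
        import pycuda.driver as cuda
    except Exception:
        # Broad on purpose (assumption): a missing package, driver, or device
        # can surface as different error types.
        return None
    return cuda.mem_alloc(nbytes)


a_gpu = alloc_with_autoinit(64)
```

In a script that is known to run on a CUDA-capable machine, the two imports would normally just sit at the top of the file without the guard.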