Andreas Klöckner
Tried that: https://github.com/intel/llvm/issues/2038. (Not sure it's the same bug, but still). So far, that hasn't been met with resounding success. But it'd be great to have another viable CL runtime.
@gmagno I think this is a lovely idea. Thanks for working on this! One main question we should address here is which OpenCL version these should target. I'm guessing 2.0...
That's impressive! What would be a reasonable way to make them installable via `pip install (something)` without making them the default?
That's not a bad plan. It would certainly make it easier to adopt PyOpenCL as a dependency for computational work. I'm a little worried about overextending ourselves with maintaining installation...
@isuruf, can you get the time down by dialing down the order? (I suspect you tried; I'm asking just for completeness.) @kaushikcfd, do you have some Firedrake kernels that we could...
@wence- provided some stats [here](https://github.com/inducer/loopy/pull/136#issuecomment-678411180) on a big kernel from Firedrake (after #136 and GitLab !408). My initial read of the stats: * All of loopy takes about 680s. *...
Give me the one(s) you feel are the least reasonable for their size, so that we can get those dealt with first.
Thanks! That helps quite a bit in breaking things down. Better breakdown: Total time: 348s - of which 200s in scheduling (aka linearization), of which 141s in `check_variable_access_ordered` and 47s...
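To put the figures in that breakdown in proportion, here is a minimal sketch that computes each stage's share of its parent. It uses only the three numbers that appear in full (348s total, 200s in scheduling, 141s in `check_variable_access_ordered`); the 47s entry is left out because its label is cut off. The helper name `share` is hypothetical and not part of loopy.

```python
# Timings in seconds, copied from the breakdown in the comment.
total_s = 348
scheduling_s = 200      # scheduling (aka linearization)
access_check_s = 141    # check_variable_access_ordered, part of scheduling


def share(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of each stage relative to its parent and to the total run.
scheduling_of_total = share(scheduling_s, total_s)
access_check_of_scheduling = share(access_check_s, scheduling_s)
access_check_of_total = share(access_check_s, total_s)

print(f"scheduling: {scheduling_of_total}% of total")
print(f"check_variable_access_ordered: {access_check_of_scheduling}% of scheduling")
print(f"check_variable_access_ordered: {access_check_of_total}% of total")
```

On these numbers, scheduling is a bit over half of the total run, and the access-ordering check is the largest single item inside it.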
See also #148.
I sympathize with this. Do you think this should be a global or per-array setting?