Enable stream-ordered callbacks to host functions
The user should be able to call OCCA functions, not just OCCA kernels, via the OCCA API. Of course, I mean OCCA functions (as opposed to kernels) defined in the .okl files.
Presumably this would mean adding a "build function" method to the API, similar to the existing "build kernel" method.
Of course, the return value from the OCCA function should be propagated to application code when the function is called.
We could try to relax the restrictions for individual modes:
- Serial: No need for @outer/@inner
- OpenMP: No need for @inner
- CUDA/OpenCL/HIP: Needs @outer/@inner
In that case, maybe an @function attribute, like you suggested, for functions we want to "export" / make JIT-able. Would that satisfy your use case?
I see what you're saying. I think what you are alluding to is probably too powerful for my needs.
For me, the objective is to keep more of the related logic together in the same .okl file if possible. Frankly, my use case is conveniently computing the sizes of device memory allocations for kernel args (return value and work arrays). A number of my kernels take parameters that affect the format and extent of the returned result data. All I really would like is the ability to invoke a JIT-able kind of function in the .okl file that runs on the host synchronously when invoked in application code using the OCCA API, no matter the OCCA mode. The function's return value should be propagated too. Nothing more than that, really.
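To make the use case concrete, here is a rough sketch of the kind of host-side sizing logic I have in mind. It is plain C++ and does not use any OCCA API. The names `KernelParams`, `resultBytes`, and `workBytes`, as well as the layout rules (gradients adding 3 values per component, one work slot per component per block), are made up purely for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical parameters that shape a kernel's output (illustrative only).
struct KernelParams {
  std::size_t numPoints;      // number of entries the kernel processes
  std::size_t numComponents;  // values written per entry
  bool withGradients;         // assumed: adds 3 gradient values per component
};

// Bytes needed for the kernel's result array.
inline std::size_t resultBytes(const KernelParams& p) {
  assert(p.numComponents > 0);
  const std::size_t valuesPerPoint =
      p.numComponents * (p.withGradients ? 4 : 1);
  return p.numPoints * valuesPerPoint * sizeof(double);
}

// Bytes needed for a scratch work array, assuming one slot per
// component per block of `blockSize` points (rounded up).
inline std::size_t workBytes(const KernelParams& p, std::size_t blockSize) {
  assert(blockSize > 0);
  const std::size_t numBlocks = (p.numPoints + blockSize - 1) / blockSize;
  return numBlocks * p.numComponents * sizeof(double);
}
```

The application would call something like `resultBytes(params)` before allocating device memory and launching the kernel. Keeping such a function next to the kernel in the .okl file means the sizing rules and the code that writes the data cannot drift apart unnoticed.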
So in other words, OCCA has a very nice API that does JIT compilation plus hidden dlopen/dlsym when invoking kernels running async on a device. Please extend the API to handle JIT compilation plus dlopen/dlsym for functions running sync on the host, no matter what the mode, with the code located in the .okl file.
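For reference, the dlopen/dlsym half of that request looks roughly like the sketch below when done by hand with the POSIX `<dlfcn.h>` calls. This is not how OCCA implements it internally; `HostLibrary` and `lookup` are hypothetical names, and the JIT compile step that would produce the shared library is left out.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <stdexcept>
#include <string>

// Hypothetical RAII wrapper around a dlopen handle (not an OCCA class).
class HostLibrary {
 public:
  // An empty path opens the running program itself (dlopen(nullptr, ...)).
  explicit HostLibrary(const std::string& path)
      : handle_(dlopen(path.empty() ? nullptr : path.c_str(), RTLD_NOW)) {
    if (!handle_) {
      const char* err = dlerror();
      throw std::runtime_error(err ? err : "dlopen failed");
    }
  }
  ~HostLibrary() { dlclose(handle_); }
  HostLibrary(const HostLibrary&) = delete;
  HostLibrary& operator=(const HostLibrary&) = delete;

  // Resolve `symbol` and cast it to the caller-supplied function pointer type.
  template <typename Fn>
  Fn lookup(const char* symbol) const {
    assert(symbol != nullptr);
    dlerror();  // clear any stale error state before calling dlsym
    void* addr = dlsym(handle_, symbol);
    if (const char* err = dlerror()) throw std::runtime_error(err);
    return reinterpret_cast<Fn>(addr);
  }

 private:
  void* handle_;
};
```

With a wrapper like this, the call is synchronous and the return value comes straight back to the caller, e.g. `int n = lib.lookup<int (*)(int)>("some_symbol")(42);` where `some_symbol` stands in for whatever the host function in the .okl file would be exported as. On older glibc versions this needs linking with `-ldl`.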
One could call these "host" functions. Their args and variables are all in host address space and all computations they do are on the host.
My original title for this issue was incorrect as it referred to "OCCA function", so I changed it to pertain to "host function". In v0.2, "OCCA function" would mean a function that runs on the device, declared "occaFunction". Sorry about that.