Sam Gross
I mean a tensor where `pImpl` is `NULL` (i.e. `tensor.defined()` returns `false`). This code will segfault:
```
Tensor matrix = CPU(kFloat).ones({1, 1, 1});
matrix.btrifact();
```
We support GCC, clang, and MSVC. We can't rely on LTO.
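To illustrate the constraint: without LTO, a function can only be inlined into a caller if its body is visible in the caller's translation unit, which in practice means defining it in a header. A minimal sketch of a portable force-inline macro covering the three compilers is below. The macro name `SKETCH_ALWAYS_INLINE` and the function `scale_and_shift` are hypothetical, chosen only for this example, and are not taken from SLEEF or PyTorch.

```cpp
// Hypothetical macro name, for illustration only.
#if defined(_MSC_VER)
// MSVC (and clang-cl, which also defines _MSC_VER)
#define SKETCH_ALWAYS_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
// GCC and clang
#define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))
#else
// Fallback: plain inline is only a hint to the compiler
#define SKETCH_ALWAYS_INLINE inline
#endif

// Defined in a header so every translation unit that includes it
// sees the body and can inline it without link-time optimization.
SKETCH_ALWAYS_INLINE float scale_and_shift(float x, float scale, float shift) {
  return x * scale + shift;
}
```

A caller in any `.cpp` file that includes this header can have `scale_and_shift` inlined into its loops regardless of whether LTO is enabled.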
> Would adding sigmoid() to SLEEF itself help your specific case? Or do you have many functions you would like to add?

There are many other functions. (Some are listed [here](https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity))...
@shibatch this may be useful for other people, but it doesn't solve the problem I wrote about. The problem is the call to `Sleef_expf8_u10avx2`. MKL-style APIs do not compose well...
VML-style functions are not a good solution for us because they don't compose well. We have potentially many functions (and their derivatives) that we may want to vectorize. `sigmoid` was...
Are those examples intended to show how you would write the functions as part of the Sleef library? Or as a user of the Sleef library? I'm not interested in...
> Is the cost of a function call really a problem? I think inlining the function does not speed up the execution so much.

In my experience, the non-inlineable call...
@shibatch Would you be willing to accept a patch that moves SIMD functions (sleefsimdsp, sleefsimddp) into a header file? I have tried this strategy out here: https://github.com/colesbury/sleef/tree/sleef_header. The header...
@shibatch @fpetrogalli thoughts?
Hi @Rajendra1308, thanks for your interest but I'm already working on this.