Compile error when contracting two sparse matrices
When contracting two sparse matrices, the following code does not compile: A(i, k) = B(i, j) * C(j, k); Here A, B, and C are matrices that are sparse in all dimensions. The compiler reports an error in the generated function ‘assemble’: error: ‘pA02_begin’ undeclared (first use in this function)
This error depends on the order of the indices; for example, the following code works for me: A(i, k) = B(i, j) * C(k, j); // This works
A similar error also occurs for tensors with more indices.
I think this is a sparse-output bug. The generated compute() function compiles successfully, but assemble() fails: it uses the variable pA02_begin without declaring it.
The problematic part of assemble() looks like this:
if (pA2_begin < kA) {
  if (A1_crd_size <= iA) {
    A1_crd = (int32_t*)realloc(A1_crd, sizeof(int32_t) * (A1_crd_size * 2));
    A1_crd_size *= 2;
  }
  A1_crd[iA] = i;
  iA++;
}
The full generated function can be found here: https://gist.github.com/Infinoid/c64400c1d9e830d139afa41cbc3c708f
If the output format is changed to dense, the generated code compiles correctly.
In the case of sparse matrix multiplication, you can actually use the precompute scheduling command to insert a workspace, which would handle the case where all matrices are fully sparse:
#include <iostream>
#include "taco.h"
using namespace taco;

int main() {
  Format csr({Sparse, Sparse});
  Format csf({Sparse, Sparse});
  Format sv({Sparse, Sparse});
  Tensor<double> A({2, 3}, csr);
  Tensor<double> B({2, 3}, csf);
  Tensor<double> C({3, 3}, sv);
  B.insert({0, 0}, 1.0);
  C.insert({0, 0}, 4.0);
  B.pack();
  C.pack();
  IndexVar i, j, k;
  IndexExpr mul = B(i, j) * C(j, k);
  A(i, k) = mul;
  // Declare a one-dimensional dense workspace whose size matches A's
  // column dimension.
  TensorVar w(Type(A.getTensorVar().getType().getDataType(),
                   {A.getTensorVar().getType().getShape().getDimension(1)}),
              dense);
  // Reorder the loops and precompute the multiplication into the workspace.
  IndexStmt stmt = A.getAssignment().concretize()
                    .reorder({i, j, k})
                    .precompute(mul, k, k, w);
  A.compile(stmt);
  A.assemble();
  A.compute();
  std::cout << A << std::endl;
}
Note, though, that the precompute command currently only supports one-dimensional workspaces, so this approach might not work if you have higher-order tensors. @weiya711 is currently working on extending precompute to support arbitrary-order workspaces, which will enable taco to fully support contractions with sparse results. In the near future, we also plan to implement an autoscheduler that automatically applies these scheduling commands, so that taco always produces correct (if not necessarily fully optimal) code.
Thanks. Could we emit a better error message in cases where a workspace is necessary?