dace Unexpected results when having symbolic shaped arrays and negative indices

Describe the bug Got unexpected results when having symbolic shaped arrays and negative indices.

To Reproduce

import numpy as np
import dace

M = dace.symbol('M')
N = dace.symbol('N')

@dace.program
def test(q:dace.float64[M, N]):
    for i in range(0,M):
        q[i, 0] = q[-i-1, 1]

test(np.zeros((5, 5)))
print(q)

Expected behavior The result matrix (especially first column) should be all zeros.

Aug 11 '22 20:08 huanglangwen

Runtime negative indices in numpy arrays are not currently supported: https://spcldace.readthedocs.io/en/latest/frontend/parsing.html#main-limitations

The main argument behind this is that adding runtime checks for every indexed expression will slow down the generated code and disallow many optimizations the compiler can perform.

Aug 12 '22 08:08 tbennun

Is there any bound check in debug mode? Because having size signatures always give me an illusion that out of bound indexing are checked.

Aug 12 '22 10:08 huanglangwen

there is out-of-bounds checking in SDFG validation, but these for loops are only evaluated at runtime, so those are harder to check (not impossible, but we don't do so by default)

Aug 12 '22 11:08 tbennun

Is there any bound check in debug mode? Because having size signatures always give me an illusion that out of bound indexing are checked.

As Tal hinted above, the SDFG syntax does not support negative indices. Any support for negative indices is done at the frontend level by converting to C-compatible indices when possible.

There is a bound check during validation. I guess it doesn't trigger for some reason, but this wouldn't necessarily help you, even if it did. I suppose you would like to have a warning that this expression may not work. Ideally, we would automatically convert all terms to be C-compatible, e.g., M-i-1, but this is hard with our current symbolic infrastructure. Consider the expression -i+2, which will take both positive and negative values. Therefore, you need a much more complicated expression for C, which will probably cause validation and performance issues. I will check if we can throw a warning, but this might also be a hit-and-miss due to our symbolic infrastructure.

Aug 12 '22 11:08 alexnick83

We could insert runtime bound check assertion to every indexing node when debug flag is set. Or bound check indices whenever the sdfg is called with new values of symbols.

Aug 12 '22 12:08 huanglangwen

We could insert runtime bound check assertion to every indexing node when debug flag is set. Or bound check indices whenever the sdfg is called with new values of symbols.

It sounds pretty interesting. Would you like to give it a try and make a PR for it?

Aug 12 '22 12:08 alexnick83

I can give it a try. Can I have some hints where to get started?

Aug 12 '22 13:08 huanglangwen

I think it would be a good idea to start with something simple (like the code above) on CPU. Depending on how the final SDFG looks like, you should check the following methods of the CPU CodeGenerator:

For direct copies among data (access nodes): copy_memory
For assignments or copies through a tasklet: _generate_Tasklet

You could put debug points near the beginning of those methods and follow them to see how to code is generated. Then you can try inserting your code that checks the access bounds.

I am not sure if this can be of any help, but if you are interested in the general flow of code generation, i.e., the call tree, I had written an initial draft in the past that may be a bit outdated now, but close enough to figure out the basics:

Aug 12 '22 14:08 alexnick83