[BUG] Convolution needs more unit tests appears broken. 2D needs to support arbitrary batching/ranks.
Convolution needs more unit tests for both 1D and 2D.
There are things that look like they are probably bugs but are not currently covered by unit tests.
For example:
throughout the code the x dim is the last dim (fastest changing):
if constexpr (d_in.Rank() == 4) { bdims[0] = blockIdx.z / d_in.Size(1); bdims[1] = blockIdx.z - (bdims[0] * d_in.Size(1)); ix_size = d_in.Size(3); iy_size = d_in.Size(2); } if constexpr (d_in.Rank() == 3) { ix_size = d_in.Size(2); iy_size = d_in.Size(1); } else if constexpr (d_in.Rank() == 2) { ix_size = d_in.Size(1); iy_size = d_in.Size(0); }
But then later when checking bounds we do it in reverse:
if ((threadIdx.x < d_filter.Size(1)) && (threadIdx.y < d_filter.Size(0))) { s_filter[d_filter.Size(1) * threadIdx.y + threadIdx.x] = d_filter(threadIdx.y, threadIdx.x); }
for (int x = 0; x < d_filter.Size(1); x++) {
if ((tid_x - static_cast
for (int y = 0; y < d_filter.Size(0); y++) {
if ((tid_y - static_cast<int>(d_filter.Size(0)) + 1 + y < 0) ||
(tid_y - static_cast<int>(d_filter.Size(0)) + 1 + y >= iy_size)) {
continue;
}
We should also take this as an opportunity to allow batching with > rank4 tensors and batch the filters too.