cuvs icon indicating copy to clipboard operation
cuvs copied to clipboard

[BUG] NN Descent iteration terminating early for small dataset and specific data dimension

Open jinsolp opened this issue 1 year ago • 0 comments

Describe the bug For small datasets (n_rows=100) NN Descent breaks out of the iteration loop early (after 1 iteration), and does not properly return update indices. Specifically, this behavior happens when graph_degree=64 (but is not an issue when graph_degree=32). After some investigation, it seems like the following if statement in update_and_sample (in raft/cpp/include/raft/neighbors/detail/nn_descent.cuh) become true immediately after 1 iteration.

if (update_counter_ < build_config_.termination_threshold * nrow_ *
                              build_config_.dataset_dim / counter_interval) {
    update_counter_ = -1;
}

It may be worth looking into the update_graph function (in the same file) and check when whether the update_counter is being incremented properly.

This is not an issue for the current test configuration which tests for large datasets - 1000, 2000 n_rows.

Steps/Code to reproduce bug Add the following prints to raft/cpp/test/neighbors/ann_nn_descent.cuh right before EXPECT_TRUE.

raft::print_host_vector("indices naive after build", indices_naive.data(), 5, std::cout);
raft::print_host_vector("indices nndescent after build", indices_NNDescent.data(), 5, std::cout);
EXPECT_TRUE(eval_recall(
        indices_naive, indices_NNDescent, ps.n_rows, ps.graph_degree, 0.001, min_recall));

Also, change the test parameters like below by adding 100 to n_rows

const std::vector<AnnNNDescentInputs> inputs = raft::util::itertools::product<AnnNNDescentInputs>(
  {100, 1000, 2000},                                              // n_rows
  {3, 5, 7, 8, 17, 64, 128, 137, 192, 256, 512, 619, 1024},  // dim
  {32, 64},                                                  // graph_degree
  {raft::distance::DistanceType::L2Expanded},
  {false, true},
  {0.90});

Run the test.

./cpp/build/gtests/NEIGHBORS_ANN_NN_DESCENT_TEST --gtest_filter=AnnNNDescentTest/AnnNNDescentTestI8_U32*

It can be seen that the indices are not updated properly.

indices naive=[0,21,50,60,87];
indices nndescent=[1,2,3,4,5];   // stays same

For distances (feature to be added by this PR) it also does not get updated and stays as initial FLOAT_MAX values.

dists naive after build=[0,1,13,14,27,35,35,50,56,62];
dists nndescent after build=[3.40282e+38,3.40282e+38,3.40282e+38,3.40282e+38,3.40282e+38,3.40282e+38,3.40282e+38,3.40282e+38,3.40282e+38,3.40282e+38];
indices naive after build=[0,2,49,82,61,9,67,32,13,97];
indices nndescent after build=[1,2,3,4,5,6,7,8,9,10];

Environment details (please complete the following information): build in devcontainers cuda12.2-conda

jinsolp avatar Jun 07 '24 21:06 jinsolp