Label not random when k=1 and multiple nodes within distance

Open TheColorman opened this issue 3 years ago • 1 comments

In a situation where 2 different nodes are the same distance from the test node, the returned label is always the last one.
For example:

import KNN from 'ml-knn';

var train_dataset = [
    [1, 1, 1],
    [1, 1, 1],
    [0, 0, 0],
];
var train_labels = ["A", "B", "C"];
var knn = new KNN(train_dataset, train_labels, { k: 1 });

var test_data = [1, 1, 1];

var ans = knn.predict(test_data);

console.log(ans);
// > B

There are 2 nodes with the same coordinates, $(1, 1, 1)$, and the test node has those coordinates too, so both are at distance 0. The expected behavior would be to choose randomly between node A and node B, but instead B is always chosen. If the labels A and B are switched, the output is A every time.
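
For illustration, here is a minimal sketch of the tie-breaking behavior described above. It assumes a brute-force scan over the training points rather than the k-d tree the library uses, and it only covers the k = 1 case. The names `squaredDistance` and `predictNearestWithRandomTies`, and the injectable `random` parameter, are hypothetical and not part of `ml-knn`.

```javascript
// Hypothetical helper, not part of ml-knn: squared Euclidean distance.
function squaredDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Hypothetical helper, not part of ml-knn: brute-force 1-NN that picks
// uniformly at random among all training points tied for the minimum distance.
// `random` is injectable so the choice can be made deterministic in tests.
function predictNearestWithRandomTies(dataset, labels, point, random = Math.random) {
  let best = Infinity;
  let candidates = [];
  for (let i = 0; i < dataset.length; i++) {
    const dist = squaredDistance(dataset[i], point);
    if (dist < best) {
      // Strictly closer point found: discard earlier candidates.
      best = dist;
      candidates = [labels[i]];
    } else if (dist === best) {
      // Exact tie: keep this label as an additional candidate.
      candidates.push(labels[i]);
    }
  }
  // Returns undefined for an empty dataset.
  return candidates[Math.floor(random() * candidates.length)];
}

const dataset = [
  [1, 1, 1],
  [1, 1, 1],
  [0, 0, 0],
];
const labels = ["A", "B", "C"];

// Prints either "A" or "B", varying between runs.
console.log(predictNearestWithRandomTies(dataset, labels, [1, 1, 1]));
```

The scan is O(n) per query, so it gives up the speed of a tree-based lookup. The exact-equality tie check is reliable here because the tied points are identical; distances that are only nearly equal in floating point would need a tolerance.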

TheColorman, Nov 12 '22 18:11

I realise this may be a limitation of the k-d tree used for the nearest-neighbour search.

TheColorman, Nov 12 '22 18:11

Closing as this is an issue with kd-tree (#32).

TheColorman, Dec 23 '22 22:12