suffixarray icon indicating copy to clipboard operation
suffixarray copied to clipboard

Possible edge case where suffixArray does not give a suffix array.

Open vchoo opened this issue 6 years ago • 0 comments

This was the fastest Javascript implementation of a suffix array construction algorithm I could find. All seemed well - and then, to my surprise, I found this bug (I think it's a bug) as I was doing some testing.

From what I understand of suffix arrays, the array should give me a lexicographical ordering of the suffixes in a string. That is:

function checkSuffixArrayOutput(str) {
    var arr = suffixArray(str);
    for (var i = 1; i < arr.length; i++)
        if (str.substring(arr[i-1]) >= str.substring(arr[i]))
            return false;
    return true;
}

Should always return true for any given string.

However, for the test case string a m.***b***9.4.3**ions require it.***, checkSuffixArrayOutput() returns false.


More specifically, the output I'm getting when I use this suffixArray is that if I run the following code in NodeJS v8.10.0:

> var test = 'a m.***b***9.4.3**ions require it.***';
undefined
> // gets the suffixes of test, which should be in lexicographical order
> var suffixes = suffixArray(test).map(x => test.substring(x));
undefined
> suffixes[5] < suffixes[6]
false
> suffixes[5]
'***b***9.4.3**ions require it.***'
> suffixes[6]
'***'

To anyone fixing this bug, if it helps, for this particular test case, I think the bug occurs during the first call of _suffixArray() in the source code (I believe the single recursive call during the first call gives me the correct lexicographical ordering).

vchoo avatar Apr 21 '19 04:04 vchoo