AutoComplete Unstable compareTo between FunctionCompletion and AbstractCompletion

Hello, i think there's a bug in the compareTo method of FunctionCompletion-class. Very seldom, i'm getting an exception when sorting my completions:

        java.lang.IllegalArgumentException: Comparison method violates its general contract!
	at java.util.ComparableTimSort.mergeHi(ComparableTimSort.java:866)

The problem is that FunctionCompletion.compareTo uses String.compareTo but AbstractCompletion uses String.compareToIgnoreCase.

I have reduced the problem to following code. I compare function "O" with function "f" and basic string "h". O < f according to FunctionCompletion (ascii compare), and f < h according to AbstractCompletion, but O > h according to AbstractCompletion (case insensitive). This breaks the sort contract.

My suggestion is to change the compareTo in FunctionCompletion to use compareToIgnoreCase.

package com.sigge.completion;

import org.fife.ui.autocomplete.BasicCompletion;
import org.fife.ui.autocomplete.FunctionCompletion;
import org.junit.Test;

public class CompletionCompareTest
{
    @Test
    public void test_compare()
    {
        FunctionCompletion O = new FunctionCompletion(null, "O", "String");
        FunctionCompletion f = new FunctionCompletion(null, "f", "string");
        BasicCompletion h = new BasicCompletion(null, "h");

        int compareTo = O.compareTo(f);
        int compareTo23 = f.compareTo(h);
        int compareTo13 = O.compareTo(h);
        if (compareTo < 0 && compareTo23 < 0 && compareTo13 > 0)
        {
            throw new RuntimeException("Unstable completion between: " + O + ", " + f + ", " + h);
        }
    }
}

Feb 04 '23 19:02 siggemannen

Any comments regarding this? :) I can perhaps do a PR instead

Jul 18 '24 09:07 siggemannen

I filed a defect like this before. The solution was to implement "unique case-insensitive ordering" in the comparison method. I offered the fix, but it was not accepted.

/**
 * Compare two strings in case insensitive way if the comparison does not
 * result in equality.  Otherwise, case sensitive comparison is performed.
 * This is to preserve unique sorting order of all string pairs, such
 * as between "us" and "US".
 *
 * A common usage is to pick this method as the comparator of a TreeMap object.
 *
 * Codepoints outside the BMP are supported.
 *
 * This method does not take into consideration the current locale.
 *
 * @param c1 the first string
 * @param c2 the second string
 */
public static int cmpUniqueIgnoreCase(final CharSequence c1, final CharSequence c2)
{
    int result = StrUtil.cmpIgnoreCase(c1, c2);
    // Return the case insensitive order if they are not equal.
    // "CAT" > "BAT"
    // "BAT" < "CAT"
    // "US" is equal to "us" case insensitively, but we need to
    // prevent an ordering that allowed "us, US" to inter-mix
    // with "US, us".
    if (result == 0)
    {
        // Compute the standard ordering between two strings.
        result = StrUtil.cmpWithCodePoint(c1, c2);
    }
    return result;
}

/**
 * Compare two strings in order of their code-points.
 * Codepoints outside the BMP are supported.
 *
 * {@code String.compareTo()} may return unexpected results when
 * the arguments contain characters outside the Basic Multilingual Plane.
 *
 * This method does not take into consideration the current locale.
 *
 * @param text1 The first string
 * @param text2 The second string
 */
public static int cmpWithCodePoint(final CharSequence text1, final CharSequence text2)
{
    int result = 0;
    if (text1 != text2)
    {
        final int text1Len = text1.length();
        final int text2Len = text2.length();
        int text1Ptr = 0;
        int text2Ptr = 0;
        for (;;)
        {
            if (text1Ptr >= text1Len)
            {
                // both ran out : 0
                // only text1 ran out : -1
                result = (text2Ptr >= text2Len) ? 0: -1;
                break;
            }
            if (text2Ptr >= text2Len)
            {
                result = 1;
                break; // text 2 ran out; test1 did not ran out
            }
            final int s1 = Character.codePointAt(text1, text1Ptr);
            final int t1 = Character.codePointAt(text2, text2Ptr);
            result = s1 - t1;
            if (result != 0)
            {
                break;
            }
            text1Ptr += Character.charCount(s1);
            text2Ptr += Character.charCount(t1);
        }
    }
    return result;
}

/**
 * Compare two strings in case-insensitive code-point comparison.
 * Codepoints outside the BMP are supported.
 *
 * {@code String.compareTo()} may return unexpected results when
 * the arguments contain characters outside the Basic Multilingual Plane.
 *
 * This method does not take into consideration the current locale.
 *
 * @param text1 The first string
 * @param text2 The second string
 */
public static int cmpIgnoreCase(final CharSequence text1, final CharSequence text2)
{
    int result = 0;
    if (text1 != text2)
    {
        final int text1Len = text1.length();
        final int text2Len = text2.length();
        int text1Ptr = 0;
        int text2Ptr = 0;
        for (;;)
        {
            if (text1Ptr >= text1Len)
            {
                // both ran out : 0
                // only text1 ran out : -1
                result = (text2Ptr >= text2Len) ? 0: -1;
                break;
            }
            if (text2Ptr >= text2Len)
            {
                result = 1;
                break; // text 2 ran out; test1 did not ran out
            }
            final int s1 = Character.codePointAt(text1, text1Ptr);
            final int t1 = Character.codePointAt(text2, text2Ptr);
            if (s1 != t1)
            {
                final int s2 = Character.toUpperCase(s1);
                final int t2 = Character.toUpperCase(t1);
                if (s2 != t2)
                {
                    result = Character.toLowerCase(s2) -
                        Character.toLowerCase(t2);
                    if (result != 0)
                    {
                        break;
                    }
                }
            }
            text1Ptr += Character.charCount(s1);
            text2Ptr += Character.charCount(t1);
        }
    }
    return result;
}

Mar 29 '25 04:03 tttwang23

Now this is the prefix text filtering code for a TreeMap sorted with unique case-insensitive ordering. These source code were released under the Apache License Version 2.0 as part of Project Cash Flow on SourceForge.

/**
 * Return all names from a TreeMap key/value mapping whose key string
 * starts case insensitively with the <i>prefix</i> string.
 *
 * An important requirement is for the TreeMap to be sorted
 * using case insensitive comparison with tie-breaker being decided
 * using case sensitive comparison. (Implemented by NameOrderComparator)
 *
 * @param map A TreeMap object
 * @param prefix A prefix string for starts with case insensitive matching
 *
 * @return An ArrayList of matched strings
 */
public static <T> ArrayList<String> filterTreeMapNames(
    TreeMap<String,T> map, String prefix)
{
    // Unique case insensitive order for a single alphabetic letter
    // has the following ordering:
    // AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz
    //
    // Suppose the prefix is letter 'f', then a head map of decending
    // iterator would return potentially 'F' (case-insensitive match),
    // then 'e' (mismatch).
    // We could stop at 'e' because cursor has already gone out of range.
    // Similarly, a tail map would return
    // potentially 'f' (case-insensitive match), then 'G' (mismatch).
    // We could stop at 'G' because cursor has already gone out of range.
    //
    ArrayList<String> filteredBox = new ArrayList<String>();
    if (! map.isEmpty())
    {
        {
            // Look at the keys less than the target key.
            NavigableMap<String,T> headMap = map.headMap(prefix, false);
            // Search for prefix in descending direction.
            Iterator<String> it = headMap.navigableKeySet().descendingIterator();
            while (it.hasNext())
            {
                String name = it.next();
                // Search for prefix.
                if (StrUtil.startsWithIgnoreCase(name, prefix))
                {
                    // completion hit
                    filteredBox.add(name);
                }
                else
                {
                    // value too low - beyond the test range
                    break;
                }
            }
        }
        // Reverse the descending added keys.
        Collections.reverse(filteredBox);
        {
            // Look at keys higher or equal to the target key.
            SortedMap<String,T> tailMap = map.tailMap(prefix);
            for (String name : tailMap.keySet())
            {
                if (StrUtil.startsWithIgnoreCase(name, prefix))
                {
                    filteredBox.add(name);
                }
                else
                {
                    // value too high - beyond the test range
                    break;
                }
            }
        }
    }
    return filteredBox;
}

/**
 * Returns whether <i>str</i> starts with <i>prefix</i>,
 * ignoring case.
 *
 * Codepoints outside the BMP are supported.
 *
 * This method does not take into consideration the current locale.
 *
 * @param str The string to check.
 * @param prefix the specified prefix
 * @return Whether <i>str</i> starts with <i>prefix</i>, ignoring case.
 */
public static boolean startsWithIgnoreCase(
     final CharSequence str,  final CharSequence prefix)
{
    final int prefixLen = prefix.length();
    final int strLen = str.length();
    boolean result = false;
    if (strLen >= prefixLen)
    {
        int prefixPtr = 0;
        int strPtr = 0;
        result = true;
        while (prefixPtr < prefixLen)
        {
            if (strPtr >= strLen)
            {
                result = false; // str ran out
                break;
            }
            final int s1 = Character.codePointAt(prefix, prefixPtr);
            final int t1 = Character.codePointAt(str, strPtr);
            if (s1 != t1)
            {
                final int s2 = Character.toUpperCase(s1);
                final int t2 = Character.toUpperCase(t1);
                if (s2 != t2)
                {
                    if (Character.toLowerCase(s2) !=
                        Character.toLowerCase(t2))
                    {
                        result = false; // mismatch
                        break;
                    }
                }
            }
            prefixPtr += Character.charCount(s1);
            strPtr += Character.charCount(t1);
        }
    }
    return result;
}

Mar 29 '25 05:03 tttwang23

Yeah, that might be an idea. Although, i suspect it will still lead to errors because of unstable sorting properties, for example us vs US will be sorted as less in one case and equal in another. Also, perhaps one does want case insensitive functions, for example in languages like C or java, so maybe my solution was too simplistic. @bobbylight how about using Legacy sort if one has this problem? Right now my solution is to reimplement the functioncompletition in a patch which doesn't sound too nice. Perhaps a flag on completitionProvider?

Mar 29 '25 08:03 siggemannen

Unique case-insensitive comparison is almost the same as case-insensitive comparison in user interface design. At least I cannot see anything strange in autocompleting fields in Cash Flow. "US" is always less than "us" in unique case-insensitivity, but equal in traditional case-insensitive comparison. Mathematically speaking if x == y, then f(x) must be identical to f(y). Obviously f("us") could be different than f("US"). Thus you have to use unique case-insensitive comparison in a SortedMap just to avoid internal inconsistencies or constraint violations.

Mar 29 '25 17:03 tttwang23