net.akehurst.language Completions may be incorrect or missing and throw ClassCastExceptions in v4.2.1.21

Consider this very simple grammar:

namespace org.example.brackets.agl

grammar Brackets {
	sentence = '[' 'x' ']';
	
	skip WHITESPACE = "\s+";
}

And this Java program that uses it to get completions on some example strings:

import net.akehurst.language.agl.Agl;
import net.akehurst.language.agl.simple.ContextWithScope;
import net.akehurst.language.api.processor.CompletionItem;
import net.akehurst.language.api.processor.LanguageProcessor;
import net.akehurst.language.asm.api.Asm;

import java.io.InputStream;
import java.util.List;
import java.util.Scanner;

import static java.nio.charset.StandardCharsets.UTF_8;

public class BracketsSuggestions {
	private static final String GRAMMAR_STRING = readResource(BracketsSuggestions.class.getResourceAsStream("Brackets.agl"));
	private static final LanguageProcessor<Asm, ContextWithScope<Object, Object>> LANGUAGE_PROCESSOR = Agl.INSTANCE.processorFromStringSimpleJava(
			GRAMMAR_STRING, null, null, null, null, null,
			Agl.INSTANCE.configurationSimple(), null).getProcessor();
	
	public static void main(String[] args) {
		System.out.print("Initial suggestions:          ");
		printSuggestions("", 0);
		System.out.print("Suggestions after '[':        ");
		printSuggestions("[", 1);
		System.out.print("Suggestions after '[ ':       ");
		printSuggestions("[ ", 2);
		System.out.print("Suggestions after '[x':       ");
		printSuggestions("[x", 2);
		System.out.print("Suggestions within '[ ':      ");
		printSuggestions("[ ", 1);
		System.out.print("Suggestions within '[]':      ");
		printSuggestions("[]", 1);
		System.out.print("Suggestions within '[  ]':    ");
		printSuggestions("[  ]", 3);
		System.out.print("Suggestions after '[x]':      ");
		printSuggestions("[x]", 3);
		System.out.print("Suggestions after '[x] ':     ");
		printSuggestions("[x] ", 4);
	}
	
	private static String readResource(InputStream resourceStream) {
		return new Scanner(resourceStream, UTF_8).useDelimiter("\\A").next();
	}
	
	private static void printSuggestions(String str, int position) {
		System.out.println(String.join(", ", getSuggestions(str, position)));
	}
	
	private static List<String> getSuggestions(String str, int position) {
		try {
			return LANGUAGE_PROCESSOR.expectedItemsAt(str, position, LANGUAGE_PROCESSOR.optionsDefault()).getItems().stream().map(CompletionItem::getText)
					.toList();
		} catch (ClassCastException e) {
			e.printStackTrace();
			return List.of();
		}
	}
}

Hopefully this is an appropriate way to initialise the LanguageProcessor; I don't know if I should be providing more parameters or using some other configuration.

This is the kind of output I would expect ("[x]" would be acceptable in place of "[", as would "x]" in place of "x"; indeed, that may even be preferable):

Initial suggestions:          [
Suggestions after '[':        x
Suggestions after '[ ':       x
Suggestions after '[x':       ]
Suggestions within '[ ':      x
Suggestions within '[]':      x
Suggestions within '[  ]':    x
Suggestions after '[x]':      
Suggestions after '[x] ':

But here is the actual output:

Initial suggestions:          [ x ], [
Suggestions after '[':        [ x ], [
Suggestions after '[ ':       x
Suggestions after '[x':       
Suggestions within '[ ':      [ x ], [
Suggestions within '[]':      [ x ], [
Suggestions within '[  ]':    x
Suggestions after '[x]':      
Suggestions after '[x] ':

Note the clearly incorrect suggestion of "[" after an existing "[" and the missing suggestions of "]" and "x". The suggestion of "[ x ]" after a "[" also appears incorrect, although it could be interpreted as suggesting the existing string be replaced by "[ x ]".

The problem may be related to whitespace; whitespace appears to be expected between each literal, although the grammar doesn't require it (parsing works as expected). But removing the line from the grammar that makes whitespace skippable results in almost identical suggestions (still incorrect and lacking).

ClassCastException

Finally, note that I have to catch ClassCastException in the program above, because with another grammar, like the one further below, it throws this exception (for every attempt to get completions except the last one):

java.lang.ClassCastException: class net.akehurst.language.grammar.asm.SimpleListDefault cannot be cast to class net.akehurst.language.grammar.api.TangibleItem (net.akehurst.language.grammar.asm.SimpleListDefault and net.akehurst.language.grammar.api.TangibleItem are in unnamed module of loader 'app')
	at net.akehurst.language.agl.completionProvider.SpineDefault.expectedNextLeafNonTerminalOrTerminal_delegate$lambda$1(CompletionProviderAbstract.kt:69)
	at kotlin.SynchronizedLazyImpl.getValue(LazyJVM.kt:83)
	at net.akehurst.language.agl.completionProvider.SpineDefault.getExpectedNextLeafNonTerminalOrTerminal(CompletionProviderAbstract.kt:64)
	at net.akehurst.language.agl.simple.CompletionProviderSimple.provide(CompletionProviderSimple.kt:66)
	at net.akehurst.language.agl.processor.LanguageProcessorAbstract.expectedItemsAt(LanguageProcessorAbstract.kt:296)
	at BracketsSuggestions.getSuggestions(BracketsSuggestions.java:50)
	at BracketsSuggestions.printSuggestions(BracketsSuggestions.java:45)
	at BracketsSuggestions.main(BracketsSuggestions.java:21)

Grammar causing the exception:

namespace org.example.bracketscomplex.agl

grammar BracketsComplex {
	sentence = word | bracketed;
	bracketed = '[' word* ']';
	word = NAME;
	
	leaf NAME = NAME_CHAR+;
	leaf NAME_CHAR = "[-.0-9]";
	
	skip WHITESPACE = "\s+";
}

Aug 03 '25 08:08 willplatt

Thank you for the extensive feedback, and clear description of the issue.

There seem to be two issues:

1. The result of 'expectedItemsAt' is not what you are expecting.
2. There is a ClassCastException.

I will first address (1). Consider a simple variation of your first grammar:

namespace org.example.brackets.agl

grammar Brackets {
    sentence = '[' 'uvwxyz' ']';
    
    skip WHITESPACE = "\s+";
}

What would you expect to see as CompletionItems at the end of the sentence '[ uv'?

The function is designed to give you the completion of the terminal that is partially complete, so it provides 'uvwxyz' as a completion item.

Now, due to the complexities of the algorithm, at the end of the terminal, i.e. at the end of the sentence '[ uvwxyz', it does not know whether the string uvwxyz is a finished terminal or part of something else (perhaps I could improve this with some additional investigation, but it is non-trivial).

Thus the completion items you are given also include the terminal immediately to the left. If a skip (whitespace) terminal follows it, then the algorithm knows the terminal is finished.

Thus,

'[ uv' will give the completion 'uvwxyz'
'[ uvwxyz' will also include the completion 'uvwxyz'
'[ uvwxyz ' will not include 'uvwxyz' due to the whitespace.

You can remove the unexpected just-completed terminal by checking backwards in the sentence until a skip terminal is found. From that point, you can check whether any offered completion is already matched going forward and, if so, remove it.
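
A minimal sketch of that filtering step, assuming the skip rule is plain whitespace and that the offered completions are available as plain strings (for example via CompletionItem::getText). The class and method names are hypothetical and not part of AGL.

```java
import java.util.List;

/** Hypothetical helper, not part of AGL: drops completions that are already fully typed. */
final class JustCompletedFilter {

    /** Walks backwards from position; returns the index just after the nearest whitespace, or 0. */
    static int fragmentStart(String sentence, int position) {
        int i = position;
        while (i > 0 && !Character.isWhitespace(sentence.charAt(i - 1))) {
            i--;
        }
        return i;
    }

    /** Removes every offered completion that is already matched from the fragment start forward. */
    static List<String> dropJustCompleted(String sentence, int position, List<String> offered) {
        String fragment = sentence.substring(fragmentStart(sentence, position), position);
        return offered.stream()
                // Keep empty texts untouched; drop texts the fragment already begins with.
                .filter(text -> text.isEmpty() || !fragment.startsWith(text))
                .toList();
    }

    public static void main(String[] args) {
        // 'uvwxyz' is only partially typed here, so it stays in the list.
        System.out.println(dropJustCompleted("[ uv", 4, List.of("uvwxyz")));
        // 'uvwxyz' is fully typed here, so it is dropped and only ']' remains.
        System.out.println(dropJustCompleted("[ uvwxyz", 8, List.of("uvwxyz", "]")));
    }
}
```

Because the search only stops at whitespace, an input without spaces such as '[x' is treated as one fragment starting at '[', so this sketch can only remove a completion that matches from the very start of that fragment.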

My DSL Editor will do this (https://github.com/dhakehurst/net.akehurst.language.editor)

I will look into whether it is possible to do this in general and include it by default, however currently, in order to be able to offer the completion for 'partial' terminals, it is included.

Nov 07 '25 16:11 dhakehurst

Though, having just said all that and written some more tests around it (i.e. trying out your second grammar), I see there is some inconsistency between literals and patterns as terminals. I will do my best to find time to fix the inconsistency.

Nov 07 '25 16:11 dhakehurst

Regarding (2), the ClassCastException, I do not get this in the latest code base. Perhaps it is something I already fixed. Which version of AGL are you using?

Nov 07 '25 16:11 dhakehurst

Hi David, thanks for looking into this.

I see, so my expectations were a bit beyond what you have so far intended the suggestions to do. The grammar I am actually using in my software is along the lines of the second grammar I included, without many fixed keywords (like 'uvwxyz' in your example), so the completion of partially typed literals is unfortunately not that useful to me. But even suggestions for those keywords don't seem to be working for me, for whatever reason.

What I could really use are suggestions of what rules the next character may fall under (e.g. word, NAME_CHAR, ']'), and I can translate those into suggestions for the user (e.g. words that have special meanings or that match what the user has typed so far, or simply ']'). Ideally, each suggestion from AGL would also tell me whether it's a continuation of the last character's rule or not (i.e. a different rule or new application of the same rule), but usually I would be able to work this out.
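
A minimal sketch of that translation step, assuming the expected rule names and literals arrive as plain strings. The class, the method, and the per-rule vocabulary map are all hypothetical and only illustrate the idea; nothing here is AGL API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: turns expected rule names into user-facing suggestions. */
final class RuleSuggestionTranslator {

    /**
     * For each expected name: if it is a known rule, offer the words from that rule's
     * vocabulary that start with what the user has typed so far; otherwise treat the
     * name as a literal (such as "]") and offer it unchanged.
     */
    static List<String> translate(List<String> expected, String typedPrefix,
                                  Map<String, List<String>> vocabularyByRule) {
        List<String> result = new ArrayList<>();
        for (String name : expected) {
            List<String> vocabulary = vocabularyByRule.get(name);
            if (vocabulary == null) {
                result.add(name); // literal, offered as-is
            } else {
                for (String word : vocabulary) {
                    if (word.startsWith(typedPrefix)) {
                        result.add(word);
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed vocabulary of special words for the 'word' rule (NAME_CHAR is [-.0-9]).
        Map<String, List<String>> vocabulary = Map.of("word", List.of("10", "1.5", "2-0"));
        System.out.println(translate(List.of("word", "]"), "1", vocabulary));
    }
}
```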

I was getting the ClassCastException with the latest release, v4.2.1.21.

I'm still getting some use out of the suggestions in my software by accounting for their quirks, although the range and specificity of the suggestions I'm able to offer could be better and occasionally they can be incorrect.

Nov 07 '25 22:11 willplatt

OK, I'm working on a new release (before Xmas, with luck). The exception is no longer an issue; I tested your example. I may be able to add a flag that indicates whether it is a 'terminal-completion'. Good idea.

The rule indication may also be possible, I do something like that already.

Nov 08 '25 08:11 dhakehurst

Sounds great, thanks!

Nov 08 '25 10:11 willplatt

I'll aim for a result along these lines:

interface ExpectedAtResult {
    /**
     * The position in the sentence that the completions were requested for.
     */
    val requestedPosition: Int
    
    /**
     * The position in the sentence that the completions are being offered for.
     */
    val offeredPosition: Int
    
    /**
     * true if the offered completions are for completing a terminal.
     */
    val isTerminalCompletion: Boolean
    
    /**
     * The number of characters from the requested position to the offered position.
     */
    val offset: Int

    /**
     * the completion items being offered
     */
    val items: List<CompletionItem>
    
    /**
     * Any issues found whilst trying to provide completions.
     */
    val issues: IssueCollection<LanguageIssue>
}

Nov 10 '25 08:11 dhakehurst

> What I could really use are suggestions of what rules the next character may fall under

An indication of this is given with the 'label' text. However, I agree more could be useful. I will try adding a 'detail' property to CompletionItem that can contain the 'RuleItem' the suggestion is offered for. It will sometimes be null, and may change in future releases. From a RuleItem you should be able to navigate to its owningRule.

Nov 10 '25 09:11 dhakehurst

There are also some ParseOption settings that might help you get what you want:

    /**
     * A regular expression pattern used for identifying the start of a word
     * while looking for the nextExpected Tokens
     * (in the reverse direction, i.e. a sentence position going backwards towards the start of the sentence).
     *
     * This allows for handling of partial-words. The 'next-expected' could be partially written/covered by a position.
     *
     * This property is optional and can be set to null, in which case the parser
     * may resort to a default behavior or other mechanisms for determining word boundaries.
     *
     * The Regex should identify the first character (i.e. not part of a word) after which the search for the next expected tokens should start.
     *
     * Default value: `null`.
     */
    var reverseFindWordStartRegex: String?

    /**
     * Indicates whether the parser should use the skip rules of the grammar to determine a word's start position
     * while looking for the nextExpected Tokens
     * (in the reverse direction, i.e. a sentence position going backwards towards the start of the sentence).
     *
     * Default value: `true`.
     */
    var reverseFindWordStartBySkipRules: Boolean
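
To illustrate the idea described for reverseFindWordStartRegex, here is a minimal sketch of a backwards search for a word start. It is an assumption about how such a search could work, not AGL's implementation; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration of a reverse word-start search; not AGL code. */
final class ReverseWordStart {

    /**
     * Scans backwards from position and returns the index just after the first
     * character that matches boundaryRegex (a character that is not part of a word),
     * or 0 if no such character is found.
     */
    static int findWordStart(String sentence, int position, String boundaryRegex) {
        Pattern boundary = Pattern.compile(boundaryRegex);
        for (int i = position - 1; i >= 0; i--) {
            String ch = String.valueOf(sentence.charAt(i));
            if (boundary.matcher(ch).matches()) {
                return i + 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Treating '[' as well as whitespace as a boundary finds the word start without a space.
        System.out.println(findWordStart("[12", 3, "[\\[\\s]"));
        // With whitespace only, the search stops at the space.
        System.out.println(findWordStart("[ 12", 4, "\\s"));
    }
}
```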

Nov 10 '25 10:11 dhakehurst

If you're able to provide what rule is being used for the CompletionItem and whether it's a rule already begun then I think that would be both more general and specific than isTerminalCompletion.

Nov 10 '25 14:11 willplatt

> provide what rule is being used for the CompletionItem

That would be the 'detail' property of a CompletionItem that I mentioned above.

Nov 10 '25 14:11 dhakehurst

Okay, and I'd be able to tell if the suggestion is for finishing a rule already started because the offeredPosition would be before the requestedPosition?

Nov 10 '25 14:11 willplatt

    override val isTerminalCompletion: Boolean get() = requestedPosition != offeredPosition
    override val offset: Int get() = requestedPosition - offeredPosition

Nov 10 '25 15:11 dhakehurst

So isTerminalCompletion will be true for the completion of any rule already started? I thought it would just be for simple strings and maybe regular expressions.

Nov 10 '25 20:11 willplatt

Just for terminals that are partially complete. The offset should indicate where the terminal starts. It is not for rules.
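
A minimal sketch of how the two positions could be used to apply a terminal completion, assuming offeredPosition marks the start of the partially typed terminal and is never after requestedPosition. The record below only mirrors the proposed properties for illustration; it is hypothetical and not the released API.

```java
/** Hypothetical illustration of using requestedPosition and offeredPosition; not AGL code. */
final class TerminalCompletionApplier {

    /** Mirrors the two positions and the two derived properties proposed above. */
    record Positions(int requestedPosition, int offeredPosition) {
        boolean isTerminalCompletion() {
            return requestedPosition != offeredPosition;
        }

        int offset() {
            return requestedPosition - offeredPosition;
        }
    }

    /** Replaces the text between offeredPosition and requestedPosition with the completion. */
    static String apply(String sentence, Positions p, String completionText) {
        return sentence.substring(0, p.offeredPosition())
                + completionText
                + sentence.substring(p.requestedPosition());
    }

    public static void main(String[] args) {
        // 'uv' was typed from index 2 to 4, so the completion replaces those two characters.
        Positions p = new Positions(4, 2);
        System.out.println(p.isTerminalCompletion() + " " + p.offset());
        System.out.println(apply("[ uv", p, "uvwxyz"));
    }
}
```

When the two positions are equal, nothing has been typed for the terminal yet and the same method simply inserts the completion at the requested position.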

Nov 11 '25 07:11 dhakehurst

A new version is released; let me know if that is closer to what you want. It won't be completely.

Nov 20 '25 19:11 dhakehurst

Thanks, that was very quick! I was about to try it now but found the agl-processor-jvm8 dependency I use wasn't updated in any of the Maven repositories yet. Is there a reason for that?

Nov 21 '25 23:11 willplatt

https://repo1.maven.org/maven2/net/akehurst/language/agl-processor-jvm/4.2.2.20/

Sorry, the artifact name has changed to simply 'agl-processor-jvm'.

Nov 23 '25 13:11 dhakehurst

I've tried using that instead, but then I can't locate any of the classes I've been using like Asm, LeafData, and CompletionItem. Is there another dependency I need to use in combination with this?

Nov 23 '25 19:11 willplatt

As I assume you are using maven or gradle or similar, other dependencies should be transitively resolved.

I do regularly tidy up the namespaces (packages), so a package may have changed. I would expect that the IDE you are using should be able to automatically find them.

e.g. those you listed:

net.akehurst.language.asm.api.Asm
net.akehurst.language.sppt.api.LeafData
net.akehurst.language.api.processor.CompletionItem

Nov 24 '25 07:11 dhakehurst

It turns out I needed to update IntelliJ a bunch of times so that the Kotlin plugin could update and support this later version of Kotlin.

I've been trying this new release in my application, and while it seems like it should be giving me quite a bit more power, I've not really been able to translate it into better end-user suggestions yet. I still have to add spaces to get suggestions, and I'm being offered incorrect suggestions while suggestions I would expect are missing. I'll try to test with a simple grammar like before so I can understand what's going on and demonstrate where I think it could be improved.

Nov 24 '25 23:11 willplatt

So the original Brackets grammar I used as an example now works as expected for the inputs with spaces. The ones without spaces I believe I can make workarounds for, so it's not so bad that it fails those.

ClassCastExceptions also appear to have been fixed, meaning I'm able to get useful output with the second example grammar I used, BracketsComplex. Unfortunately, however, some suggestions are incorrect or missing.

Expected output:

Initial suggestions:          <bracketed>, <word>, [
Suggestions after '[':        <word>, ]
Suggestions after '[ ':       <word>, ]
Suggestions after '[x':       <word>, <word>_continuation, ]
Suggestions after '[x ':       <word>, ]
Suggestions within '[ ':      <word>, ]
Suggestions within '[]':      <word>
Suggestions within '[  ]':    <word>
Suggestions after '[x]':      
Suggestions after '[x] ':

Actual output:

Initial suggestions:          <bracketed>, <word>, [, <NAME>
Suggestions after '[':        
Suggestions after '[ ':       <word>, ], <NAME>
Suggestions after '[x':       
Suggestions after '[x ':       
Suggestions within '[ ':      
Suggestions within '[]':      
Suggestions within '[  ]':    <word>, ], <NAME>
Suggestions after '[x]':      
Suggestions after '[x] ':

The important differences here are that AGL does not suggest any continuation (of the word already started) after '[x', does not suggest anything after '[x ', and it suggests ']' within '[ ]', which would turn the valid sentence invalid (and no further insertions could even make it valid). Also, I'm not actually sure how to check if a suggestion returned is a continuation or a new thing, as I didn't see a flag for that.

I made a slightly more complex grammar to demonstrate another bug, where the owning rule of a literal can be incorrect:

namespace org.example.doublebrackets.agl

grammar DoubleBrackets {
	sentence = word | bracketed | '(' sentence ')';
	bracketed = '[' word* ']' | '[' '(' word+ ')' ']';
	word = NAME;
	
	leaf NAME = NAME_CHAR+;
	leaf NAME_CHAR = "[0-9]";
	
	skip WHITESPACE = "\s+";
}

I added a few new test inputs and changed the Java code slightly to output the name of a literal's owning rule.

Expected output:

Initial suggestions:          <bracketed>, <word>, (_sentence, [_bracketed
Suggestions after '[':        <word>, (_bracketed, ]_bracketed
Suggestions after '[ ':       <word>, (_bracketed, ]_bracketed
Suggestions after '[x':       <word>_continuation, <word>, ]_bracketed
Suggestions after '[x ':       <word>, ]_bracketed
Suggestions after '[( ':       <word>
Suggestions after '[(x ':       <word>, )_bracketed
Suggestions after '[(x) ':       ]_bracketed
Suggestions within '[ ':      <word>, (_bracketed, ]_bracketed
Suggestions within '[]':      <word>, (_bracketed
Suggestions within '[  ]':    <word>, (_bracketed
Suggestions after '[x]':      
Suggestions after '[x] ':

Actual output:

Initial suggestions:          <bracketed>, ( <sentence> ), <word>, (_bracketed, [_bracketed, <NAME>
Suggestions after '[':        
Suggestions after '[ ':       (_bracketed, ]_bracketed, <NAME>
Suggestions after '[x':       
Suggestions after '[x ':       
Suggestions after '[( ':       <NAME>
Suggestions after '[(x ':       
Suggestions after '[(x) ':       
Suggestions within '[ ':      
Suggestions within '[]':      
Suggestions within '[  ]':    (_bracketed, ]_bracketed, <NAME>
Suggestions after '[x]':      
Suggestions after '[x] ':

Notice that the initial suggestion of (_bracketed is incorrect, as such a parenthesis would be the one specified in the sentence rule, not the one specified in the bracketed rule. We also have similar incorrect and missing suggestions to the previous example. What is perhaps most troubling to me is the lack of suggestions for any of the inputs containing 'x'. It appears as though suggestions are not forthcoming when the input is more complex, or perhaps it is related to the +/* quantifiers used in the grammar.

Nov 25 '25 20:11 willplatt

Hi, I really appreciate your examples and feedback. Many thanks. The code completion is a fairly new and underused part of the library, so especially thanks for the feedback on that part. I will use your examples for further testing and improvements. (I did try to explain about the 'need a space' issue previously [https://github.com/dhakehurst/net.akehurst.language/issues/57#issuecomment-3503509205], but I will work on improving it.)

Nov 26 '25 08:11 dhakehurst

I thought it might be worth giving an update: I have been able to get a lot better suggestions with v4.2.2.20, but it took a lot of workarounds for the issues mentioned above. Updates to AGL (like fixing the suggestion bugs) or my grammar are likely to mess up my suggestions, so others ought to consider that before trying to implement similar workarounds.

Dec 20 '25 15:12 willplatt