Golang target @ member variables are global

Open dkfitzpatrick opened this issue 4 years ago • 2 comments

In the Java target, variables declared in the @lexer::members section become fields of the lexer class. In the Go target, they are declared as package-level (global) variables instead. For example:

// TestLexer.g4
lexer grammar TestLexer;

@lexer::members {
var depth int
}

OPAR : '(' { depth++ } ;
CPAR : ')' { depth-- } ;

It is not clear whether this is by design (it would seem not). I am not familiar with the ANTLR internals, but the change seems to involve only the Go.stg template file. From this:

type <lexer.name> struct {
    *<if(superClass)><superClass><else>antlr.BaseLexer<endif>
    channelNames []string
    modeNames []string
    // TODO: EOF string
}

To this (moving the template section for @lexer::members):

type <lexer.name> struct {
    *<if(superClass)><superClass><else>antlr.BaseLexer<endif>
    channelNames []string
    modeNames []string
<if(namedActions.members)>
    <namedActions.members>
<endif>
    // TODO: EOF string
}

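The values would then be accessed via the receiver (e.g. l.depth), and the members block would hold field declarations (depth int) rather than var statements.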

My application requires recursive substitution, so the ability to have lexer-instance-specific values is important (the same issue applies to generated parsers).
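The difference between a shared global and per-instance state can be sketched in plain Go, independent of ANTLR. The types globalLexer and fieldLexer below are hypothetical stand-ins, not actual generated code; they only illustrate that a nested instance disturbs a package-level counter, while a struct field stays private to each instance.

```go
package main

import "fmt"

// depth mimics a variable emitted at package level:
// every lexer instance in the package shares it.
var depth int

// globalLexer is a hypothetical stand-in for a lexer whose
// actions touch the package-level variable.
type globalLexer struct{}

func (l *globalLexer) open() { depth++ }

// fieldLexer is a hypothetical stand-in for a lexer that keeps
// the counter as a struct field, reached through the receiver.
type fieldLexer struct {
	depth int
}

func (l *fieldLexer) open() { l.depth++ }

func main() {
	// Two instances sharing the global counter.
	outer, inner := &globalLexer{}, &globalLexer{}
	outer.open()
	inner.open() // also changes what outer observes
	fmt.Println("shared depth:", depth)

	// Two instances with independent counters.
	a, b := &fieldLexer{}, &fieldLexer{}
	a.open()
	b.open()
	b.open()
	fmt.Println("a.depth:", a.depth, "b.depth:", b.depth)
}
```

With the field variant, a lexer created while another one is mid-stream (as in recursive substitution) starts from its own zero value instead of inheriting the outer count.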

Any pointers to operator error, or workarounds, would be highly appreciated.

dkfitzpatrick, Sep 02 '21 04:09

Oh no, my code doesn't seem to work at all. I really hope @lexer::members could bring some actual contextual features. 😢

My original response

Thanks to Go's structural interfaces, you could extend your lexer in this way:

// MyLexerState is the custom lexer state.
type MyLexerState struct {
    depth int
}

func (s *MyLexerState) Open() {
    s.depth++
}

func (s *MyLexerState) Close() int {
    d := s.depth
    s.depth--
    return d
}

// StatefulLexer embeds the lexer state. Use this for lexing and parsing.
type StatefulLexer struct {
    *parser.MyLexer
    MyLexerState
}

And in your grammar file, do some structural typing tricks to call the interface:

@lexer::header{

type MyLexerStater interface {
  Open()
  Close() int
}

func stateOpen(l *MyLexer) {
  antlr.Lexer(l).(MyLexerStater).Open()
}

func stateClose(l *MyLexer) int {
  return antlr.Lexer(l).(MyLexerStater).Close()
}

}

The lexer actions could then use your custom state via the helpers stateOpen and stateClose.
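A likely reason this approach does not work: Go embedding is composition, not inheritance. The generated actions receive the embedded *MyLexer, whose dynamic type has no Open or Close method, so the type assertion in stateOpen cannot succeed (and, written without the comma-ok form, it panics). A minimal sketch with hypothetical stand-in types (baseLexer for the generated lexer, state for MyLexerState, statefulLexer for StatefulLexer):

```go
package main

import "fmt"

// baseLexer is a hypothetical stand-in for the generated *MyLexer.
type baseLexer struct{}

// state plays the role of MyLexerState.
type state struct{ depth int }

func (s *state) Open() { s.depth++ }

// statefulLexer plays the role of StatefulLexer: it embeds both.
type statefulLexer struct {
	*baseLexer
	state
}

type opener interface{ Open() }

// canOpen reports whether the dynamic type of v has an Open method.
func canOpen(v interface{}) bool {
	_, ok := v.(opener)
	return ok
}

func main() {
	s := &statefulLexer{baseLexer: &baseLexer{}}
	fmt.Println(canOpen(s))           // true: the outer struct has Open
	fmt.Println(canOpen(s.baseLexer)) // false: the embedded lexer does not
}
```

Only a value that actually carries the methods, such as the outer struct or a field the lexer itself holds, can satisfy the assertion.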

anqur, May 01 '22 17:05

Well, here's a dirtier trick, but it works now:

  • Notice that we can add new methods and helper functions for the lexer from @lexer::header, and that the lexer's Interpreter field is an interface, so let's wrap its implementation:
@lexer::header{

// MyLayoutInterpreter extends the original interpreter with a user-defined state, here is `LayoutState`.
type MyLayoutInterpreter struct {
  antlr.ILexerATNSimulator
  LayoutState
}

func NewMyLayoutLexer(input antlr.CharStream) *MyLexer {
  l := NewMyLexer(input)
  // Re-assign the interpreter.
  i := &MyLayoutInterpreter{ILexerATNSimulator: l.Interpreter}
  l.Interpreter = i
  return l
}

// pushLayoutState is the helper that could be used in lexer user actions.
func pushLayoutState(l *MyLexer) {
  l.Interpreter.(interface { Push() }).Push()
}

func popLayoutState(l *MyLexer) int {
  return l.Interpreter.(interface { Pop() int }).Pop()
}

}

The definition of LayoutState is:

type LayoutState struct {
	depth int
}

func (s *LayoutState) Push() {
	s.depth++
}

func (s *LayoutState) Pop() int {
	d := s.depth
	if d > 0 {
		s.depth--
	}
	return d
}
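The trick can be tried without ANTLR in a small self-contained sketch. Here simulator, baseSimulator and lexer are hypothetical stand-ins for antlr.ILexerATNSimulator, its default implementation and the generated lexer; only LayoutState is taken from the definition above.

```go
package main

import "fmt"

// simulator is a hypothetical stand-in for antlr.ILexerATNSimulator.
type simulator interface {
	Name() string
}

type baseSimulator struct{}

func (baseSimulator) Name() string { return "base" }

// LayoutState repeats the definition from above.
type LayoutState struct{ depth int }

func (s *LayoutState) Push() { s.depth++ }

func (s *LayoutState) Pop() int {
	d := s.depth
	if d > 0 {
		s.depth--
	}
	return d
}

// layoutSimulator wraps an existing simulator and adds the state.
type layoutSimulator struct {
	simulator
	LayoutState
}

// lexer is a hypothetical stand-in for the generated lexer.
type lexer struct {
	Interpreter simulator
}

func newLayoutLexer() *lexer {
	l := &lexer{Interpreter: baseSimulator{}}
	// Re-assign the interpreter, keeping the original behaviour.
	l.Interpreter = &layoutSimulator{simulator: l.Interpreter}
	return l
}

func push(l *lexer)    { l.Interpreter.(interface{ Push() }).Push() }
func pop(l *lexer) int { return l.Interpreter.(interface{ Pop() int }).Pop() }

func main() {
	l := newLayoutLexer()
	push(l)
	push(l)
	fmt.Println(pop(l), pop(l), pop(l)) // 2 1 0
	fmt.Println(l.Interpreter.Name())   // base
}
```

This works because the state travels inside a field the lexer already owns, so every action that receives the lexer can reach it. Note that Pop returns the depth before decrementing and never goes below zero.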

I'm now able to parse a file with an indentation-based grammar. Hope someday adding new user state won't be this tricky and bizarre 😆.

anqur, May 02 '22 06:05