Whitespace is discarded in token stream

Open sshailabh opened this issue 4 months ago • 1 comments

Lossless Parsing with Whitespace Preservation

It looks like the lexer discards whitespace. Instead of being skipped, whitespace should be sent to a hidden channel; skipping it loses information. Without whitespace tokens, it is not possible to:

Reconstruct the exact original source from the parse tree
Build accurate code formatters
Develop linters that understand whitespace context

https://github.com/jknack/handlebars.java/blob/master/handlebars/src/main/antlr4/com/github/jknack/handlebars/internal/HbsLexer.g4#L376

WS
 : [ \t\r\n] -> skip
 ;

The skip rule may have been inspired by the Mustache spec handling: https://github.com/jknack/handlebars.java/blob/master/handlebars/src/main/java/com/github/jknack/handlebars/internal/MustacheSpec.java However, whitespace could instead be discarded later, in TemplateBuilder.java, rather than in the lexer.
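
For illustration, a minimal sketch of what the lexer rule could look like if whitespace were routed to a hidden channel instead of skipped. This assumes the rule can be retargeted without other changes; the lexer modes in HbsLexer.g4 and the linked PR may require something different. `channel(HIDDEN)` is the standard ANTLR 4 lexer command for this.

```antlr
// Sketch only: keep whitespace tokens in the stream, off the default channel.
// The parser still ignores them, so parser rules need no whitespace handling.
WS
 : [ \t\r\n] -> channel(HIDDEN)
 ;
```

With this, the whitespace tokens stay in the token stream, so a tool could recover them (for example through `BufferedTokenStream.getHiddenTokensToLeft`) and concatenate the text of all tokens to reproduce the original source.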

Oct 04 '25 01:10 sshailabh

@jknack Could you please review the enhancement for reconstructing source content without loss? PR: https://github.com/jknack/handlebars.java/pull/1169

Thank you

Oct 05 '25 18:10 sshailabh