SpacemanDMM icon indicating copy to clipboard operation
SpacemanDMM copied to clipboard

Parser chokes on paths containing "operator"

Open Exxion opened this issue 2 years ago • 3 comments

It is perhaps unwise to include operator in a path, but BYOND accepts it.

/datum/operator

/datum/operator/proc/test()

This emits a warning that the proc is relatively-pathed, and confuses the language server into thinking the proc is global. If you leave out the first line, it also emits a "bad parent type" error.

/datum/test/proc/test(datum/operator/thing)

This causes a fatal error in the parser.

It would not be unreasonable to just treat operator as a reserved word and disallow its use in paths. OpenDream does this by default. It shouldn't break the parser, though.

Exxion avatar Oct 22 '23 18:10 Exxion

Unfortunately it's difficult to adapt the parser to allow both of the following to work:

/datum/operator/()
    // this is the overload for `/` division operator for `/datum`
/datum/operator/unrelated_proc()
    // this is an unrelated proc on `/datum/operator`

So I may leave this as a won't-fix.

SpaceManiac avatar Oct 25 '23 05:10 SpaceManiac

Unfortunately it's difficult to adapt the parser to allow both of the following to work:

/datum/operator/()
    // this is the overload for `/` division operator for `/datum`
/datum/operator/unrelated_proc()
    // this is an unrelated proc on `/datum/operator`

So I may leave this as a won't-fix.

Typepaths containing operator within the path work fine* without any issues? It's only when it's present within a named argument's typepath.

image

Recreation Code

/operator
	binary
		proc/add(input)
		proc/sub(operator/input)

Other issues with operator paths

What's interesting is that defining absolute paths with it complains about a bad parent, which also breaks the type tree 02-Tuesday-Code_7855 02-Tuesday-Code_7850

Having an absolute path with operator in it anywhere in your project will create a dud type in the tree. Though if you noticed with my first example it was fine when I did relative pathing 02-Tuesday-Code_7857 02-Tuesday-Code_7856

Zandario avatar Jul 02 '24 20:07 Zandario

Adding context for how BYOND's parser works: There are different modes the parser can be in, one of which is reading path tokens. A path token is either of the / or : operators or a name token (anything that would work as a var name).

When the operator keyword is read as a path token, the lexer immediately looks at the next character following it, and depending on what else comes next, the characters after that. In some cases if the first character after operator might be part of an overloadable operator but turns out not to be, the parser will rewind the input stream back to just after the keyword and the token is just operator. Otherwise it keeps going to get a complete operator and adds that to the end of the token, to create a token such as operator*=.

As an example of a normal case without rewinding: If + follows operator then that's either + or ++ or +=, so the next character is checked to see if it's = or the same as the first. This also means - and * (until I ever get around to adding **=) follow the same logic. The most complicated cases start with < since it covers a lot of operators, including <=> coming in 516.

Rewinding after operator is only done for a few specific situations:

  • When the [ character is encountered without a matching ], it rewinds. (If there is a matching ] it checks if there's a following = to tell operator[] and operator[]= apart.)
  • When : is encountered without a = after it, it rewinds, since := is overloadable but : is a path operator and should be parsed separately.
  • For the / character, if the following character is another / or a * it rewinds because both of those indicate a comment. An underscore or alphabetical character following the slash, which would be the start of a new name, also rewinds because that indicates path parsing is still ongoing. Otherwise it's considered a valid operator match and just checks if = comes next to differentiate between operator/ and operator/=.

LummoxJR avatar Jul 03 '24 16:07 LummoxJR

I appreciate the in-depth explanation. SpacemanDMM's parser is fairly different to the official DM parser - it's not built to be able to rewind, but it's split into a lexing and parsing phase, so it can make decisions by looking one whole token ahead, rather than just one character. This gap does mean I've had to add special cases for certain things (like the difference between a ? b : c:d and a ? b:c : d). This will have to be another one.

SpacemanDMM's tree path parsing basically alternates between reading identifiers and reading / (or : or .). But it's looking for tokens, not characters, so /= and comments aren't a concern at this stage.

Previously, if it saw operator, it would bail immediately with the expectation that the word should then be followed by one of the operator tokens. But / was one of these, so operator / proc became operator/ followed by a seemingly out-of-place proc word. This was done in the fix to #362 - observing that at the time, operator/ was actually just being parsed as operator followed by a dangling / that gets ignored.

The fix is to not bail immediately. Instead, if the parser is in the state where it's just read a / but there's no identifier following (normally ignored as dangling), then it checks if the final word of the path was operator, and only then turns it into operator/.

SpaceManiac avatar Aug 09 '24 02:08 SpaceManiac