Prism as default CRuby parser

Open kddnewton opened this issue 1 year ago • 0 comments

This is a meta-issue about the steps that need to be taken in order for Prism to become the default CRuby parser. This is not to say that it will be the default parser, as that decision is out of our hands. Instead, this issue is meant to list the technical things that are blocking it from happening.

Some/most of these things are outside the control of this repository itself and must be completed in ruby/ruby. I'm not going to link to individual issues at the moment, since the work itself is already being tracked in other places. For the most part this is meant for documentation and visibility.

Get CRuby tests and specs passing.
Compiling Primitive - Right now we don't parse Primitive method calls, but we will need to.
Optimizations - We need to ensure the CRuby compiler is using all of the various optimizations that are already present in compile.c. ~~At the moment I believe this mostly means opt_getconstant_path.~~ But as more are introduced (pushtoarray, concattoarray) we need to match those as well.
Performance - I don't think this is going to be an issue, but we should verify that we don't lose any performance once the entire pipeline is established (source -> iseqs). Hopefully we can even pick up some performance.
Errors - We need to ensure we reject all invalid source code. All of the issues we have open with the invalid-syntax label are currently tracking these. We also need to reject invalid jumps, which we still need to work on. There could be others; much more investigation is needed here. As for error messages, we need to pass all of the Ruby specs so that we don't accidentally break error messages that people are expecting.
Tests - At the moment our test suite tests against the output from a combination of RubyVM::AbstractSyntaxTree, ripper, and ruby -c. None of these will make sense in a future where prism is the only parser. As such, our tests will need to be updated to ensure they are testing against fixed images as opposed to dynamic checks against the existing Ruby implementation.
Compiling OPT_SUPPORT_JOKE - I know it's a joke, but I still love it. We should support this as the last step.
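
To make the Tests item above a bit more concrete, here is a minimal sketch of checking output against fixed snapshots instead of against another parser's live output. This is not how this repository's test suite is implemented. `check_snapshot` and `fake_inspect_tree` are hypothetical names, and `fake_inspect_tree` is only a stand-in for whatever deterministic serialization a real parser would produce.

```ruby
require "tmpdir"

# Hypothetical helper: compare `actual` against a stored snapshot file.
# If no snapshot exists yet, record it and report that it was recorded.
def check_snapshot(path, actual)
  unless File.exist?(path)
    File.write(path, actual)
    return :recorded
  end

  File.read(path) == actual ? :match : :mismatch
end

# Stand-in for a real parser (an assumption for illustration only):
# any deterministic function from source text to a string would do.
def fake_inspect_tree(source)
  source.split.map { |token| "(tok #{token.inspect})" }.join("\n") + "\n"
end

# Run three checks against the same snapshot file in a throwaway directory.
SNAPSHOT_RESULTS = Dir.mktmpdir do |dir|
  path = File.join(dir, "example.txt")

  [
    check_snapshot(path, fake_inspect_tree("a + b")), # first run records
    check_snapshot(path, fake_inspect_tree("a + b")), # same output matches
    check_snapshot(path, fake_inspect_tree("a - b"))  # changed output is caught
  ]
end
```

The point of this design is that the expected output lives in the repository, so the check no longer depends on RubyVM::AbstractSyntaxTree, ripper, or ruby -c being available.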

The following tasks have already been accomplished:

~~error_highlight - We need to determine how we want to support error_highlight. At the moment it is completely tied to the existing AST structure and node_id fields on instructions. We need to either support node_id as well or we need to find a new way to handle this (by embedding the column directly into the iseq in place of node_id maybe?).~~
~~Ractor pragmas - We need to support ractor pragmas. This should ideally be done in the CRuby compiler, which would involve updating parse.y/compile.c as well.~~
~~eval - Right now we do not hook up to eval, but we will need to.~~
~~CLI options - We need to support the command-line options for Ruby (-n, -p, -l, and -a). This should ideally be done in the CRuby compiler, which would involve updating parse.y/compile.c as well.~~
~~Warnings - We need to ensure we have all of the same warnings. To my knowledge, the only one we're currently missing is duplicated hash keys. This is blocked because we need to finish getting number literals parsed so that we can compare within an individual hash. This has the added benefit of removing the need to keep around the source for JRuby/TruffleRuby.~~
~~ripper - If the existing parser is going away, it will not make sense to maintain ripper. This is a much bigger task than it might seem, as an entire ecosystem of tools depends on ripper. There are many ways to approach this. We could provide a translation layer from prism to ripper (this work has been started). We could proactively go out into the ecosystem and transition the various tools over to using prism (this has already been started). I'm sure other solutions exist as well. Probably the biggest consumers of these are irb and rdoc, which have the highest priority.~~
~~RubyVM::AbstractSyntaxTree - If the existing parser is going away, then this experimental module will go away as well. Even though it is marked as experimental and having no compatibility guarantees, we should still try to help people migrate in case they have come to depend on it. More investigation will be needed to determine who this impacts.~~
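
As a rough illustration of why the duplicated hash keys warning above depended on number literals being parsed: two keys can be spelled differently in the source and still be the same value, so comparing source text is not enough. The sketch below is not prism's implementation. `duplicated_keys` is a hypothetical name, and it only handles integer literals by leaning on `Kernel#Integer`.

```ruby
# Hypothetical helper: given the source text of integer literal keys in the
# order they appear in a hash literal, return the ones that repeat an
# earlier key's value.
def duplicated_keys(literal_sources)
  seen = {}
  duplicates = []

  literal_sources.each do |text|
    # Kernel#Integer understands radix prefixes and underscores,
    # so "0x10" and "1_6" both become 16.
    value = Integer(text)
    duplicates << text if seen.key?(value)
    seen[value] = true
  end

  duplicates
end

# "16", "0x10", and "1_6" are three spellings of the same key.
DUPLICATES = duplicated_keys(["16", "0x10", "1_6", "17"])
```

Comparing by value rather than by spelling is also what makes it possible to drop the original source once the literal has been parsed.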

If all of these problems are solved, then it will make sense to make the case for this to become the default parser.

Feb 01 '24 19:02 kddnewton