smali2java

Enhancement - use ANTLR to build out parser from grammar / lexer file

Open 8secz-johndpope opened this issue 6 years ago • 3 comments

To stabilise the parser, I suggest rebuilding some of the code to leverage the ANTLR grammar (.g4) files here: https://github.com/psygate/smali-antlr4-grammar

If you download the ANTLR tool jar: wget https://www.antlr.org/download/antlr-4.7.2-complete.jar

you can then run

java -cp antlr-4.7.2-complete.jar org.antlr.v4.Tool -Dlanguage=Go -visitor -o gen SmaliLexer.g4
java -cp antlr-4.7.2-complete.jar org.antlr.v4.Tool -Dlanguage=Go -visitor -o gen SmaliParser.g4

This will spit out the following files / code (screenshot, 2019-06-12):

https://gist.github.com/8secz-johndpope/30868ccd59f211f0000b90e6176dead7

You should then be able to walk through the smali file with the generated parser, which may reduce the out-of-bounds crashes that people (including myself) have been experiencing.

For illustration, I successfully used grammar files to build out parsers / lexers for hundreds of languages with Swift: https://github.com/johndpope/ANTLR-Swift-Target https://github.com/johndpope/Antlr-Swift-runtime

I forget the entry-point rule of the parser class; it changes for each grammar.

Here is the Swift code to read a Java file; you can find it in the above repo.


    let textFileName = "Test.java"

    if let textFilePath = Bundle.main.path(forResource: textFileName, ofType: nil) {
        // Lex the file into a token stream.
        let lexer = Java8Lexer(ANTLRFileStream(textFilePath))
        print("lexer:", lexer)
        let tokens = CommonTokenStream(lexer)
        let parser = try Java8Parser(tokens)

        // compilationUnit is the entry rule of the Java8 grammar.
        let tree = try parser.compilationUnit()
        print("tree:", tree)

        // Walk the parse tree, calling back into the listener.
        let walker = ParseTreeWalker()
        let java8walker = Java8Walker()
        try walker.walk(java8walker, tree)
    } else {
        print("error: cannot open \(textFileName)")
    }

The pseudocode would be


    let textFilePath = "/path/Test.smali"

    let lexer = NewSmaliLexer(ANTLRFileStream(textFilePath)) // this NewSmaliLexer exists
    print("lexer:", lexer)
    let tokens = CommonTokenStream(lexer) // ?? there should be a method to do this
    let parser = try NewSmaliParser(tokens)

    let tree = try parser.compilationUnit() // placeholder entry rule; maybe ToStringTree?
    print("tree:", tree)

    let walker = ParseTreeWalker() // here, as the lexer / parser reads, you can hook in to translate stuff
    let smaliWalker = SmaliWalker() // hypothetical listener, analogous to Java8Walker above
    try walker.walk(smaliWalker, tree)
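
To make the "hook in to translate stuff" idea more concrete, here is a minimal Go sketch of the walker / listener pattern. It is not the ANTLR Go runtime and does not use the generated smali parser: Node, Listener, Walk, JavaEmitter and Translate are all hypothetical stand-ins, and the tree is built by hand. With the real generated code, the parser would produce the tree and the listener would be the generated one.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical stand-in for a parse-tree node. With ANTLR the
// generated parser would build this tree; here it is built by hand.
type Node struct {
	Rule     string // grammar rule name, e.g. "class" or "method"
	Text     string // identifier carried by the node
	Children []*Node
}

// Listener receives a callback when the walker enters and exits a node.
type Listener interface {
	Enter(n *Node)
	Exit(n *Node)
}

// Walk visits the tree depth-first, calling the listener hooks around
// each node's children.
func Walk(l Listener, n *Node) {
	if n == nil {
		return
	}
	l.Enter(n)
	for _, c := range n.Children {
		Walk(l, c)
	}
	l.Exit(n)
}

// JavaEmitter is a hypothetical listener that emits Java-like text.
type JavaEmitter struct {
	out strings.Builder
}

// Enter opens a block for the rules this sketch knows about.
func (e *JavaEmitter) Enter(n *Node) {
	switch n.Rule {
	case "class":
		e.out.WriteString("class " + n.Text + " {\n")
	case "method":
		e.out.WriteString("  void " + n.Text + "() {\n")
	}
}

// Exit closes the block opened in Enter.
func (e *JavaEmitter) Exit(n *Node) {
	switch n.Rule {
	case "class":
		e.out.WriteString("}\n")
	case "method":
		e.out.WriteString("  }\n")
	}
}

// Translate walks the tree with a JavaEmitter and returns the emitted text.
func Translate(root *Node) string {
	e := &JavaEmitter{}
	Walk(e, root)
	return e.out.String()
}

func main() {
	tree := &Node{Rule: "class", Text: "Test", Children: []*Node{
		{Rule: "method", Text: "run"},
	}}
	fmt.Print(Translate(tree))
}
```

The point of the pattern is that translation logic lives in the listener callbacks, so unknown rules are simply ignored rather than crashing the walk.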
                

Other people have created translators using ANTLR: https://github.com/8secz-johndpope/ObjcGrammar. You may need some help; when I have more time I will circle back.

8secz-johndpope, Jun 12 '19 13:06

You're right, that approach would be much better, as currently I support only a very limited number of instructions. Will look into it.

AlexeySoshin, Jun 21 '19 15:06

VS Code has smali syntax highlighting: https://github.com/ViRb3/vscode-smali/tree/master/smali Could this help?

If you surface any work in a new feature branch, I'm happy to take a look.

8secz-johndpope, Oct 02 '19 18:10

@8secz-johndpope Thanks for getting back to this issue :) Took a look at it, but it's actually more confusing, since it's based on regexes. Planning to make another branch for ANTLR this week, per your suggestions.

AlexeySoshin, Oct 07 '19 10:10